Transfer learning
Bioacoustics researchers often employ transfer learning, using models that were pre-trained on other data (e.g., images). Such approaches can be useful when available training datasets are small. Using transfer learning in a Koogu workflow simply involves ‘plugging in’ a pre-trained model in an implementation of the abstract class koogu.model.architectures.BaseArchitecture.
The example below implements BaseArchitecture by defining build_network() to incorporate a publicly available MobileNetV2 whose weights were pre-trained on the large ImageNet dataset. You may choose to use a different pre-trained model from here or from other sources. Note that such pre-trained models often have specific expectations with respect to input shapes, value ranges, types, etc.
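For instance, one quick (purely illustrative) way to check what a candidate model expects is to instantiate it with its defaults and inspect its input signature; MobileNetV2, used below, defaults to 224x224 RGB inputs:

import tensorflow as tf

# Build the model with its default settings (weights=None avoids downloading
# the ImageNet weights just for this check) and inspect the expected input.
probe = tf.keras.applications.MobileNetV2(weights=None)
print(probe.input_shape)    # -> (None, 224, 224, 3)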
Assuming a data pipeline similar to the example in the Quick-start guide (which uses SpectralDataFeeder), the 1-channel spectrograms generated by the feeder must be converted into 3-channel RGB images using a suitable colorscale (and resized) before they can be fed to the pre-trained model. We use Spec2Img for this purpose. The pre-trained model is used as a “feature extractor”, and a classification layer is added on top of it. “Training” of the full model will only update the weights of the added classification layer.
import tensorflow as tf
from koogu.data.tf_transformations import Spec2Img
from koogu.model.architectures import BaseArchitecture
from matplotlib import cm


class MyTransferModel(BaseArchitecture):

    def build_network(self, input_tensor, is_training, **kwargs):

        # Many of the available pre-trained models expect inputs to be of a
        # particular size. The `input_tensor` may not already be in that shape,
        # depending on the chosen data preparation parameters (e.g., with
        # koogu.data.feeder.SpectralDataFeeder). We need to resize the images
        # to match the input shape of the pre-trained model.
        # MobileNetV2 defaults to an input size of 224x224, and also supports a
        # few other sizes. 160x160 is a supported size, and we use that in this
        # example.
        target_img_size = (160, 160)

        # Choose your favourite colorscale from matplotlib or other sources.
        # (In newer matplotlib versions where cm.get_cmap() is unavailable,
        # matplotlib.colormaps['jet'] provides the same colormap.)
        my_cmap = cm.get_cmap('jet')(range(256))
        # `my_cmap` will be a 256-element array of RGBA color values from the
        # "Jet" colorscale.

        # First, need to convert input spectrograms to equivalent RGB images.
        # Spec2Img will convert 1-channel spectrograms to 3-channel RGB images
        # (with values in the range [0.0, 1.0]) and resize them as desired.
        to_image = Spec2Img(my_cmap, img_size=target_img_size)

        # The pre-trained MobileNetV2 expects RGB values to be scaled to the
        # range [-1.0, 1.0].
        rescale = tf.keras.layers.Rescaling(2.0, offset=-1.0)
        # NOTE: `Rescaling` was added in TensorFlow v2.6.0. If you use an older
        #       version, you can implement this operation by simply multiplying
        #       the output of to_image() by 2.0 and then subtracting 1.0.

        # Load the pre-trained MobileNetV2 model with ImageNet weights, and
        # without the trailing fully-connected layer.
        pretrained_cnn = tf.keras.applications.MobileNetV2(
            input_shape=target_img_size + (3, ),    # Include RGB dimension
            include_top=False,
            weights='imagenet')
        pretrained_cnn.trainable = False    # Freeze CNN weights

        # Pooling layer
        global_average_layer = tf.keras.layers.GlobalAveragePooling2D()

        # Put them all together now.
        # NOTE: The "training=False" parameter to `pretrained_cnn` is required
        #       to ensure that BatchNorm layers in the model operate in
        #       inference mode (for more details, see TensorFlow's webpage on
        #       transfer learning).
        output = to_image(input_tensor)
        output = rescale(output)
        output = pretrained_cnn(output, training=False)
        output = global_average_layer(output)

        # NOTE: Do not add the classification layer. It will be added by
        #       Koogu's internal code.

        return output
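As noted in the comments above, tf.keras.layers.Rescaling is only available from TensorFlow v2.6.0 onwards. On older versions, an equivalent drop-in (a minimal sketch, using the `tf` alias from the imports above) can be written with a Lambda layer:

# Equivalent of Rescaling(2.0, offset=-1.0) for inputs in the range [0.0, 1.0]
rescale = tf.keras.layers.Lambda(lambda x: (x * 2.0) - 1.0)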
With an architecture defined this way, you can simply replace the following code block in step 2 of the Quick-start guide -

model = architectures.DenseNet(
    [4, 4, 4],                                # 3 dense-blocks, 4 layers each
    preproc=[ ('Conv2D', {'filters': 16}) ],  # Add a 16-filter pre-conv layer
    dense_layers=[32]                         # End with a 32-node dense layer
)

… with this -
model = MyTransferModel()
That’s it! The remainder of the Koogu workflow described in the Quick-start guide will now employ transfer learning.
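If, beyond pure feature extraction, you eventually want to fine-tune part of the pre-trained backbone as well, the standard Keras mechanism is to selectively re-enable trainability on its upper layers. The snippet below is only an illustrative sketch (not part of the Koogu example above), and the number of layers left unfrozen is an arbitrary choice:

import tensorflow as tf

# Illustrative only: selectively unfreeze the top of a MobileNetV2 backbone.
backbone = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights='imagenet')
backbone.trainable = True
for layer in backbone.layers[:-20]:   # keep all but (roughly) the last 20 layers frozen
    layer.trainable = False
# When fine-tuning, keep calling the backbone with training=False (as in
# build_network above) so that BatchNorm layers remain in inference mode, and
# use a small learning rate to avoid destroying the pre-trained weights.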