Data augmentation

On-the-fly data augmentations can be applied during training/validation by implementing the desired operations in the pre_transform() and post_transform() methods of a class derived from koogu.data.feeder.BaseFeeder. Because the CNN models used in bioacoustics typically operate on inputs that have been transformed into 2-dimensional spectrograms, augmentations that apply to time-domain waveforms belong in pre_transform(), and augmentations that apply to spectrograms belong in post_transform().

Note

This requires writing code that uses the TensorFlow API directly.

The example below extends koogu.data.feeder.SpectralDataFeeder by adding two augmentation operations in the time domain and one in the spectro-temporal domain. It also demonstrates the use of a few pre-defined, customizable augmentations. You may also add code in these methods to implement your own types of augmentation.

import tensorflow as tf
import koogu.data.feeder
from koogu.data.augmentations import Temporal, SpectroTemporal


class MySpectralDataFeeder(koogu.data.feeder.SpectralDataFeeder):

    def pre_transform(self, clip, label, is_training, **kwargs):
        """
        Applying augmentations to waveform.
        """

        output = clip

        # Added noise will have a level between -30 dB and -18 dB relative
        # to the peak amplitude of the input.
        gauss_noise = Temporal.AddGaussianNoise((-30, -18))

        # Add Gaussian noise to 25% of inputs.
        output = tf.cond(tf.random.uniform([], 0, 1) <= 1 / 4,
                         lambda: gauss_noise(output),
                         lambda: output)

        # The volume of the input will be linearly lowered/increased over its
        # duration, by up to 3 dB.
        vol_ramp = Temporal.RampVolume((-3, 3))

        # Alter volume for 10% of the inputs.
        output = tf.cond(tf.random.uniform([], 0, 1) <= 1 / 10,
                         lambda: vol_ramp(output),
                         lambda: output)

        return output, label

    def post_transform(self, spec, label, is_training, **kwargs):
        """
        Applying augmentations to power spectral density spectrogram.
        """

        output = spec

        # Smear energies along the time-axis while retaining the frequency
        # content intact.
        smear_time = SpectroTemporal.SmearTime((-2, 2))

        # Apply to one in three inputs.
        output = tf.cond(tf.random.uniform([], 0, 1) <= 1 / 3,
                         lambda: smear_time(output),
                         lambda: output)

        return output, label

The above example demonstrates the finer control available when implementing augmentations this way: you may use branching and looping constructs to combine different augmentations as desired.
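
To get a feel for the decibel ranges used in the example above, the following minimal sketch converts them to linear amplitude ratios. It assumes the conventional amplitude relation ratio = 10^(dB / 20); whether Koogu applies exactly this convention internally is an assumption here, and the helper name db_to_amplitude_ratio is hypothetical (it is not part of Koogu).

```python
def db_to_amplitude_ratio(level_db):
    """Convert a level in dB to a linear amplitude ratio.

    Hypothetical helper for illustration; assumes the standard amplitude
    convention: ratio = 10 ** (dB / 20).
    """
    return 10.0 ** (level_db / 20.0)


# Noise range from the example, relative to the peak amplitude of the input
noise_low = db_to_amplitude_ratio(-30)    # roughly 3% of the peak
noise_high = db_to_amplitude_ratio(-18)   # roughly 13% of the peak

# Volume-ramp limits from the example
ramp_low = db_to_amplitude_ratio(-3)      # roughly 0.71x
ramp_high = db_to_amplitude_ratio(3)      # roughly 1.41x
```

Under this assumption, even the loudest added noise stays well below the peak of the input, and the volume ramp changes the amplitude by less than a factor of 1.5 in either direction.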

Convenience interface

Sometimes, you may simply want to apply a series of augmentations in a particular order, each with its own chosen probability. The code snippet below demonstrates the use of the convenience interface to apply chained augmentations. You need not use any TensorFlow API here 😀.

    def pre_transform(self, clip, label, is_training, **kwargs):
        """
        Applying augmentations to waveform.
        """

        # List of time-domain augmentations
        augmentations = [
            Temporal.AddGaussianNoise((-30, -18)),
            Temporal.RampVolume((-3, 3))
        ]

        # At what rates should each be applied (same ordering as above)
        probabilities = [
            0.25,       # apply to 1 in 4 clips
            0.10        # apply to 1 in 10 clips
        ]

        output = Temporal.apply_chain(clip, augmentations, probabilities)

        return output, label

    def post_transform(self, spec, label, is_training, **kwargs):
        """
        Applying augmentations to power spectral density spectrogram.
        """

        # List of spectrogram augmentations
        augmentations = [
            SpectroTemporal.SmearTime((-2, 2)),
            SpectroTemporal.SquishFrequency((-1, 1))
        ]

        # At what rates should each be applied (same ordering as above)
        probabilities = [
            0.33,       # apply to about 1 in 3 input spectrograms
            0.20        # apply to 1 in 5 input spectrograms
        ]

        output = SpectroTemporal.apply_chain(spec, augmentations, probabilities)

        return output, label
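
Conceptually, a chained application of this kind walks through the augmentations in order and applies each one with its own probability. The following pure-Python sketch illustrates that idea only. It is not Koogu's implementation (which operates on TensorFlow tensors), and apply_chain_sketch, the toy augmentations halve and offset, and the rng parameter are all hypothetical names introduced for illustration.

```python
import random


def apply_chain_sketch(sample, augmentations, probabilities, rng=random):
    """Apply each augmentation in order, each with its own probability.

    Hypothetical stand-in for illustration; not Koogu's apply_chain().
    """
    if len(augmentations) != len(probabilities):
        raise ValueError("Need exactly one probability per augmentation")

    output = sample
    for augment, prob in zip(augmentations, probabilities):
        # Draw a fresh random number for every augmentation so that each
        # one is applied (or skipped) independently of the others.
        if rng.random() < prob:
            output = augment(output)
    return output


# Toy "augmentations" acting on a list of numbers
def halve(values):
    return [v / 2 for v in values]


def offset(values):
    return [v + 1 for v in values]


clip = [2.0, 4.0]

# A probability of 1.0 always applies; a probability of 0.0 never applies
always_both = apply_chain_sketch(clip, [halve, offset], [1.0, 1.0])
only_first = apply_chain_sketch(clip, [halve, offset], [1.0, 0.0])
```

Because the augmentations are applied sequentially, their order in the list matters: each one receives the output of those before it that were actually applied.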