Data feeder

For most common applications in bioacoustics, SpectralDataFeeder offers a convenient way to convert prepared audio clips into spectrograms on the fly during training and validation. DataFeeder applies no transformations and feeds audio clips as-is. By default, neither class applies any augmentations.

To process prepared data generated by mechanisms outside of Koogu, extend koogu.data.feeder.BaseFeeder and implement custom logic to feed your data into the Koogu training pipeline. See ../../advanced/custom_feeder for guidance.

Like koogu.data.feeder.BaseFeeder, both DataFeeder and SpectralDataFeeder are extensible, and facilitate implementation of custom transformations and/or augmentations.

To apply canned augmentations in a chained manner, use the methods add_pre_transform_augmentation() and add_post_transform_augmentation() on instances of DataFeeder and SpectralDataFeeder. Canned augmentations include those offered in Koogu and any user-defined augmentations implemented by subclassing koogu.data.augmentations.Temporal or koogu.data.augmentations.SpectroTemporal. You may invoke these methods repeatedly to build the desired chain. For finer control over how augmentations are applied (e.g., conditional branching and other logic), create a subclass and override the methods pre_transform() and post_transform() as needed.
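
To make the idea of probability-gated chaining concrete, here is a minimal sketch in plain Python. It does not use Koogu: ChainedAugmenter, scale_by and reverse are hypothetical stand-ins, and the assumption that each stage fires independently with its own probability is an illustration rather than a statement about Koogu's internals.

```python
import random


class ChainedAugmenter:
    """Hypothetical stand-in (not Koogu code) for a chain of augmentations."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._chain = []  # (probability, callable) pairs, in insertion order

    def add(self, probability, augmentation):
        # Mirrors the documented constraint: probability must be >0 and <=1.
        if not 0.0 < probability <= 1.0:
            raise ValueError("probability must be in the range (0, 1]")
        self._chain.append((probability, augmentation))
        return self  # allow repeated calls to be chained

    def __call__(self, sample):
        for probability, augmentation in self._chain:
            # Assumption: each stage is applied independently of the others.
            if self._rng.random() < probability:
                sample = augmentation(sample)
        return sample


def scale_by(factor):
    """Hypothetical temporal augmentation: scale the clip's amplitude."""
    return lambda clip: [factor * value for value in clip]


def reverse(clip):
    """Hypothetical temporal augmentation: reverse the clip in time."""
    return clip[::-1]


augmenter = ChainedAugmenter(seed=0)
augmenter.add(1.0, scale_by(0.5)).add(1.0, reverse)
result = augmenter([1.0, 2.0, 4.0])
print(result)  # both stages always fire here, since each probability is 1.0
```

With probabilities below 1.0, any given sample may pass through all, some, or none of the stages, which is why the order of the add calls matters.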

class koogu.data.feeder.SpectralDataFeeder(data_dir, fs, spec_settings, **kwargs)

Bases: DataFeeder

A handy data feeder, which converts prepared audio clips into power spectral density spectrograms.

Parameters:
  • data_dir – Directory under which prepared data (.npz files) are available.

  • fs – Sampling frequency of the prepared data.

  • spec_settings – A Python dictionary. For a list of possible keys and values, see parameters to Audio2Spectral.

  • normalize_clips – (optional; boolean) If True (default), input clips will be normalized before applying transform (computing spectrograms).

Other parameters applicable to the parent DataFeeder class may also be specified.

class koogu.data.feeder.DataFeeder(data_dir, validation_split=None, min_clips_per_class=None, max_clips_per_class=None, random_state_seed=None, **kwargs)

Bases: BaseFeeder

A class for loading preprocessed data from numpy .npz files and feeding them (untransformed) into the training/evaluation pipeline. To apply any transformations, create a subclass and override the method transform().

Parameters:
  • data_dir – Directory under which prepared data (.npz files) are available.

  • validation_split – (default: None) Fraction of the available data that must be held out for validation. If None, all available data will be used as training samples.

  • min_clips_per_class – (default: None) The minimum number of per-class samples that must be available. If fewer samples are available for a class, the class will be omitted. If None, no classes will be omitted.

  • max_clips_per_class – (default: None) The maximum number of per-class samples to consider among what is available, for each class. If more samples are available for any class, the specified number of samples will be randomly selected. If None, no limits will be imposed.

  • random_state_seed – (default: None) A seed (integer) used to initialize the pseudo-random number generator that makes shuffling and other randomizing operations repeatable.

  • cache – (optional; boolean) If True (default), the logic to ‘queue & batch’ training/evaluation samples (loaded from disk) will also cache the samples. Helps speed up processing.

  • suppress_nonmax – (optional; boolean) If True, the class labels will be one-hot type arrays, useful for training single-class prediction models. Otherwise (default is False), they will be suitable for training multi-class prediction models, giving values in the range 0-1 for each class.
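
The interplay of the selection parameters above can be sketched in plain Python. This is not Koogu's implementation: select_and_split and suppress_nonmax_sketch are hypothetical helpers, and the choices to split per class and to truncate the validation count are assumptions made only for illustration.

```python
import random


def select_and_split(clips_by_class, validation_split=None,
                     min_clips_per_class=None, max_clips_per_class=None,
                     random_state_seed=None):
    """Hypothetical sketch of per-class filtering, capping and splitting."""
    rng = random.Random(random_state_seed)  # seeded, so results are repeatable
    train, validation = {}, {}
    for class_name, clips in clips_by_class.items():
        # Omit classes that have too few clips.
        if min_clips_per_class is not None and len(clips) < min_clips_per_class:
            continue
        clips = list(clips)
        # Randomly subsample classes that have too many clips.
        if max_clips_per_class is not None and len(clips) > max_clips_per_class:
            clips = rng.sample(clips, max_clips_per_class)
        rng.shuffle(clips)
        # Assumption: the held-out fraction is taken per class and truncated.
        num_val = int(len(clips) * validation_split) if validation_split else 0
        validation[class_name] = clips[:num_val]
        train[class_name] = clips[num_val:]
    return train, validation


def suppress_nonmax_sketch(label):
    """Hypothetical sketch: keep only the highest-scoring class (one-hot)."""
    top = max(range(len(label)), key=lambda i: label[i])
    return [1.0 if i == top else 0.0 for i in range(len(label))]


clips = {
    "whale": list(range(10)),
    "dolphin": list(range(2)),   # fewer than min_clips_per_class, so omitted
    "seal": list(range(50)),     # more than max_clips_per_class, so capped
}
train, validation = select_and_split(
    clips, validation_split=0.25, min_clips_per_class=5,
    max_clips_per_class=20, random_state_seed=42)
print({name: (len(train[name]), len(validation[name])) for name in train})
print(suppress_nonmax_sketch([0.2, 0.7, 0.1]))
```

Reusing the same random_state_seed reproduces the same selection and split, which is the purpose of that parameter.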

add_post_transform_augmentation(probability, augmentation, *augmentation_args, **augmentation_kwargs)

Add a spectro-temporal augmentation to the feeder.

Parameters:
  • probability – Probability (>0 & ≤ 1) of application.

  • augmentation – Name of or reference to one of the available spectro-temporal augmentation classes, or reference to a user-implemented augmentation class.

  • augmentation_args – Positional arguments passed as-is to the augmentation class constructor.

  • augmentation_kwargs – Keyword arguments passed as-is to the augmentation class constructor.

Note

If you are subclassing DataFeeder, avoid using this function. Instead, apply the desired augmentations in the overridden implementation of post_transform() directly.

add_pre_transform_augmentation(probability, augmentation, *augmentation_args, **augmentation_kwargs)

Add a temporal augmentation to the feeder.

Parameters:
  • probability – Probability (>0 & ≤ 1) of application.

  • augmentation – Name of or reference to one of the available temporal augmentation classes, or reference to a user-implemented augmentation class.

  • augmentation_args – Positional arguments passed as-is to the augmentation class constructor.

  • augmentation_kwargs – Keyword arguments passed as-is to the augmentation class constructor.

Note

If you are subclassing DataFeeder, avoid using this function. Instead, apply the desired augmentations in the overridden implementation of pre_transform() directly.

post_transform(sample, label, is_training, **kwargs)

Applies desired post-transformation augmentations to a model input (i.e., to a transformed sample, such as a spectrogram). This method is not intended to be invoked directly; it will be invoked during model training/validation. Custom post-transformation augmentations can be implemented by subclassing DataFeeder and overriding this method with an implementation of the desired operations.

Parameters:
  • sample – A transformed model input (e.g., spectrogram).

  • label – A one-hot styled label array corresponding to the sample.

  • is_training – Flag (boolean) indicating whether the method is being invoked during training (True) or validation (False).

Returns:

A 2-tuple containing augmented sample and label.
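
As an illustration of overriding post_transform(), the sketch below masks a random band of frequency bins during training only. It avoids importing Koogu so that it runs on its own: FeederStandIn is a hypothetical stand-in that mirrors only the documented method signature, the spectrogram is a plain list of rows (one row per frequency bin), and BandMaskFeeder is an invented example rather than a Koogu augmentation.

```python
import random


class FeederStandIn:
    """Hypothetical stand-in for DataFeeder; mirrors only the hook signature."""

    def post_transform(self, sample, label, is_training, **kwargs):
        return sample, label


class BandMaskFeeder(FeederStandIn):
    """Invented example: blank out a random band of frequency bins."""

    def __init__(self, max_band_width, seed=None):
        self._max_band_width = max_band_width
        self._rng = random.Random(seed)

    def post_transform(self, sample, label, is_training, **kwargs):
        if not is_training:
            return sample, label  # leave validation inputs untouched
        num_bins = len(sample)
        width = self._rng.randint(1, min(self._max_band_width, num_bins))
        start = self._rng.randint(0, num_bins - width)
        floor = min(min(row) for row in sample)  # fill with the lowest value
        masked = [
            [floor] * len(row) if start <= i < start + width else list(row)
            for i, row in enumerate(sample)
        ]
        return masked, label


spectrogram = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
feeder = BandMaskFeeder(max_band_width=2, seed=0)
augmented, label = feeder.post_transform(spectrogram, [1, 0], is_training=True)
untouched, _ = feeder.post_transform(spectrogram, [1, 0], is_training=False)
print(augmented)
```

In a real subclass, the sample would be whatever the transform step produces, so the masking logic would need to be written for that data type.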

pre_transform(sample, label, is_training, **kwargs)

Applies desired pre-transformation augmentations to a model input (i.e., to an audio clip). This method is not intended to be invoked directly; it will be invoked during model training/validation. Custom pre-transformation augmentations can be implemented by subclassing DataFeeder and overriding this method with an implementation of the desired operations.

Parameters:
  • sample – Raw clip, as loaded from the stored .npz file.

  • label – A one-hot styled label array corresponding to the sample, as loaded from the stored .npz file.

  • is_training – Flag (boolean) indicating whether the method is being invoked during training (True) or validation (False).

Returns:

A 2-tuple containing augmented sample and label.
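
The following sketch shows one way an overridden pre_transform() might branch on is_training. It is deliberately independent of Koogu: FeederStandIn is a hypothetical stand-in that mirrors only the documented signature, clips are plain Python lists, and NoisyFeeder with its noise_std parameter is an invented example.

```python
import random


class FeederStandIn:
    """Hypothetical stand-in for DataFeeder; mirrors only the hook signature."""

    def pre_transform(self, sample, label, is_training, **kwargs):
        return sample, label


class NoisyFeeder(FeederStandIn):
    """Invented example: add Gaussian noise to clips during training only."""

    def __init__(self, noise_std=0.01, seed=None):
        self._noise_std = noise_std
        self._rng = random.Random(seed)

    def pre_transform(self, sample, label, is_training, **kwargs):
        if not is_training:
            return sample, label  # validation clips pass through unchanged
        noisy = [v + self._rng.gauss(0.0, self._noise_std) for v in sample]
        return noisy, label


feeder = NoisyFeeder(noise_std=0.01, seed=1)
clip, clip_label = [0.0, 0.5, -0.5], [0, 1]
train_clip, train_label = feeder.pre_transform(clip, clip_label, is_training=True)
val_clip, val_label = feeder.pre_transform(clip, clip_label, is_training=False)
print(train_clip, val_clip)
```

The label is returned unchanged here; an augmentation that alters class content would need to return an adjusted label as the second element of the tuple.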

transform(sample, label, is_training, **kwargs)

Applies desired transformation(s) to a model input (i.e., to an audio clip). This method is not intended to be invoked directly; it will be invoked during model training/validation. Custom data transformations can be implemented by subclassing DataFeeder and overriding this method with an implementation of the desired operations.

Parameters:
  • sample – Raw clip, as loaded from the stored .npz file.

  • label – A one-hot styled label array corresponding to the sample, as loaded from the stored .npz file.

  • is_training – Flag (boolean) indicating whether the method is being invoked during training (True) or validation (False).

Returns:

A 2-tuple containing transformed sample and label.
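
To show how an overridden transform() sits between the two augmentation hooks, here is a self-contained sketch. FeederStandIn and its process() method are hypothetical: the call order (pre_transform, then transform, then post_transform) is inferred from the method names and descriptions above, not taken from Koogu's source. FrameEnergyFeeder is an invented transform that computes mean energy per non-overlapping frame, standing in for a real spectral transform.

```python
class FeederStandIn:
    """Hypothetical stand-in for DataFeeder; mirrors only the hook signatures."""

    def pre_transform(self, sample, label, is_training, **kwargs):
        return sample, label

    def transform(self, sample, label, is_training, **kwargs):
        return sample, label  # like DataFeeder, feed the clip as-is

    def post_transform(self, sample, label, is_training, **kwargs):
        return sample, label

    def process(self, sample, label, is_training):
        # Assumed order of the three hooks (inferred, not confirmed).
        sample, label = self.pre_transform(sample, label, is_training)
        sample, label = self.transform(sample, label, is_training)
        return self.post_transform(sample, label, is_training)


class FrameEnergyFeeder(FeederStandIn):
    """Invented example: convert a clip into mean energy per frame."""

    def __init__(self, frame_len):
        self._frame_len = frame_len

    def transform(self, sample, label, is_training, **kwargs):
        n = self._frame_len
        # Non-overlapping frames; trailing samples short of a frame are dropped.
        frames = [sample[i:i + n] for i in range(0, len(sample) - n + 1, n)]
        energies = [sum(v * v for v in frame) / n for frame in frames]
        return energies, label


feeder = FrameEnergyFeeder(frame_len=2)
features, label = feeder.process([1.0, 1.0, 2.0, 2.0, 0.0], [0, 1],
                                 is_training=False)
print(features)  # [1.0, 4.0]
```

Because only transform() is overridden, the inherited pre_transform() and post_transform() remain pass-throughs, matching the documented default of applying no augmentations.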