Extensibles

Annotations

class koogu.data.annotations.BaseAnnotationReader(fetch_frequencies=False)

Base class for reading annotations from storage.

Within Koogu, the method __call__() (and others in chain) will be invoked from parallel threads of execution. Exercise caution if an implementation of this class needs to use and alter any member variables.

Parameters: fetch_frequencies – (boolean; default: False) If True, will also attempt to read annotations’ frequency bounds. NaNs will be returned for any missing values. If False, the respective item in the tuple returned from __call__() will be set to None.
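The caution about parallel invocation can be illustrated with a minimal, dependency-free sketch. The class below is a hypothetical stand-in (it does not derive from BaseAnnotationReader), and the names CountingReader and files_read are assumptions made purely for illustration. It shows one common way to protect a member variable that is altered on every call: guarding the update with a threading.Lock.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CountingReader:
    """Hypothetical stand-in for a reader that keeps mutable state."""

    def __init__(self):
        self._lock = threading.Lock()
        self.files_read = 0  # shared state altered on every call

    def __call__(self, source, **kwargs):
        # Guard the read-modify-write so concurrent calls do not lose updates.
        with self._lock:
            self.files_read += 1
        return source


reader = CountingReader()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(reader, [f"file_{i}.txt" for i in range(1000)]))
```

Implementations that only read member variables set in the constructor generally do not need such a guard.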

__call__(source, **kwargs)

Read annotations from file/database/etc., process as appropriate and return a 5-element tuple (see below).

Parameters:

source – Identifier of an annotation source (e.g., path to an annotation file).

Returns:

A 5-element tuple

N-length list of 2-element tuples denoting annotations’ start and end times
Either None or an N-length list of 2-element tuples denoting annotations’ frequency bounds
N-length list of tags/class labels
N-length list of channel indices (0-based)
(optional; set to None if not returning) N-length list of audio sources corresponding to the returned annotations

abstract _fetch(source, **kwargs)

Read annotations from file/database/etc., process as appropriate and return a 5-element tuple (see below).

Parameters: source – Identifier of an annotation source (e.g., path to an annotation file).

Implementations must return a 5-element tuple:

N-length list of 2-element tuples denoting annotations’ start and end times
None if fetch_frequencies=False, otherwise an N-length list of 2-element tuples denoting annotations’ frequency bounds
N-length list of tags/class labels
N-length list of channel indices (0-based)
(optional; set to None if not returning) N-length list of audio sources corresponding to the returned annotations
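As a rough illustration of the 5-element return contract, the following sketch parses a simple CSV file. It is written as a plain function so that it runs without Koogu installed; in a real subclass, comparable logic would live in _fetch(). The function name fetch_csv_annotations and the column layout (start, end, low_freq, high_freq, label, channel) are assumptions, not part of Koogu.

```python
import csv
import os
import tempfile


def fetch_csv_annotations(source, fetch_frequencies=False, **kwargs):
    """Hypothetical _fetch()-style reader for a simple CSV layout.

    Assumed columns: start, end, low_freq, high_freq, label, channel.
    """
    times, freqs, tags, channels = [], [], [], []
    with open(source, newline="") as handle:
        for row in csv.DictReader(handle):
            times.append((float(row["start"]), float(row["end"])))
            if fetch_frequencies:
                # Missing values become NaN, as the class description states.
                freqs.append(tuple(
                    float(row[key]) if row[key] else float("nan")
                    for key in ("low_freq", "high_freq")))
            tags.append(row["label"])
            channels.append(int(row["channel"]))
    # Item 2 is None when frequencies are not requested; item 5 (audio
    # sources) is optional and left as None here.
    return times, (freqs if fetch_frequencies else None), tags, channels, None


# Usage: write a small file, then read it back both ways.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "annots.csv")
    with open(path, "w", newline="") as handle:
        handle.write("start,end,low_freq,high_freq,label,channel\n"
                     "0.5,1.25,100,400,whale,0\n"
                     "2.0,3.0,,,boat,1\n")
    without_freqs = fetch_csv_annotations(path)
    with_freqs = fetch_csv_annotations(path, fetch_frequencies=True)
```

The second row has no frequency values, so with fetch_frequencies=True its bounds come back as a pair of NaNs rather than being dropped, which keeps all returned lists at the same length N.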

class koogu.data.annotations.BaseAnnotationWriter(write_frequencies=False)

Base class for writing annotations/detections to storage.

Within Koogu, the method __call__() (and others in chain) will be invoked from parallel threads of execution. Exercise caution if an implementation of this class needs to use and alter any member variables.

Parameters: write_frequencies – (boolean; default: False) If True, will also write out annotations’ frequency bounds. Depending on the implementation, appropriate defaults (blank spaces, NaNs, negative values, etc.) will be written in place of missing frequency values. If False, frequency values, even if provided, will not be written out, and the related structural constructs will not be created in the output file.

__call__(destination, times, labels, *args, **kwargs)

Write out annotations/detections to destination.

Parameters:

destination – Identifier of the target where annotations/detections will be written to (e.g., path to an annotation file).
times – An N-length list of 2-element list/tuple of start and end times.
labels – An N-length list of annotation/detection labels.
frequencies – An N-length list of 2-element list/tuple of low and high frequencies.

Returns:

Number of annotations/detections written.
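The writer contract (write the entries, honour write_frequencies, return the count) might be sketched as below. This is a standalone function rather than a BaseAnnotationWriter subclass, so it runs without Koogu. The function name, the CSV layout, and passing frequencies as a keyword argument are all assumptions; the documented signature only shows *args and **kwargs.

```python
import csv
import os
import tempfile


def write_csv_annotations(destination, times, labels, frequencies=None,
                          write_frequencies=False):
    """Hypothetical writer logic; returns the number of rows written."""
    header = ["start", "end"]
    if write_frequencies:
        header += ["low_freq", "high_freq"]  # columns exist only if enabled
    header.append("label")

    num_written = 0
    with open(destination, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for idx, ((start, end), label) in enumerate(zip(times, labels)):
            row = [start, end]
            if write_frequencies:
                # Blank cells stand in for missing frequency values.
                row += list(frequencies[idx]) if frequencies else ["", ""]
            row.append(label)
            writer.writerow(row)
            num_written += 1
    return num_written


# Usage: write two detections with and without frequency columns.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "detections.csv")
    num_written = write_csv_annotations(
        path, [(0.5, 1.25), (2.0, 3.0)], ["whale", "boat"],
        frequencies=[(100.0, 400.0), (50.0, 90.0)], write_frequencies=True)
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle))

    write_csv_annotations(
        path, [(0.5, 1.25)], ["whale"], frequencies=[(100.0, 400.0)])
    with open(path, newline="") as handle:
        plain_rows = list(csv.reader(handle))
```

In the second call the frequencies are supplied but write_frequencies is left at False, so neither the values nor their columns appear in the output.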

Feeder

class koogu.data.feeder.BaseFeeder(data_shape, num_training_samples, num_validation_samples, class_names, **kwargs)

Base class defining the interface for implementing feeder classes for building data pipelines in Koogu.

Parameters:

data_shape – Shape of the input samples presented to the model.
num_training_samples – List of per-class counts of training samples available.
num_validation_samples – List of per-class counts of validation samples available.
class_names – List of names (str) corresponding to the different classes in the problem space.

get_shape_transformation_info()

Override in inherited class if its transform() alters the shape of the read/input data before a dataset is returned. If not None, must return a tuple where:

first value is the untransformed input shape,
second is the actual transformation function.

abstract make_dataset(is_training, batch_size, **kwargs)

This function must be implemented in the derived class.

It should contain logic to load training & validation data (usually from stored files) and construct a TensorFlow Dataset.

Parameters:

is_training – (boolean) True if operating in training mode.
batch_size – (integer) Number of input samples from the dataset to combine in a single batch.

Returns:

A tf.data.Dataset

abstract post_transform(sample, label, is_training, **kwargs)

Implement this method in the derived class to apply any post-transformation augmentations to a single input to the model (during training and validation).

Parameters:

sample – The transformed sample to which to apply augmentations.
label – The class info pertaining to sample.
is_training – (boolean) True if operating in training mode.
kwargs – Any additional parameters.

Returns:

A 2-tuple containing transformed sample and label.

abstract pre_transform(sample, label, is_training, **kwargs)

Implement this method in the derived class to apply any pre-transformation augmentations to a single input to the model (during training and validation).

Parameters:

sample – The untransformed sample to which to apply augmentations.
label – The class info pertaining to sample.
is_training – (boolean) True if operating in training mode.
kwargs – Any additional parameters.

Returns:

A 2-tuple containing transformed sample and label.

abstract transform(sample, label, is_training, **kwargs)

This function must be implemented in the derived class.

It should contain logic to apply any transformations to a single input to the model (during training and validation).

Parameters:

sample – The sample that must be ‘transformed’ before consumption by a model.
label – The class info pertaining to sample.
is_training – (boolean) True if operating in training mode.
kwargs – Any additional parameters.

Returns:

A 2-tuple containing transformed sample and label.
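The three per-sample hooks share one contract: each takes (sample, label, is_training) and returns a 2-tuple of sample and label. The sketch below illustrates that contract with plain Python lists so that it runs without TensorFlow; in Koogu these methods would operate on tensors inside a tf.data pipeline. The class ToyFeeder, the helper run_chain, and the call order pre_transform, transform, post_transform (inferred from the method names) are assumptions.

```python
class ToyFeeder:
    """Hypothetical, dependency-free stand-in for a feeder subclass."""

    def pre_transform(self, sample, label, is_training, **kwargs):
        # Augment the raw sample only while training (here: a fixed gain).
        if is_training:
            sample = [2.0 * value for value in sample]
        return sample, label

    def transform(self, sample, label, is_training, **kwargs):
        # The 'transformation' proper (here: square every value).
        return [value ** 2 for value in sample], label

    def post_transform(self, sample, label, is_training, **kwargs):
        # Augment the transformed sample (here: clip to a ceiling).
        return [min(value, 10.0) for value in sample], label


def run_chain(feeder, sample, label, is_training):
    # Assumed order: pre_transform -> transform -> post_transform.
    for step in (feeder.pre_transform, feeder.transform,
                 feeder.post_transform):
        sample, label = step(sample, label, is_training)
    return sample, label


train_out = run_chain(ToyFeeder(), [1.0, 2.0], 0, is_training=True)
valid_out = run_chain(ToyFeeder(), [1.0, 2.0], 0, is_training=False)
```

Because every hook returns the same (sample, label) shape of result, the steps compose without any glue code, and is_training lets each one decide whether an augmentation applies.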

property class_names: List of names (str) of the classes in the application.

property data_shape: The shape of an input sample.

property num_classes: The number of classes in the application.

property training_samples: List of per-class training samples available.

property training_samples_per_class: List of per-class validation samples available.

property validation_samples: Total number of training samples available.

property validation_samples_per_class: Total number of validation samples available.
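The relationship between the constructor arguments and the count properties can be sketched as follows. This assumes that the totals are simply the sums of the per-class lists and that the number of classes is the length of class_names; the values and variable names are hypothetical.

```python
# Hypothetical constructor inputs for a 3-class problem.
class_names = ["whale", "boat", "noise"]
num_training_samples = [120, 80, 40]    # per-class training counts
num_validation_samples = [30, 20, 10]   # per-class validation counts

# Derived quantities, under the assumptions stated above.
num_classes = len(class_names)
total_training = sum(num_training_samples)
total_validation = sum(num_validation_samples)
```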

Model architecture

class koogu.model.architectures.BaseArchitecture(multilabel=True, dtype=None, name=None)

Base class for implementing custom user-defined architectures.

Parameters:

multilabel – (bool; default: True) Set appropriately so that the loss function and accuracy metrics can be chosen correctly. A multilabel model’s Logits (final) layer will have Sigmoid activation whereas a single-label model’s will have SoftMax activation.
dtype – Tensorflow data type of the model’s weights (default: tf.float32).
name – Name of the model.
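The practical difference between the two activations mentioned for multilabel can be shown with a small pure-Python sketch (no TensorFlow needed). The logits are made-up values; the functions are textbook definitions, not Koogu code.

```python
import math


def sigmoid(logits):
    # Each class is scored independently; scores need not sum to 1.
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def softmax(logits):
    # Scores compete across classes and always sum to 1.
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 2.0, -1.0]
multilabel_scores = sigmoid(logits)    # two classes can both score high
single_label_scores = softmax(logits)  # probability mass is shared
```

With Sigmoid, both of the first two classes can score well above 0.5 at the same time, which suits clips that may contain several classes. With SoftMax the same logits split the probability mass, so no more than one class can exceed 0.5.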

abstract build_network(input_tensor, is_training, **kwargs)

This method must be implemented in the derived class.

It should contain logic to construct the desired sequential or functional model network starting from the input_tensor.

Note

Do not add the Logits layer in your implementation. It will be added by internal code.

Parameters:

input_tensor – The Keras tensor to use as the input placeholder in the model that will be built.
is_training – (boolean) Indicates if operating in training mode. Certain elements of the network (e.g., dropout layers) may be excluded when not in training mode.

Returns:

Must return a Keras tensor corresponding to outputs of the architecture.

Assessment metric

class koogu.utils.assessments.BaseMetric(audio_annot_list, raw_results_root, annots_root, annotation_reader=None, reject_classes=None, remap_labels_dict=None, negative_class_label=None, **kwargs)

Base class for implementing performance assessment logic.

Parameters:

audio_annot_list – A list containing pairs (tuples or sub-lists) of relative paths to audio files and the corresponding annotation files. Alternatively, you could also specify (path to) a 2-column csv file containing these pairs of entries (in the same order). Only use the csv option if the paths are simple (i.e., the filenames do not contain commas or other special characters).
raw_results_root – Base directory used to resolve the full paths of the raw result container files, whose filenames are derived from the audio files listed in audio_annot_list.
annots_root – Base directory used to resolve the full paths of the annotation files listed in audio_annot_list.
annotation_reader – If not None, must be an annotation reader instance from koogu.data.annotations. Defaults to Raven Reader.
reject_classes – Name (case sensitive) of the class (like ‘Noise’ or ‘Other’) for which performance assessments are not to be computed. To reject multiple classes, specify a list of names.
remap_labels_dict – If not None, must be a Python dictionary describing mapping of class labels. For details, see similarly named parameter to the constructor of koogu.utils.detections.LabelHelper.
negative_class_label – A string (e.g. ‘Other’, ‘Noise’) which will be used as a label to identify the negative class clips (those that did not match any annotations), if an inherited class deals with those. If specified, will be used in conjunction with remap_labels_dict.
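To make the audio_annot_list and annots_root parameters concrete, the sketch below parses the 2-column CSV form into pairs and resolves the annotation paths against a base directory. The helper names are hypothetical and the CSV content is passed as text to keep the example self-contained; this is not how Koogu itself loads the list.

```python
import csv
import io
import os


def load_audio_annot_pairs(csv_text):
    """Parse 2-column CSV text into (audio_path, annotation_path) pairs."""
    return [(row[0].strip(), row[1].strip())
            for row in csv.reader(io.StringIO(csv_text)) if len(row) >= 2]


def resolve_annotation_paths(pairs, annots_root):
    # Annotation paths in the list are relative to annots_root.
    return [os.path.join(annots_root, annot) for _, annot in pairs]


pairs = load_audio_annot_pairs("rec1.wav,rec1.txt\nrec2.wav,rec2.txt\n")
annot_paths = resolve_annotation_paths(pairs, "annots")
```

The same pairs could equally be supplied directly as a list of tuples or sub-lists, which avoids the CSV restrictions on commas and special characters in filenames.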

assess(**kwargs): Perform the desired assessments.