Annotations and Detections
- class koogu.utils.detections.LabelHelper(classes_list, remap_labels_dict=None, negative_class_label=None, fixed_labels=True, assessment_mode=False)
Provides functionality for manipulating and managing class labels in a problem space, without resorting to altering selection tables.
- Parameters:
classes_list – List of class labels. When used during data preparation, the list may be generated from available classes or be provided as a pre-defined list. When used during performance assessments, it is typically populated from the classes_list.json file that is saved alongside raw detections.
remap_labels_dict – (default: None) If not None, must be a dictionary describing mapping of class labels. Use this to update existing class labels (e.g. {'c1': 'new_c1'}), merge together existing classes (e.g. {'c4': 'c1'}), or combine existing classes into new ones (e.g. {'c4': 'new_c2', 'c23': 'new_c2'}). Avoid chaining of mappings (e.g. {'c1': 'c2', 'c2': 'c3'}).
negative_class_label – (default: None) If not None, must be a string (e.g. ‘Other’, ‘Noise’) which will be used as a label to identify the negative class clips (those that did not match any annotations). If specified, will be used in conjunction with remap_labels_dict.
fixed_labels – (bool; default: True) When True, classes_list will remain unchanged: any new mapping targets specified in remap_labels_dict will not be added and any mapped-out class labels will not be omitted. Typically, it should be set to True when classes_list is a pre-defined list during data preparation, and always during performance assessments.
assessment_mode – (bool; default: False) Set to True when invoked during performance assessments.
See also
koogu.data.preprocess.from_selection_table_map()
koogu.data.preprocess.from_top_level_dirs()
koogu.utils.assessments.BaseMetric()
- property classes_list
The final list of class names in the problem space, after performing manipulations based on remap_labels_dict (if specified).
- property labels_to_indices
A Python dictionary mapping class names (string) to zero-based indices.
- property negative_class_index
Index (zero-based) of the negative class (if specified) in classes_list.
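The effect of remap_labels_dict and fixed_labels on the final class list can be illustrated with a small standalone sketch. This is not koogu's implementation: the helper name remap_classes, the third return value, and the sorted ordering of the final list are all assumptions made for illustration.

```python
def remap_classes(classes_list, remap_labels_dict=None, fixed_labels=True):
    """Hypothetical helper (not part of koogu) illustrating label remapping."""
    remap = remap_labels_dict or {}
    if fixed_labels:
        # The class list stays exactly as given.
        final = list(classes_list)
    else:
        # Omit mapped-out labels, then add any new mapping targets.
        kept = [c for c in classes_list if c not in remap]
        new_targets = [t for t in remap.values() if t not in kept]
        final = sorted(set(kept) | set(new_targets))  # ordering is an assumption
    labels_to_indices = {c: i for i, c in enumerate(final)}
    # Where each original label ends up (None if its target is not in the list).
    resolved = {c: labels_to_indices.get(remap.get(c, c)) for c in classes_list}
    return final, labels_to_indices, resolved


# Merging class 'c4' into 'c1'.
final, labels_to_indices, resolved = remap_classes(
    ['c1', 'c2', 'c4'], {'c4': 'c1'}, fixed_labels=False)
print(final)              # ['c1', 'c2']
print(labels_to_indices)  # {'c1': 0, 'c2': 1}
print(resolved)           # {'c1': 0, 'c2': 1, 'c4': 0}
```

With fixed_labels=True the same call would leave the list as ['c1', 'c2', 'c4'], which is the behaviour the parameter description recommends for pre-defined lists.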
- koogu.utils.detections.assess_annotations_and_clips_match(clip_offsets, clip_len, num_classes, annots_times, annots_class_idxs, min_annot_overlap_fraction=1.0, keep_only_centralized_annots=False, negative_class_idx=None, max_nonmatch_overlap_fraction=0.0)
Match clips to annotations and return “coverage scores” and a mask of ‘matched annotations’. Coverage score is a value between 0.0 and 1.0 and describes how much of a particular class’ annotation(s) is/are covered by each clip.
- Parameters:
clip_offsets – M-length array of start samples (offset from the start of the audio file) of M clips.
clip_len – Number of waveform samples in each clip.
num_classes – Number of classes in the given application.
annots_times – A numpy array (shape Nx2) of start-end pairs defining annotations’ temporal extents, in terms of sample indices.
annots_class_idxs – An N-length list of zero-based indices to the class corresponding to each annotation.
min_annot_overlap_fraction – Lower threshold on how much coverage a clip must have with an annotation for the annotation to be considered “matched”.
keep_only_centralized_annots – If enabled (default: False), very short annotations (< half of clip_len) will generate full coverage (1.0) only if they occur within the central 50% extents of the clip or if the annotation cuts across the center of the clip. For short annotations that do not satisfy these conditions, their normally-computed coverage value will be scaled down based on the annotation’s distance from the center of the clip.
negative_class_idx – If not None, clips that have no (or small) overlap with any annotation will be marked as clips of the non-target class whose index this parameter specifies. See max_nonmatch_overlap_fraction for further control.
max_nonmatch_overlap_fraction – A clip without enough overlap with any annotations will be marked as non-target class only if its overlap with any annotation is less than this amount (default 0.0). This parameter is only used when negative_class_idx is set.
- Returns:
A 2-element tuple containing -
MxP “coverage” matrix corresponding to the M clips and P classes. The values in the matrix will be:
1.0 - if either the m-th clip fully contained an annotation from the p-th class or vice versa (possible when the annotation is longer than clip_len);
<1.0 - if there was partial coverage (the number of overlapping samples is divided by the shorter of clip_len or annotation length);
0.0 - if the m-th clip had no overlap with any annotations from the p-th class.
N-length boolean mask of annotations that were matched with at least one clip under the condition of min_annot_overlap_fraction.
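The coverage values described above can be reproduced with a short standalone sketch. It is not koogu's code. The helper names are hypothetical, intervals are assumed to be half-open in sample indices, and taking the maximum when several annotations of one class overlap a clip is an assumption. The keep_only_centralized_annots scaling, the negative-class handling and the matched-annotations mask are left out.

```python
def coverage_score(clip_start, clip_len, annot_start, annot_end):
    """Overlapping samples divided by the shorter of clip and annotation."""
    clip_end = clip_start + clip_len
    overlap = min(clip_end, annot_end) - max(clip_start, annot_start)
    if overlap <= 0:
        return 0.0
    return overlap / min(clip_len, annot_end - annot_start)


def coverage_matrix(clip_offsets, clip_len, num_classes,
                    annots_times, annots_class_idxs):
    """Build an M x P list-of-lists of coverage scores (hypothetical helper)."""
    matrix = [[0.0] * num_classes for _ in clip_offsets]
    for m, offset in enumerate(clip_offsets):
        for (start, end), cls in zip(annots_times, annots_class_idxs):
            score = coverage_score(offset, clip_len, start, end)
            # Assumption: keep the best score per (clip, class) pair.
            matrix[m][cls] = max(matrix[m][cls], score)
    return matrix


# Four 100-sample clips; a short class-0 annotation and a long class-1 one.
cov = coverage_matrix(
    clip_offsets=[0, 50, 100, 150], clip_len=100, num_classes=2,
    annots_times=[(20, 60), (120, 400)], annots_class_idxs=[0, 1])
for row in cov:
    print(row)
```

The first clip fully contains the short annotation, and the last clip is fully contained in the long one, so both cases yield 1.0; the clips in between only partially overlap and score below 1.0.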
- koogu.utils.detections.assess_annotations_and_detections_match(num_classes, gt_times, gt_labels, det_times, det_labels, min_gt_coverage=0.5, min_det_usage=0.5)
Match elements describing time-spans from two collections. Typically, one collection corresponds to ground-truth (gt) temporal extents and the other collection corresponds to detection (det) temporal extents.
- Parameters:
num_classes – Number of classes of the various time-events.
gt_times – Mx2 numpy array representing the start-end times of M ground-truth events.
gt_labels – M-length integer array indicating the class of each of the M ground-truth events.
det_times – Nx2 numpy array representing the start-end times of N detection events.
det_labels – N-length integer array indicating the class of each of the N detection events.
min_gt_coverage – A floating point value (in the range 0-1) indicating the minimum fraction of a ground-truth event that must be covered by one or more detections for it to be considered “recalled”.
min_det_usage – A floating point value (in the range 0-1) indicating the minimum fraction of a detection event that must have covered parts of one or more ground-truth events for it to be considered a “true positive”.
- Returns:
A 5-element tuple containing -
per-class counts of true positives
per-class counts of detections (true + false positives)
numerator for computing recall (note that given our definition of ‘true positive’ and ‘recall’, this value may not be the same as the per-class counts of true positives).
mask of ground-truth events that were “recalled”
mask of detections that were true positives
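The note about the recall numerator differing from the true positive count follows from the two thresholds being applied in opposite directions. The standalone sketch below illustrates this. It is not koogu's implementation: the helper names are hypothetical, and treating a fraction equal to the threshold as a match (>=) is an assumption.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into a disjoint, sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_fraction(event, others):
    """Fraction of `event` covered by the union of the intervals in `others`."""
    ev_start, ev_end = event
    clipped = [(max(s, ev_start), min(e, ev_end)) for s, e in others]
    clipped = [(s, e) for s, e in clipped if e > s]  # drop non-overlapping
    covered = sum(e - s for s, e in merge_intervals(clipped))
    return covered / (ev_end - ev_start)


def match_events(gt_times, gt_labels, det_times, det_labels,
                 min_gt_coverage=0.5, min_det_usage=0.5):
    """Return (gt_recalled, det_is_tp) masks; only same-class events interact."""
    def same_class(times, labels, wanted):
        return [t for t, lbl in zip(times, labels) if lbl == wanted]

    gt_recalled = [
        covered_fraction(gt, same_class(det_times, det_labels, lbl)) >= min_gt_coverage
        for gt, lbl in zip(gt_times, gt_labels)]
    det_is_tp = [
        covered_fraction(det, same_class(gt_times, gt_labels, lbl)) >= min_det_usage
        for det, lbl in zip(det_times, det_labels)]
    return gt_recalled, det_is_tp


gt_recalled, det_is_tp = match_events(
    gt_times=[(0.0, 10.0), (20.0, 30.0)], gt_labels=[0, 0],
    det_times=[(1.0, 4.0), (5.0, 9.0), (28.0, 40.0)], det_labels=[0, 0, 0])
print(gt_recalled)  # [True, False]
print(det_is_tp)    # [True, True, False]
```

Here two detections jointly cover 70% of the first ground-truth event, so there are two true positives but only one recalled event.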
- koogu.utils.detections.postprocess_detections(clip_scores, clip_offsets, clip_length, threshold=None, suppress_nonmax=False, squeeze_min_samps=None)
Post-process detections to group together successive detections from each class.
- Parameters:
clip_scores – An [N x M] array containing M per-class scores for each of the N clips.
clip_offsets – An N-length integer array containing indices of the first sample in each clip.
clip_length – Number of waveform samples in each clip.
threshold – (default: None) If not None, scores below this value will be ignored.
suppress_nonmax – (bool; default: False) If True, will apply non-max suppression to only consider the top-scoring class for each clip.
squeeze_min_samps – (default: None) If not None, will run the algorithm to squish contiguous detections of the same class. Squeezing will be limited to produce detections that are at least this many samples long.
- Returns:
A 3-element or 4-element tuple containing -
sample indices (array of start and end pairs),
aggregated scores,
class IDs, and
if requested, start-end indices making up each combined streak.
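Grouping successive detections can be pictured with the standalone sketch below. It is not koogu's algorithm. The helper name group_streaks is hypothetical, the inclusive end index and the use of the maximum as the aggregated score are assumptions, and neither non-max suppression nor squeezing is shown.

```python
def group_streaks(clip_scores, clip_offsets, clip_length, threshold):
    """Hypothetical helper: group runs of consecutive above-threshold clips."""
    detections = []
    num_classes = len(clip_scores[0])
    for class_id in range(num_classes):
        streak = []  # indices of consecutive clips scoring >= threshold
        # The trailing None sentinel flushes a streak that reaches the last clip.
        for idx, row in enumerate(list(clip_scores) + [None]):
            if row is not None and row[class_id] >= threshold:
                streak.append(idx)
            elif streak:
                start = clip_offsets[streak[0]]
                # Inclusive end sample (an assumption of this sketch).
                end = clip_offsets[streak[-1]] + clip_length - 1
                # Aggregating by maximum is also an assumption.
                score = max(clip_scores[i][class_id] for i in streak)
                detections.append((start, end, score, class_id))
                streak = []
    return detections


# Five clips of 1000 samples with a 500-sample hop, two classes.
scores = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.95], [0.85, 0.1]]
offsets = [0, 500, 1000, 1500, 2000]
for det in group_streaks(scores, offsets, clip_length=1000, threshold=0.5):
    print(det)
# (0, 1499, 0.9, 0)
# (2000, 2999, 0.85, 0)
# (1000, 2499, 0.95, 1)
```

Each output tuple holds the start sample, end sample, aggregated score and class ID of one combined streak.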