Quick-start guide
This guide presents a recipe for a full bioacoustics ML workflow, from data pre-processing through training and performance assessment, to using a trained model to analyze soundscape/field recordings. End-to-end examples are presented in two flavors:
CLI workflow: a “no code” command-line interface
API workflow: a programmatic (Python) approach
Both approaches use the same underlying functionality and produce identical results. You can try them out yourself using the sample dataset below. Once you have it working, you can modify the program/config to suit your own dataset.
Sample dataset
Both workflows use the North Atlantic Right Whale (NARW) up-call dataset from the DCLDE 2013 challenge. The dataset contains 7 days of round-the-clock recordings, of which recordings from the first 4 days were earmarked as the training set and recordings from the remaining 3 days were set aside as the test set. Each audio file is 15 minutes in duration, and files from each day are organized in day-specific subdirectories. The original dataset contained annotations in the legacy XBAT format, which we converted to RavenPro selection table format for compatibility with Koogu. A representative subset of the dataset, with converted annotations, can be accessed here.
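For reference, a RavenPro selection table is a plain tab-separated text file, one row per annotated event. The excerpt below is a hypothetical illustration: the column headers are standard Raven fields, but the row values are invented for this sketch and are not taken from the dataset (columns are shown space-aligned here for readability; actual files use tabs):

```
Selection  View           Channel  Begin Time (s)  End Time (s)  Low Freq (Hz)  High Freq (Hz)
1          Spectrogram 1  1        109.319         110.476       58.0           213.4
2          Spectrogram 1  1        286.102         287.231       71.5           242.9
```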
The workflows expect the training and test audio files and corresponding annotation files to be organized in a directory structure as shown below:
📁 projects
└─ 📁 NARW
   └─ 📁 data
      ├─ 📁 train_audio
      ├─ 📁 train_annotations
      ├─ 📁 test_audio
      └─ 📁 test_annotations
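If you are setting this up by hand, a short Python snippet using only the standard library can create the expected layout. This is a minimal sketch; the location of the `projects` root is an assumption, so adjust the path to suit your machine:

```python
from pathlib import Path

# Assumed location of the project data root; adjust as needed.
data_root = Path('projects') / 'NARW' / 'data'

# Create the four subdirectories that the workflows expect.
for subdir in ('train_audio', 'train_annotations',
               'test_audio', 'test_annotations'):
    (data_root / subdir).mkdir(parents=True, exist_ok=True)
```

After downloading the sample dataset, copy the day-specific audio subdirectories into train_audio and test_audio, and the corresponding selection tables into train_annotations and test_annotations.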