Data transformation
Certain data transformations that are unavailable in TensorFlow/Keras are implemented as custom Keras layers in Koogu.
- class koogu.data.tf_transformations.Audio2Spectral(*args: Any, **kwargs: Any)
Layer for converting waveforms into time-frequency representations.
- Parameters:
fs – sampling frequency of the data in the last dimension of inputs.
spec_settings –
A Python dictionary describing the settings to be used for producing spectrograms. Supported keys in the dictionary include:
win_len: (required) Length of the analysis window (in seconds)
win_overlap_prc: (required) Fraction of the analysis window that overlaps between successive analysis windows. A 50% overlap (0.50) is a common choice.
nfft_equals_win_len: (optional; boolean) If True (default), NFFT will equal the number of samples resulting from win_len. If False, NFFT will be set to the next power of 2 that is ≥ the number of samples resulting from win_len.
tf_rep_type: (optional) A string specifying the transformation output. ‘spec’ results in a linear scale spectrogram. ‘spec_db’ (default) results in a logarithmic scale (dB) spectrogram.
eps: (default: 1e-10) A small positive quantity added to avoid computing log(0.0).
bandwidth_clip: (optional; 2-element list/tuple) If specified, the generated spectrogram will be clipped along the frequency axis to only include components in the specified bandwidth.
eps – (optional) If specified, will override the eps value in spec_settings.
name – (optional; string) Name for the layer.
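To make the relationship between fs, win_len, win_overlap_prc and nfft_equals_win_len concrete, the sketch below converts a spec_settings dictionary into sample counts. The helper derive_stft_params is hypothetical and not part of Koogu; the rounding of fractional sample counts is an assumption, and the library may round differently.

```python
def derive_stft_params(fs, spec_settings):
    """Hypothetical helper (not part of Koogu): convert spec_settings,
    given in seconds and fractions, into sample counts."""
    # Window length in samples (rounding to nearest is an assumption)
    win_len_samples = int(round(spec_settings['win_len'] * fs))

    # Overlap in samples, as a fraction of the window length
    overlap_samples = int(round(
        spec_settings['win_overlap_prc'] * win_len_samples))

    if spec_settings.get('nfft_equals_win_len', True):
        # Default: NFFT equals the window length in samples
        nfft = win_len_samples
    else:
        # Next power of 2 that is >= the window length in samples
        nfft = 1 << (win_len_samples - 1).bit_length()

    return {
        'win_len_samples': win_len_samples,
        'overlap_samples': overlap_samples,
        'hop_samples': win_len_samples - overlap_samples,
        'nfft': nfft,
    }


params = derive_stft_params(
    fs=1000,
    spec_settings={
        'win_len': 0.1,
        'win_overlap_prc': 0.5,
        'nfft_equals_win_len': False,
    },
)
print(params)
```

With a 0.1 s window at 1000 Hz, the window spans 100 samples, so setting nfft_equals_win_len to False pads the FFT length up to 128.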
- class koogu.data.tf_transformations.GaussianBlur(*args: Any, **kwargs: Any)
Layer for applying Gaussian blur to time-frequency (tf) representations.
- Parameters:
sigma – Scalar value defining the width (standard deviation) of the Gaussian kernel.
apply_2d – (boolean; default: True) If True, will apply smoothing along both the time and frequency axes. Otherwise, smoothing is only applied along the frequency axis.
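The effect of sigma and apply_2d can be illustrated with a plain-Python separable Gaussian smoother. This is a minimal sketch and not Koogu's implementation: the kernel radius of 3 × sigma, the edge handling (edge values are repeated), and the frequency-by-time layout of the input are all assumptions, and the function names are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel; the 3-sigma radius is an assumption."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_1d(values, kernel):
    """Convolve a 1-D sequence with the kernel, repeating edge values."""
    radius = len(kernel) // 2
    last = len(values) - 1
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # clamp index to valid range
            acc += w * values[j]
        out.append(acc)
    return out


def gaussian_blur(spec, sigma, apply_2d=True):
    """Blur spec[f][t] (rows: frequency bins, columns: time frames)."""
    kernel = gaussian_kernel(sigma)
    n_freq, n_time = len(spec), len(spec[0])

    # Always smooth along the frequency axis (down each column)
    cols = [smooth_1d([spec[f][t] for f in range(n_freq)], kernel)
            for t in range(n_time)]
    out = [[cols[t][f] for t in range(n_time)] for f in range(n_freq)]

    if apply_2d:
        # Additionally smooth along the time axis (across each row)
        out = [smooth_1d(row, kernel) for row in out]
    return out


# A single time frame with energy in every frequency bin
spec = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
freq_only = gaussian_blur(spec, sigma=1.0, apply_2d=False)
both_axes = gaussian_blur(spec, sigma=1.0, apply_2d=True)
```

Because the example input is constant along frequency, frequency-only smoothing leaves it essentially unchanged, while the two-axis variant spreads the energy into neighbouring time frames.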
- class koogu.data.tf_transformations.Linear2dB(*args: Any, **kwargs: Any)
Layer for converting time-frequency (tf) representations from linear to decibel scale.
- Parameters:
eps – Small positive value added to avoid computing log(0.0).
full_scale – (boolean) Whether to convert to dB full-scale.
name – (optional; string) Name for the layer.
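The role of eps in a linear-to-decibel conversion can be shown with a short plain-Python sketch. It assumes the input holds power values, so the conversion is 10 · log10(x + eps); whether Koogu uses this exact formula is not stated here, and the full_scale option is not illustrated. The function name linear_to_db is hypothetical.

```python
import math


def linear_to_db(spec, eps=1e-10):
    """Hypothetical sketch: convert linear power values to decibels.

    Adding eps keeps the argument of log10 strictly positive,
    so zero-valued bins map to a finite floor instead of -inf.
    """
    return [[10.0 * math.log10(v + eps) for v in row] for row in spec]


db = linear_to_db([[1.0, 0.0]])
print(db)
```

With the default eps of 1e-10, a zero-valued bin maps to a floor of about -100 dB, while a bin with unit power stays at approximately 0 dB.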