Acoustic fingerprinting and graphical soundscapes

Acoustic fingerprinting is a technique that captures distinctive features of audio signals. For example, Shazam employs a spectrogram-based approach: it converts audio into a time-frequency representation and identifies peaks on the spectrogram [1]. The resulting fingerprint is matched against a large database to identify the corresponding song. The method is robust in the presence of noise, allowing accurate recognition of diverse audio sources in real time. The same approach can also be used to characterize soundscapes. It has been employed to evaluate FSC forest certification [2] and Neotropical oil palm landscapes [3].
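
The matching step described above can be illustrated with a toy example. The sketch below is not Shazam's code or part of scikit-maad; it is a minimal, pure-Python illustration of the idea in [1] of pairing spectrogram peaks into hashes and voting on a consistent time offset. The function names (`fingerprint`, `build_index`, `best_match`), the parameters `fan_out` and `max_dt`, and the peak coordinates are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def fingerprint(peaks, fan_out=3, max_dt=10):
    """Pair each peak with a few later peaks.

    peaks: list of (time_index, freq_index) tuples.
    Returns a list of ((f_anchor, f_target, dt), t_anchor) tuples.
    """
    peaks = sorted(peaks)
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # peaks are time-sorted, so later ones are even farther
            if dt <= 0:
                continue  # skip peaks in the same time frame
            hashes.append(((f1, f2, dt), t1))
            paired += 1
            if paired == fan_out:
                break
    return hashes


def build_index(tracks):
    """Map every hash to the (track name, anchor time) pairs where it occurs."""
    index = defaultdict(list)
    for name, peaks in tracks.items():
        for h, t in fingerprint(peaks):
            index[h].append((name, t))
    return index


def best_match(index, query_peaks):
    """Vote on (track, time offset); a true match piles votes on one offset."""
    votes = Counter()
    for h, t_query in fingerprint(query_peaks):
        for name, t_track in index.get(h, []):
            votes[(name, t_track - t_query)] += 1
    if not votes:
        return None
    (name, offset), count = votes.most_common(1)[0]
    return name, offset, count


# Hypothetical peak lists standing in for two reference recordings.
tracks = {
    "song_a": [(0, 10), (2, 30), (5, 12), (7, 40), (9, 22), (12, 35)],
    "song_b": [(0, 15), (3, 18), (4, 44), (8, 11), (10, 27), (13, 5)],
}
index = build_index(tracks)

# Excerpt of song_a starting two frames in, plus one spurious noise peak.
query = [(0, 30), (3, 12), (4, 50), (5, 40), (7, 22)]
print(best_match(index, query))
```

Because each hash depends only on two frequencies and their time difference, the excerpt can start anywhere in the recording, and a spurious peak only adds hashes that find no partner in the index.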

Load required modules

from maad import sound, util, rois, features

Local maxima on spectrograms

Load the audio file, compute the spectrogram, and find its local maxima.

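# Load the recording and compute its power spectrogram
s, fs = sound.load('../../data/spinetail.wav')
Sxx, tn, fn, ext = sound.spectrogram(s, fs, nperseg=1024, noverlap=512)
# Convert to decibels, limiting the dynamic range to 80 dB
Sxx_db = util.power2dB(Sxx, db_range=80)
# Keep local maxima above -40 dB and plot them over the spectrogram
peak_time, peak_freq = rois.spectrogram_local_max(
    Sxx_db, tn, fn, ext, min_distance=1, threshold_abs=-40, display=True)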
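
To make the roles of `min_distance` and `threshold_abs` more concrete, here is a minimal NumPy sketch of local-maximum detection on a 2D array. It is an illustration under stated assumptions, not the scikit-maad implementation: the helper `local_max_2d` is hypothetical, it treats `min_distance` as the half-width of a square neighbourhood, and it reports every cell of a flat plateau as a peak.

```python
import numpy as np


def local_max_2d(S, min_distance=1, threshold_abs=-40.0):
    """Return (rows, cols) of cells that are the maximum of their square
    neighbourhood of half-width min_distance and exceed threshold_abs."""
    S = np.asarray(S, dtype=float)
    d = int(min_distance)
    n_rows, n_cols = S.shape
    # Pad with -inf so that border cells are compared only to real neighbours.
    padded = np.pad(S, d, mode="constant", constant_values=-np.inf)
    neighbourhood_max = np.full(S.shape, -np.inf)
    # Slide the array over every offset of the (2d+1) x (2d+1) window.
    for dr in range(2 * d + 1):
        for dc in range(2 * d + 1):
            window = padded[dr:dr + n_rows, dc:dc + n_cols]
            neighbourhood_max = np.maximum(neighbourhood_max, window)
    is_peak = (S >= neighbourhood_max) & (S > threshold_abs)
    return np.nonzero(is_peak)


# Toy "spectrogram" in dB: rows are frequency bins, columns are time frames.
S = np.full((5, 6), -80.0)
S[1, 2] = -10.0  # strong peak
S[3, 4] = -25.0  # weaker peak, still above the threshold
S[3, 1] = -60.0  # local maximum, but below the -40 dB threshold

rows, cols = local_max_2d(S, min_distance=1, threshold_abs=-40.0)
print(list(zip(rows.tolist(), cols.tolist())))
```

With real data, the row and column indices would be mapped to frequency and time through the `fn` and `tn` vectors returned by the spectrogram function.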

Graphical soundscapes

If we compute the local maxima over multiple audio recordings from the same site, we obtain a graphical representation of the most prominent spectro-temporal dynamics over a 24-hour window. To illustrate this, we will use 96 audio recordings collected in a temperate forest, which are available [here](https://github.com/scikit-maad/scikit-maad/tree/production/data/indices).

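# Build a table of the audio files in the directory, including recording dates
df = util.get_metadata_dir('../../data/indices')
# Group recordings by hour of the day
df['time'] = df.date.dt.hour
# Detect spectrogram peaks in each file and aggregate them by hour
gs = features.graphical_soundscape(
    data=df, time='time', threshold_abs=-80, target_fs=22000)
features.plot_graph(gs)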
96 files found to process...
Processing file S4A03895_20190522_000000.wav
Processing file S4A03895_20190522_001500.wav
Processing file S4A03895_20190522_003000.wav
Processing file S4A03895_20190522_004500.wav
Processing file S4A03895_20190522_010000.wav
Processing file S4A03895_20190522_011500.wav
Processing file S4A03895_20190522_013000.wav
Processing file S4A03895_20190522_014500.wav
Processing file S4A03895_20190522_020000.wav
Processing file S4A03895_20190522_021500.wav
Processing file S4A03895_20190522_023000.wav
Processing file S4A03895_20190522_024500.wav
Processing file S4A03895_20190522_030000.wav
Processing file S4A03895_20190522_031500.wav
Processing file S4A03895_20190522_033000.wav
Processing file S4A03895_20190522_034500.wav
Processing file S4A03895_20190522_040000.wav
Processing file S4A03895_20190522_041500.wav
Processing file S4A03895_20190522_043000.wav
Processing file S4A03895_20190522_044500.wav
Processing file S4A03895_20190522_050000.wav
Processing file S4A03895_20190522_051500.wav
Processing file S4A03895_20190522_053000.wav
Processing file S4A03895_20190522_054500.wav
Processing file S4A03895_20190522_060000.wav
Processing file S4A03895_20190522_061500.wav
Processing file S4A03895_20190522_063000.wav
Processing file S4A03895_20190522_064500.wav
Processing file S4A03895_20190522_070000.wav
Processing file S4A03895_20190522_071500.wav
Processing file S4A03895_20190522_073000.wav
Processing file S4A03895_20190522_074500.wav
Processing file S4A03895_20190522_080000.wav
Processing file S4A03895_20190522_081500.wav
Processing file S4A03895_20190522_083000.wav
Processing file S4A03895_20190522_084500.wav
Processing file S4A03895_20190522_090000.wav
Processing file S4A03895_20190522_091500.wav
Processing file S4A03895_20190522_093000.wav
Processing file S4A03895_20190522_094500.wav
Processing file S4A03895_20190522_100000.wav
Processing file S4A03895_20190522_101500.wav
Processing file S4A03895_20190522_103000.wav
Processing file S4A03895_20190522_104500.wav
Processing file S4A03895_20190522_110000.wav
Processing file S4A03895_20190522_111500.wav
Processing file S4A03895_20190522_113000.wav
Processing file S4A03895_20190522_114500.wav
Processing file S4A03895_20190522_120000.wav
Processing file S4A03895_20190522_121500.wav
Processing file S4A03895_20190522_123000.wav
Processing file S4A03895_20190522_124500.wav
Processing file S4A03895_20190522_130000.wav
Processing file S4A03895_20190522_131500.wav
Processing file S4A03895_20190522_133000.wav
Processing file S4A03895_20190522_134500.wav
Processing file S4A03895_20190522_140000.wav
Processing file S4A03895_20190522_141500.wav
Processing file S4A03895_20190522_143000.wav
Processing file S4A03895_20190522_144500.wav
Processing file S4A03895_20190522_150000.wav
Processing file S4A03895_20190522_151500.wav
Processing file S4A03895_20190522_153000.wav
Processing file S4A03895_20190522_154500.wav
Processing file S4A03895_20190522_160000.wav
Processing file S4A03895_20190522_161500.wav
Processing file S4A03895_20190522_163000.wav
Processing file S4A03895_20190522_164500.wav
Processing file S4A03895_20190522_170000.wav
Processing file S4A03895_20190522_171500.wav
Processing file S4A03895_20190522_173000.wav
Processing file S4A03895_20190522_174500.wav
Processing file S4A03895_20190522_180000.wav
Processing file S4A03895_20190522_181500.wav
Processing file S4A03895_20190522_183000.wav
Processing file S4A03895_20190522_184500.wav
Processing file S4A03895_20190522_190000.wav
Processing file S4A03895_20190522_191500.wav
Processing file S4A03895_20190522_193000.wav
Processing file S4A03895_20190522_194500.wav
Processing file S4A03895_20190522_200000.wav
Processing file S4A03895_20190522_201500.wav
Processing file S4A03895_20190522_203000.wav
Processing file S4A03895_20190522_204500.wav
Processing file S4A03895_20190522_210000.wav
Processing file S4A03895_20190522_211500.wav
Processing file S4A03895_20190522_213000.wav
Processing file S4A03895_20190522_214500.wav
Processing file S4A03895_20190522_220000.wav
Processing file S4A03895_20190522_221500.wav
Processing file S4A03895_20190522_223000.wav
Processing file S4A03895_20190522_224500.wav
Processing file S4A03895_20190522_230000.wav
Processing file S4A03895_20190522_231500.wav
Processing file S4A03895_20190522_233000.wav
Processing file S4A03895_20190522_234500.wav
Computation completed!
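
The aggregation behind such a figure can be sketched as an hour-by-frequency matrix. The example below is a simplified illustration, not the scikit-maad code: the function `soundscape_matrix`, the frequency bin edges, and the peak lists are hypothetical, and the normalization (proportion of recordings per hour with at least one peak in a frequency bin) is an assumption in the spirit of [2] and [3] that may differ from what `features.graphical_soundscape` computes.

```python
import numpy as np


def soundscape_matrix(recordings, freq_edges, n_hours=24):
    """Build an (hour x frequency bin) matrix from per-recording peaks.

    recordings: iterable of (hour, peak_freqs_hz) pairs, one per audio file.
    Each cell holds the proportion of that hour's recordings with at least
    one peak in the frequency bin.
    """
    n_bins = len(freq_edges) - 1
    presence = np.zeros((n_hours, n_bins))
    n_files = np.zeros(n_hours)
    for hour, peak_freqs in recordings:
        counts, _ = np.histogram(peak_freqs, bins=freq_edges)
        presence[hour] += counts > 0  # presence/absence, not raw peak counts
        n_files[hour] += 1
    # Hours without recordings are divided by 1 and therefore stay at zero.
    return presence / np.maximum(n_files, 1)[:, None]


# Hypothetical peaks: four recordings at 05:00 and one at 18:00.
freq_edges = np.array([0, 2000, 4000, 6000, 8000])
recordings = [
    (5, [2500, 2600, 7000]),
    (5, [2700]),
    (5, []),
    (5, [500, 3000]),
    (18, [6500]),
]
gs_sketch = soundscape_matrix(recordings, freq_edges)
print(gs_sketch[5])
print(gs_sketch[18])
```

With 96 recordings taken every 15 minutes, each hour would be represented by four files, so each cell of such a matrix would take one of the values 0, 0.25, 0.5, 0.75, or 1.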

This representation can be computed at multiple sites, and the resulting soundscapes can then be compared for similarity. For further details on this approach, see [2] and [3]. Note that the method implemented in scikit-maad is similar but not identical to the original: in the original approach, peaks were computed on the mean spectrogram, whereas this version identifies local maxima directly within the spectrogram.
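
As a rough illustration of comparing sites, two graphical soundscapes of the same shape can be flattened and compared with a similarity measure. The text above does not prescribe a measure, so the cosine similarity used here is only one possible choice, and the function name and the small matrices are hypothetical.

```python
import numpy as np


def cosine_similarity(gs_a, gs_b):
    """Cosine similarity between two soundscape matrices of equal shape."""
    a = np.asarray(gs_a, dtype=float).ravel()
    b = np.asarray(gs_b, dtype=float).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # An all-zero soundscape has no direction; return 0 by convention.
    return float(a @ b / denom) if denom > 0 else 0.0


# Hypothetical 2-hour x 3-band soundscapes for three sites.
site_a = np.array([[0.5, 0.0, 0.0], [0.0, 1.0, 0.0]])
site_b = np.array([[0.4, 0.0, 0.0], [0.0, 0.9, 0.1]])  # similar to site_a
site_c = np.array([[0.0, 0.0, 1.0], [0.8, 0.0, 0.0]])  # no overlap with site_a

print(cosine_similarity(site_a, site_b))
print(cosine_similarity(site_a, site_c))
```

Other measures, such as correlation or an ecological dissimilarity index, could be substituted without changing the overall workflow.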

References

[1] Wang, A. (2003, October). An industrial strength audio search algorithm. In ISMIR (Vol. 2003, pp. 7-13).

[2] Campos‐Cerqueira, M., Mena, J. L., Tejeda‐Gómez, V., Aguilar‐Amuchastegui, N., Gutierrez, N., & Aide, T. M. (2020). How does FSC forest certification affect the acoustically active fauna in Madre de Dios, Peru? Remote Sensing in Ecology and Conservation, 6(3), 274-285.

[3] Furumo, P. R., & Aide, T. M. (2019). Using soundscapes to assess biodiversity in Neotropical oil palm landscapes. Landscape Ecology, 34, 911-923.

