Sound processing

The sound module is a collection of functions to load and preprocess audio signals.

Input and output

load(filename[, channel, detrend, verbose, ...])

Load an audio file (stereo or mono).


Download audio file from the web and load it as a variable.

load_spectrogram(filename, fs, duration[, ...])

Load an image from a file or a URL.

write(filename, fs, data[, bit_depth])

Write a NumPy array as a WAV file with the SciPy method.

Preprocess audio

fir_filter(x, kernel[, axis])

Filter a signal using a 1D finite impulse response (FIR) filter.

sinc(s, cutoff, fs[, atten, transition_bw, ...])

Filter a 1D signal with a Kaiser-windowed filter.

smooth(Sxx[, std, verbose, display, savefig])

Smooth a spectrogram with a Gaussian filter.

select_bandwidth(x, fs, fcut, forder[, ...])

Process a 1D signal with a lowpass, highpass, bandpass or bandstop IIR filter.

pcen(Sxx[, gain, bias, power, b, eps, ...])

Per-Channel Energy Normalization (PCEN)

remove_background(Sxx[, gauss_win, ...])

Remove background noise using spectral subtraction.

remove_background_morpho(Sxx[, q, display, ...])

Remove background noise in a spectrogram using mathematical morphology tools.

remove_background_along_axis(Sxx[, mode, ...])

Get the noisy profile along the defined axis and remove this profile from the spectrogram.

median_equalizer(Sxx[, display, savefig])

Remove background noise in a spectrogram using a median equalizer.

wave2frames(s[, Nt])

Reshape a sound waveform (i.e. a vector) into a series of frames (i.e. a matrix) of length Nt.
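The framing idea behind wave2frames can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the helper name frames_sketch, the handling of leftover samples (dropped here) and the orientation of the output (one frame per row) are all assumptions made for illustration.

```python
import numpy as np

def frames_sketch(s, Nt=512):
    """Hypothetical helper: cut a 1D waveform into non-overlapping frames of Nt samples."""
    s = np.asarray(s)
    n_frames = len(s) // Nt              # assumption: trailing samples that do not fill a frame are dropped
    return s[: n_frames * Nt].reshape(n_frames, Nt)  # assumption: one frame per row

s = np.arange(10)                        # toy "waveform" of 10 samples
frames = frames_sketch(s, Nt=4)
print(frames.shape)
print(frames)
```

With a 10-sample input and Nt=4, the sketch keeps the first 8 samples as two frames and discards the last 2.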

Transform audio

spectrogram(x, fs[, window, nperseg, ...])

Compute a spectrogram using the short-time Fourier transform from an audio signal.


Computes the average of a power spectrogram along the time axis.


Computes the average of an amplitude spectrogram along the time axis.

linear_to_octave(X, fn[, thirdOctave, display])

Transform a linear spectrum (1D) or spectrogram (2D) into an octave or 1/3 octave spectrum (1D) or spectrogram (2D).

envelope(s[, mode, Nt])

Calculate the envelope of a sound waveform (1D).

spectrum(s, fs[, nperseg, noverlap, nfft, ...])

Estimate the power spectral density or power spectrum of a 1D signal.

resample(s, fs, target_fs[, res_type])

Changes the sample rate of an audio file or any time series.

trim(s, fs, min_t, max_t[, pad, pad_constant])

Slices a time series from an initial time min_t to an ending time max_t.

normalize(s[, max_amp, max_db])

Normalize an audio signal to a desired amplitude or decibel full scale (dBFS) value.

gain(s[, gain_db])

Apply amplification or attenuation to the audio signal.
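The trim, normalize and gain entries rest on simple arithmetic: seconds are converted to sample indices with fs, and decibels are converted to a linear amplitude factor with 10^(dB/20). The sketch below shows that arithmetic with NumPy only. The helper names (trim_sketch, normalize_sketch, gain_sketch) are hypothetical, padding is not handled, and normalization is assumed to be peak-based; the library functions may behave differently.

```python
import numpy as np

def trim_sketch(s, fs, min_t, max_t):
    """Hypothetical helper: keep the samples between min_t and max_t (in seconds)."""
    start = int(round(min_t * fs))       # time -> sample index
    stop = int(round(max_t * fs))
    return np.asarray(s)[start:stop]

def gain_sketch(s, gain_db=0.0):
    """Hypothetical helper: amplify (gain_db > 0) or attenuate (gain_db < 0) a signal."""
    return np.asarray(s, dtype=float) * 10 ** (gain_db / 20)

def normalize_sketch(s, max_db=0.0):
    """Hypothetical helper: scale so the peak sits at max_db dBFS (assumes peak normalization)."""
    s = np.asarray(s, dtype=float)
    peak = np.max(np.abs(s))
    if peak == 0:                        # silent input: nothing to scale
        return s
    return s * (10 ** (max_db / 20) / peak)

fs = 100                                 # toy sampling rate in Hz
s = 0.5 * np.sin(2 * np.pi * 5 * np.arange(1000) / fs)

segment = trim_sketch(s, fs, min_t=2, max_t=5)
louder = gain_sketch(segment, gain_db=6)
normalized = normalize_sketch(segment, max_db=-6)
print(len(segment), np.max(np.abs(louder)), np.max(np.abs(normalized)))
```

A gain of +6 dB roughly doubles the amplitude, and a target of -6 dBFS places the peak at about half of full scale.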


temporal_snr(s[, mode, Nt])

Compute the signal to noise ratio (SNR) of an audio signal in the time domain.


Compute the signal to noise ratio (SNR) of an audio signal from its spectrogram in the time-frequency domain.


Compute the sharpness of a spectrogram.