A Simple Prior for Audio Signals


Manuscript

"A Simple Prior for Audio Signals", İ. Bayram, M. Kamasak, August, 2012. [.pdf]


Software

The file SimplePriorToolbox.zip contains Matlab implementations of the experiments in the manuscript.


Description

The manuscript proposes a function that takes low values for audio and other oscillatory signals whose frequency content varies smoothly over time. This property allows the function to be used as a 'signal prior' in problems that involve such signals.

The prior is computed as follows.

(i) We first compute the time-frequency samples of the signal, for example with the STFT (or possibly a constant-Q transform).

(ii) To the time-frequency samples, we apply a phase-corrected difference operator, where the difference is computed along each subband.

(iii) We then sum the absolute values of the resulting phase-corrected difference image.

In step (ii), the phase-corrected difference operator depends on the parameters of the STFT (or the wavelet transform) but is otherwise fixed; that is, it does not depend on the signal.
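Steps (i) to (iii) can be sketched in Python with NumPy. This is an illustrative sketch only, not the code in SimplePriorToolbox.zip. The Hann window, the window length and hop size, and the function names are assumptions made here. The form of the phase correction is also an assumption: the previous coefficient in subband k is rotated by exp(2*pi*i*k*hop/win_len), which is the phase advance of a sinusoid at the subband's center frequency over one hop. The manuscript gives the exact operator.

```python
import numpy as np

def stft(x, win_len=64, hop=16):
    """Step (i): Hann-windowed frames, one real FFT per frame.
    Returns an array of shape (win_len // 2 + 1, n_frames)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[m * hop : m * hop + win_len] * window for m in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, axis=0)

def phase_corrected_difference(C, win_len, hop):
    """Step (ii), in an ASSUMED form: difference along time in each subband,
    after rotating the previous frame by the phase a sinusoid at the
    subband's center frequency would gain over one hop."""
    k = np.arange(C.shape[0])[:, None]                  # subband index
    rotation = np.exp(2j * np.pi * k * hop / win_len)   # fixed, signal-independent
    return C[:, 1:] - rotation * C[:, :-1]

def simple_prior(x, win_len=64, hop=16):
    """Step (iii): sum of absolute values of the difference image."""
    C = stft(np.asarray(x, dtype=float), win_len, hop)
    D = phase_corrected_difference(C, win_len, hop)
    return np.abs(D).sum()
```

Note that `rotation` depends only on `win_len` and `hop`, which matches the remark that the operator is fixed once the transform parameters are chosen. With these assumed settings, a steady tone such as `np.cos(2 * np.pi * 5 * np.arange(1024) / 64)` should receive a lower value from `simple_prior` than white noise of the same power.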

Experiments

Below are some results from the manuscript.

Experiment 1 (Denoising)

Clean Signal, Noisy Observation (SNR = 17.59 dB)

Denoised Signals Using 1) The Proposed Prior (SNR = 24.95 dB), 2) The Block-Thresholder from [*] (SNR = 23.90 dB).

[*] G. Yu, S. Mallat and E. Bacry, "Audio denoising by time-frequency block thresholding", IEEE TSP, 56(5) : 1830-1839, May 2008.
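SNR values like those quoted on this page can be computed from a clean reference and an estimate with a helper along the following lines. It assumes the usual definition, the ratio of signal energy to error energy expressed in decibels. The function name `snr_db` is illustrative, and the manuscript should be consulted for the exact convention used.

```python
import numpy as np

def snr_db(clean, estimate):
    """SNR in dB: 10 * log10(energy of clean / energy of (clean - estimate))."""
    clean = np.asarray(clean, dtype=float)
    error = clean - np.asarray(estimate, dtype=float)
    return 10.0 * np.log10(np.sum(clean**2) / np.sum(error**2))
```

Under this definition, the same function applies to both the noisy observation and the denoised output, each compared against the clean signal.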

Experiment 2 (Denoising a Glockenspiel)

Original Signal, 'Denoised' Signal, Residual.

This experiment demonstrates that denoising with the prior favors tonal components over transient ones. Note that our aim here is not to decompose the audio signal into its tonal and transient components. In fact, the experiment points to a shortcoming of the prior: although transient signals such as attacks and clicks are audio signals of potential interest, the proposed prior does not handle them well.

Experiment 3 (Missing Segment Recovery)

Observation with Missing Segments, Reconstruction.

Further Denoising Experiments
Speech

The results below do not appear in the manuscript.

Noisy Speech (SNR = 5 dB), Denoised Speech (SNR = 11.92 dB)

This result is not the best in terms of SNR, but it is perceptually more pleasing. Results with a higher SNR (obtained with a smaller lambda; see the manuscript) retain more content in the high-frequency regions, but they contain musical noise.

Violin Vibrato

Noisy Violin Vibrato (SNR = 5 dB), Denoised Vibrato (SNR = 15.54 dB)

Violin vibrato is an example of a signal that the prior is not expected to handle well, since its frequency varies rapidly. In practice, however, the frequency variation turns out to be manageable. Nevertheless, musical noise is audible in this example.