
Single-channel noise reduction - algorithm details

Because they use just one microphone, single-channel noise reduction schemes cannot exploit spatial information to suppress background noise relative to the desired speech. These speech enhancement techniques therefore have to rely on differences in the statistical properties of speech and noise. For example, depending on the signal-to-noise ratio (SNR), large spectral magnitudes are more likely to originate from a speech process than from a typical noise process. In many cases, the disturbing noise is also more stationary than the speech. Based on these assumptions, a number of speech estimators can be derived using different optimization criteria, such as the minimum mean square error (MMSE). Typically, the estimators are implemented in the spectral domain.
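
To make the idea of a spectral-domain estimator concrete, the following is a minimal sketch of a Wiener gain, which is the MMSE solution when both speech and noise coefficients are assumed Gaussian. It is not one of the HearCom estimators. The function names, the maximum-likelihood SNR estimate, and the gain floor of 0.1 are illustrative assumptions, and the noise power per frequency bin is assumed to be known.

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, gain_floor=0.1):
    """Per-bin Wiener gain (hypothetical helper, Gaussian priors assumed)."""
    # Maximum-likelihood SNR estimate: |Y|^2 / noise power - 1, clipped at zero.
    snr = np.maximum(noisy_power / noise_power - 1.0, 0.0)
    # Bins with large power relative to the noise get a gain close to 1.
    gain = snr / (1.0 + snr)
    # The floor (an assumed value) limits the maximum attenuation.
    return np.maximum(gain, gain_floor)

def enhance_frame(frame, noise_power, gain_floor=0.1):
    """Apply the gain to one time-domain frame via the DFT."""
    spectrum = np.fft.rfft(frame)
    gain = wiener_gain(np.abs(spectrum) ** 2, noise_power, gain_floor)
    return np.fft.irfft(gain * spectrum, n=len(frame))
```

Here `noise_power` must hold one value per bin of the one-sided spectrum, that is `len(frame) // 2 + 1` values. A complete system would also need windowing and overlap-add, which this sketch leaves out.
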
In the HearCom project, several single-channel noise reduction schemes have been considered. One solution is based on a filter derived from the MMSE optimization criterion under the assumption that the real and imaginary parts of the speech spectral coefficients are Laplacian distributed and that the noise spectral coefficients are Gaussian distributed. The implementation has a low algorithmic delay and uses discrete Fourier transform (DFT) techniques. Noise power estimation is based on the Minimum Statistics noise power estimator.
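
The core idea behind minimum-based noise tracking can be sketched as follows: smooth the periodogram over time and take the minimum within a sliding window, on the assumption that speech pauses let the smoothed power fall to the noise level. This is a deliberately simplified illustration, not the Minimum Statistics estimator itself, which additionally uses a time-varying smoothing parameter and a bias compensation. The function name, the fixed smoothing constant `alpha`, and the window length are assumptions chosen for the example.

```python
import numpy as np

def track_noise_minimum(periodograms, alpha=0.85, window=50):
    """Simplified minimum tracking (hypothetical helper).

    periodograms: array of shape (num_frames, num_bins) holding |Y|^2.
    Returns an array of the same shape with the tracked minimum per bin.
    """
    periodograms = np.asarray(periodograms, dtype=float)
    smoothed = np.empty_like(periodograms)
    noise = np.empty_like(periodograms)
    state = periodograms[0].copy()
    for t, frame in enumerate(periodograms):
        # First-order recursive smoothing with a fixed (assumed) constant.
        state = alpha * state + (1.0 - alpha) * frame
        smoothed[t] = state
        # Minimum of the smoothed power over the last `window` frames.
        start = max(0, t - window + 1)
        noise[t] = smoothed[start:t + 1].min(axis=0)
    return noise
```

Because the minimum of a smoothed periodogram is biased low, this sketch tends to underestimate the noise power; that bias is what the compensation in the full estimator addresses.
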

References:

Martin, R. (2002). Speech Enhancement Using MMSE Short Time Spectral Estimation with Gamma Distributed Speech Priors. In Proc. IEEE Intl. Conference on Acoustics, Speech, and Signal Processing (ICASSP), volume I, pages 253–256, Orlando, Florida.

Martin, R. and Breithaupt, C. (2003). Speech Enhancement in the DFT Domain Using Laplacian Speech Priors. In Proc. Intl. Workshop Acoustic Echo and Noise Control (IWAENC), pages 87–90, Kyoto, Japan.

Mauler, D. (2006). Noise Power Spectral Density Estimation on Highly Correlated Data. In Proc. Intl. Workshop Acoustic Echo and Noise Control (IWAENC), Paris, France.