Spectral Subtraction Using Spectral Harmonics for Robust Speech Recognition in Car Environments

  • Beh, Jounghoon (Dept. of Electronics and Computer Engineering, Korea University)
  • Ko, Hanseok (Dept. of Electronics and Computer Engineering, Korea University)
  • Published: 2003.06.01

Abstract

This paper presents a novel noise-compensation scheme that addresses the mismatch between training and testing conditions in automatic speech recognition (ASR) systems, specifically in car environments. Conventional spectral subtraction schemes rely on the signal-to-noise ratio (SNR): they attenuate the parts of the spectrum that appear to have low SNR and accentuate the parts that have high SNR. However, these schemes assume that the power spectrum of the noise is generally lower in magnitude than that of the speech. While this assumption is adequate for high-SNR environments, it is grossly inadequate for low-SNR scenarios such as car environments. This paper proposes an efficient spectral subtraction scheme aimed specifically at low-SNR noisy environments, which works by distinctly extracting the harmonics in the speech spectrum. Representative experiments, conducted on car-noise-corrupted utterances from the Aurora2 corpus, confirm the superior performance of the proposed method over conventional methods.
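
For orientation, the conventional SNR-driven spectral subtraction that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the general over-subtraction idea, not the harmonic-based method proposed in the paper. The function names, the linear SNR-to-factor mapping, and all numeric defaults (`alpha_min`, `alpha_max`, `snr_low`, `snr_high`, `floor`) are assumptions chosen for illustration.

```python
import math


def oversubtraction_factor(snr_db, alpha_min=1.0, alpha_max=4.75,
                           snr_low=-5.0, snr_high=20.0):
    """Map a frame SNR (dB) to an over-subtraction factor.

    The factor falls linearly from alpha_max at snr_low to alpha_min at
    snr_high. All defaults are illustrative assumptions.
    """
    if snr_db <= snr_low:
        return alpha_max
    if snr_db >= snr_high:
        return alpha_min
    frac = (snr_db - snr_low) / (snr_high - snr_low)
    return alpha_max - frac * (alpha_max - alpha_min)


def spectral_subtract(noisy_power, noise_power, floor=0.01):
    """Subtract a scaled noise estimate from one frame's power spectrum.

    noisy_power and noise_power are per-bin power values of equal length.
    Bins that would drop below floor * noise are clamped to that floor,
    so low-SNR bins are strongly attenuated and high-SNR bins barely change.
    """
    eps = 1e-12  # guards against log of zero and division by zero
    snr_db = 10.0 * math.log10((sum(noisy_power) + eps) /
                               (sum(noise_power) + eps))
    alpha = oversubtraction_factor(snr_db)
    cleaned = []
    for y, n in zip(noisy_power, noise_power):
        cleaned.append(max(y - alpha * n, floor * n))
    return cleaned


if __name__ == "__main__":
    frame = [100.0, 1.0, 50.0]   # hypothetical noisy power spectrum
    noise = [1.0, 1.0, 1.0]      # hypothetical noise estimate
    print(spectral_subtract(frame, noise))
```

Because the subtraction is scaled by an SNR estimate, a bin in which noise power is comparable to or exceeds speech power is simply floored, which illustrates why such schemes degrade in low-SNR conditions like those described above.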
