Voice Activity Detection Based on Signal Energy and Entropy-difference in Noisy Environments

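  • 하동경 (Division of Computer, Control and Electronic Communication Engineering, Hanyang University)
  • 조석제 (Hanyang University)
  • 진강규 (Division of Computer, Control and Electronic Communication Engineering, Hanyang University)
  • 신옥근 (Division of Computer, Control and Electronic Communication Engineering, Hanyang University)
  • Published: 2008.07.31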

Abstract

In many areas of speech signal processing, such as automatic speech recognition and packet-based voice communication, voice activity detection (VAD) plays an important role in the performance of the overall system. In this paper, we present a new feature parameter for VAD: the product of the signal energy and the difference between two types of entropy. To this end, we first define a Mel filter-bank based entropy and calculate its difference from the conventional entropy in the frequency domain. This difference is then multiplied by the spectral energy of the signal to yield the final feature parameter, which we call PEED (product of energy and entropy difference). Through experiments, we verified that the proposed VAD parameter is more efficient than the conventional spectral-entropy-based parameter across various SNRs and noisy environments.
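The feature described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact entropy definitions, normalisation, filter-bank size, or sign convention, so the triangular Mel filters, the 20-band default, the unnormalised Shannon entropy, and the absolute value of the difference are all assumptions, and every function name here is hypothetical.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided power spectrum |X[k]|^2, k = 0..N/2, via a naive DFT."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec


def entropy(values, eps=1e-12):
    """Shannon entropy (nats) of non-negative values normalised to sum to 1."""
    total = sum(values) + eps  # eps avoids division by zero on silent frames
    probs = [v / total for v in values]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_energies(spec, sample_rate, n_bands=20):
    """Energies of triangular Mel filters applied to a one-sided power spectrum.

    Assumption: n_bands triangular filters equally spaced on the Mel scale
    between 0 Hz and the Nyquist frequency.
    """
    nyquist = sample_rate / 2.0
    mel_max = hz_to_mel(nyquist)
    edges = [mel_to_hz(mel_max * i / (n_bands + 1)) for i in range(n_bands + 2)]
    bin_hz = nyquist / (len(spec) - 1)
    energies = []
    for b in range(1, n_bands + 1):
        lo, mid, hi = edges[b - 1], edges[b], edges[b + 1]
        e = 0.0
        for k, p in enumerate(spec):
            f = k * bin_hz
            if lo < f <= mid:      # rising slope of the triangle
                e += p * (f - lo) / (mid - lo)
            elif mid < f < hi:     # falling slope of the triangle
                e += p * (hi - f) / (hi - mid)
        energies.append(e)
    return energies


def peed(frame, sample_rate, n_bands=20):
    """Product of spectral energy and entropy difference for one frame.

    Assumption: the absolute value of (Mel entropy - spectral entropy) is used.
    """
    spec = power_spectrum(frame)
    energy = sum(spec)
    h_spec = entropy(spec)
    h_mel = entropy(mel_band_energies(spec, sample_rate, n_bands))
    return energy * abs(h_mel - h_spec)
```

In a detector built on this sketch, `peed` would be evaluated on each short frame and compared against a threshold, for instance one estimated from frames known to contain only noise. How the threshold is chosen is not stated in the abstract.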
