Performance Improvement of Speech Recognition Using Context and Usage Pattern Information

Song, Won-Moon; Kim, Myung-Won

doi:10.3745/KIPSTB.2006.13B.5.553

The KIPS Transactions: Part B (정보처리학회논문지B)

Volume 13B, Issue 5 (Serial No. 108) / pp. 553-560 / 2006 / pISSN 1598-284X

Korea Information Processing Society (한국정보처리학회)

문맥 및 사용 패턴 정보를 이용한 음성인식의 성능 개선

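Song, Won-Moon (Department of Computer Science, Graduate School, Soongsil University);
Kim, Myung-Won (School of Computing, Soongsil University)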

Published: 2006.10.30

https://doi.org/10.3745/KIPSTB.2006.13B.5.553

Abstract

Recent work on speech recognition seeks more reliable results in noisy environments, either by integrating diverse sources of information at the stage where the recognition result is derived or by post-processing earlier recognition results. In this paper we propose a method that uses the user's usage patterns and context information in speech command recognition for personal mobile devices, to compensate for the drop in recognition accuracy in noisy environments. The sequential pattern of commands the user issued before the current utterance is used to adjust the base recognition results. As context information, we use the relevance between the function currently active on the device and the spoken command. Experiments show that the proposed method corrects about 50% of the misrecognitions made by the base recognition system, which demonstrates its feasibility.
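
The adjustment described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the abstract does not say how usage patterns are mined or how the scores are combined, so the bigram usage model, the linear weighting, the weight values, the command names, and the relevance table below are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def transition_probs(history, commands, smoothing=1.0):
    """Estimate P(next | previous) from a command log (hypothetical usage model)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(history, history[1:]):
        counts[prev][nxt] += 1

    def prob(prev, nxt):
        # Additive smoothing so unseen transitions keep a small probability.
        total = sum(counts[prev].values()) + smoothing * len(commands)
        return (counts[prev][nxt] + smoothing) / total

    return prob


def rescore(candidates, prev_command, device_function, prob, relevance,
            w_usage=0.3, w_context=0.3):
    """Re-rank candidates by a weighted mix of base, usage, and context scores.

    candidates: dict mapping command -> base recognizer score in [0, 1]
    relevance:  dict mapping (device_function, command) -> relevance in [0, 1]
    The weights are illustrative assumptions, not values from the paper.
    """
    w_base = 1.0 - w_usage - w_context
    scored = {}
    for command, base in candidates.items():
        usage = prob(prev_command, command)
        context = relevance.get((device_function, command), 0.0)
        scored[command] = w_base * base + w_usage * usage + w_context * context
    return sorted(scored, key=scored.get, reverse=True)


# Toy data (all hypothetical): a short command log and a relevance table.
commands = ["play", "pause", "call", "stop"]
history = ["play", "pause", "play", "pause", "play", "stop"]
prob = transition_probs(history, commands)
relevance = {
    ("music_player", "play"): 1.0,
    ("music_player", "pause"): 1.0,
    ("music_player", "stop"): 1.0,
    ("music_player", "call"): 0.1,
}

# A noisy utterance: the base recognizer slightly prefers "call" over "pause".
candidates = {"call": 0.48, "pause": 0.45, "stop": 0.07}
ranking = rescore(candidates, "play", "music_player", prob, relevance)
print(ranking)
```

In this toy case the base recognizer ranks "call" first, but the previous command ("play") and the active function ("music_player") both favor "pause", so the rescored ranking puts "pause" on top. A linear mix is only one possible way to combine the three sources of evidence.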
