Word sense disambiguation using dynamic sized context and distance weighting

가변 크기 문맥과 거리가중치를 이용한 동형이의어 중의성 해소

  • Lee, Hyun Ah (Department of Computer Software Engineering, Kumoh National Institute of Technology)
  • Received : 2014.01.27
  • Accepted : 2014.05.08
  • Published : 2014.05.31

Abstract

Most research on word sense disambiguation has used a fixed-size context regardless of sentence patterns. This paper proposes a method that uses a dynamically sized context window, which varies with the sentence pattern, together with the distance between words. The system was trained on the Sejong POS- and sense-tagged corpus and evaluated on 12 words in 32,735 sentences. With the dynamically sized context it achieved 92.2% average accuracy for predicates, an improvement over the accuracy obtained with a fixed-size context.
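
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the clause-boundary rule in `dynamic_window`, the `1/distance` weighting, the `max_size` limit, and the toy co-occurrence table `cooc` are all assumptions chosen for illustration, and an English example stands in for Korean data.

```python
from collections import defaultdict

# Hypothetical clause delimiters; the paper's actual sentence-pattern rules are not given here.
BOUNDARIES = {",", ";", "."}


def dynamic_window(tokens, target, max_size=5):
    """Return (index, distance) pairs around `target`.

    The window grows left and right until it reaches a clause boundary,
    the end of the sentence, or `max_size` tokens (assumed limit).
    """
    pairs = []
    for step in (-1, 1):
        i = target + step
        while 0 <= i < len(tokens) and abs(i - target) <= max_size:
            if tokens[i] in BOUNDARIES:
                break
            pairs.append((i, abs(i - target)))
            i += step
    return pairs


def disambiguate(tokens, target, cooc):
    """Pick the sense whose context words score highest.

    `cooc` maps sense -> {context_word: count}. Each count is divided by
    the distance to the target, so nearer words contribute more (assumed weighting).
    """
    scores = defaultdict(float)
    for i, dist in dynamic_window(tokens, target):
        for sense, counts in cooc.items():
            scores[sense] += counts.get(tokens[i], 0) / dist
    return max(scores, key=scores.get) if scores else None


# Toy co-occurrence counts (invented for this example).
cooc = {
    "bank/river": {"river": 3, "water": 2},
    "bank/finance": {"money": 4, "loan": 2},
}
tokens = ["the", "river", "bank", ",", "money", "money", "money"]
print(disambiguate(tokens, 2, cooc))
```

In this toy sentence the window stops at the comma, so only "river" and "the" are scored. A fixed window of the same maximum size would also take in the three occurrences of "money" after the clause boundary, which would outweigh "river" and select the other sense.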
