A Graph-based Word Sense Disambiguation Using Measures of Graph Connectivity

  • 조정길 (School of Computer Engineering, Sungkyul University);
  • 신광철 (School of Industrial and Management Engineering, Sungkyul University)
  • Received : 2014.04.23
  • Accepted : 2014.06.05
  • Published : 2014.06.30

Abstract

Word sense disambiguation (WSD) is a natural language processing (NLP) task that identifies the sense of a word in context. Because human language contains many ambiguous words, the objective of WSD is to identify the correct sense of each word. This paper describes an unsupervised graph-based method for WSD that uses the connectivity structure of a graph to find the correct senses. The algorithm has few parameters and does not require sense-annotated data for training. We assessed the performance of the algorithms on standard data sets, and the results show that the best measures perform comparably to the state of the art. In future work, we plan to apply the proposed method to weighted graphs.
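
As a rough illustration of the general idea (not the paper's actual algorithm, which is not specified in this abstract), the sketch below scores candidate senses by degree centrality, one common graph connectivity measure, and selects the best-connected sense for each word. The sense labels, the relation list, and the function names are all hypothetical and stand in for a real lexical resource such as WordNet.

```python
# Hypothetical sense inventory: word -> candidate sense labels.
# These labels are illustrative only and are not real WordNet identifiers.
SENSES = {
    "bank": ["bank#finance", "bank#river"],
    "deposit": ["deposit#money", "deposit#sediment"],
    "interest": ["interest#money", "interest#curiosity"],
}

# Hypothetical undirected lexical relations between senses (assumed data).
RELATIONS = [
    ("bank#finance", "deposit#money"),
    ("bank#finance", "interest#money"),
    ("deposit#money", "interest#money"),
    ("bank#river", "deposit#sediment"),
]


def build_graph(words, senses, relations):
    """Build an undirected graph over all candidate senses of the context words."""
    nodes = {sense for word in words for sense in senses[word]}
    adjacency = {node: set() for node in nodes}
    for a, b in relations:
        if a in nodes and b in nodes:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree_centrality(adjacency):
    """Score each node by its degree, normalized by the maximum possible degree."""
    n = len(adjacency)
    if n <= 1:
        return {node: 0.0 for node in adjacency}
    return {node: len(neighbors) / (n - 1) for node, neighbors in adjacency.items()}


def disambiguate(words, senses=SENSES, relations=RELATIONS):
    """Pick, for each word, the candidate sense with the highest centrality score."""
    scores = degree_centrality(build_graph(words, senses, relations))
    # Ties are broken by the order in which senses are listed for each word.
    return {word: max(senses[word], key=lambda s: scores[s]) for word in words}


if __name__ == "__main__":
    print(disambiguate(["bank", "deposit", "interest"]))
```

In this toy context the financial senses reinforce one another through shared edges, so they receive higher scores than the isolated alternatives. No sense-annotated training data is involved; only the graph structure is used.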
