An Opinionated Document Retrieval System based on Hybrid Method

혼합 방식에 기반한 의견 문서 검색 시스템

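  • Seung-Wook Lee (이승욱; Department of Computer and Radio Communications Engineering, Graduate School of Information and Communication, Korea University);
  • Young-In Song (송영인; Department of Computer Science, Graduate School of Information and Communication, Korea University);
  • Hae-Chang Rim (임해창; Department of Computer and Radio Communications Engineering, Graduate School of Information and Communication, Korea University)
  • Published: 2008.12.31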

Abstract

With its growth and popularization, the Web has changed from a space for seeking information into a place where people express, share, and debate their opinions. Accordingly, the need to automatically search for opinions expressed on the Web about a specific topic is increasing. A classical information retrieval system, which considers only the relevance between the user's query and documents, cannot easily meet this need; a more advanced system that can analyze whether a document contains opinions is required. This paper proposes a system that retrieves opinionated documents effectively within the architecture of an existing retrieval system. For opinion analysis within documents, it proposes a hybrid method that combines an existing dictionary-based technique with a machine learning based technique. Experimental results show that the proposed method improves retrieval performance.
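
The abstract does not give the scoring formulas, so the following is only a minimal sketch of one way a hybrid of this kind could be wired up: a dictionary-based opinion score and a classifier-based opinion score are mixed, and the result is interpolated with the relevance score returned by an existing retrieval system. The lexicon entries, the logistic stand-in for a trained classifier, the mixing weights `alpha` and `beta`, and all function names are assumptions made for illustration, not the authors' method.

```python
import math

# Hypothetical toy lexicon; the paper's actual dictionary is not given in the abstract.
OPINION_LEXICON = {"love": 1.0, "hate": 1.0, "great": 0.8, "terrible": 0.9}


def lexicon_score(tokens):
    """Dictionary-based score: mean lexicon weight over all tokens (0.0 if empty)."""
    if not tokens:
        return 0.0
    return sum(OPINION_LEXICON.get(t, 0.0) for t in tokens) / len(tokens)


def classifier_score(tokens, weights, bias):
    """Stand-in for a trained classifier: logistic function over bag-of-words weights."""
    z = bias + sum(weights.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-z))


def hybrid_opinion_score(tokens, weights, bias, alpha=0.5):
    """Linear mix of the two opinion scores; alpha is an assumed tuning parameter."""
    return alpha * lexicon_score(tokens) + (1.0 - alpha) * classifier_score(tokens, weights, bias)


def rerank(results, weights, bias, alpha=0.5, beta=0.7):
    """Re-rank (doc_id, relevance, text) tuples from an existing retrieval system.

    relevance is assumed to be normalized to [0, 1]; beta weights relevance
    against the hybrid opinion score.
    """
    scored = []
    for doc_id, relevance, text in results:
        tokens = text.lower().split()
        opinion = hybrid_opinion_score(tokens, weights, bias, alpha)
        scored.append((doc_id, beta * relevance + (1.0 - beta) * opinion))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this sketch, a document with slightly lower topical relevance can move ahead of a purely factual one when both opinion signals agree that it is opinionated; how strongly depends entirely on the assumed `alpha` and `beta`.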
