OAR Algorithm Technology Based on Opinion Mining Utilizing Stock News Contents

An OAR Sentiment Dictionary Algorithm Based on Opinion Mining of Stock News Content

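  • 조혜진 (Department of Computer Engineering, Incheon National University)
  • 서지훈 (Department of Computer Engineering, Incheon National University)
  • 최진탁 (Department of Computer Engineering, Incheon National University)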
  • Received : 2014.10.24
  • Accepted : 2014.12.04
  • Published : 2015.03.31

Abstract

With the rapid growth of unstructured text data online, companies in Korea and abroad analyze users' internet comments to assess reputation, and opinion mining research stands to benefit corporate operations and business marketing. Sentiment analysis for English can rely on techniques tailored to the rules of that language. Sentiment analysis for Korean, by contrast, remains insufficiently accurate because no processing technique suits Korean vocabulary, morphology, and grammar. This paper therefore proposes an opinion antonym rule (OAR) algorithm that accounts for Korean adversative relations in online stock news text. Applying the OAR algorithm to stock-related news data, the study derives a new sentiment dictionary for stock data and verifies the dictionary's accuracy through reputation analysis using the KOSPI index.
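The abstract does not spell out the OAR rules themselves, so the following is only a minimal sketch of the general idea: dictionary-based polarity scoring that treats an adversative marker as a pivot, followed by a direction check against index movements. The lexicon entries, the marker list, the fallback rule, and the function names `sentence_polarity` and `direction_accuracy` are all assumptions made for illustration and are not taken from the paper.

```python
import re

# Hypothetical toy lexicon (word -> polarity); not the paper's sentiment dictionary.
LEXICON = {"상승": 1, "호조": 1, "급등": 1, "하락": -1, "부진": -1, "급락": -1}

# Assumed adversative markers ("but", "however"); the longer form is listed first.
ADVERSATIVE = re.compile(r"하지만|그러나|지만")


def clause_score(clause):
    """Sum the polarity of every lexicon word that occurs in the clause."""
    return sum(weight for word, weight in LEXICON.items() if word in clause)


def sentence_polarity(sentence):
    """Score a sentence, letting the clause after the last adversative decide.

    Assumed rule: if the final clause has no sentiment word, the opinion is
    taken to be the opposite of what the preceding clauses expressed.
    """
    clauses = ADVERSATIVE.split(sentence)
    last = clause_score(clauses[-1])
    if last != 0 or len(clauses) == 1:
        return last
    return -sum(clause_score(c) for c in clauses[:-1])


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def direction_accuracy(polarities, index_changes):
    """Fraction of days on which the polarity sign matches the index move."""
    pairs = list(zip(polarities, index_changes))
    if not pairs:
        return 0.0
    hits = sum(1 for p, c in pairs if sign(p) == sign(c))
    return hits / len(pairs)


headlines = [
    "실적은 부진했지만 주가는 상승했다",        # weak earnings, but the price rose
    "주가는 급등했다 그러나 전망은 불확실하다",  # surge, however the outlook is unclear
    "코스피가 하락했다",                        # the KOSPI fell
]
polarities = [sentence_polarity(h) for h in headlines]
# Made-up daily index changes (percent), used only to exercise the function.
accuracy = direction_accuracy(polarities, [0.4, -1.2, 0.3])
```

A production system would tokenize with a Korean morphological analyzer rather than match substrings; substring matching is used here only to keep the sketch dependency-free.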

Acknowledgement

Supported by: Incheon National University
