A Morphological Analysis Method of Predicting Place-Event Performance by Online News Titles

  • Choi, Sukjae (Humanitas BigData Research Center, Kyung Hee University)
  • Lee, Jaewoong (Humanitas BigData Research Center, Kyung Hee University)
  • Kwon, Ohbyung (School of Management, Kyung Hee University)

  • Received: 2015.12.08
  • Accepted: 2016.02.13
  • Published: 2016.02.28

Abstract

Online news, as openly published data, contains facts and opinions about specific affairs and therefore considerably influences the decisions of readers interested in a particular issue. Many such articles concern events held in specific places, such as cities. Analyzing a large number of related online news articles should therefore make it possible to predict the choices people will make about an event. This study proposes a text analysis method that applies morphological analysis to online news titles to predict, in advance, the outcome of an event held in a specific place. We use article titles because they carry the core content of an article and express facts and opinions more precisely than the body text. Moreover, in mobile environments readers tend to rely on titles before clicking through to an article, so titles have greater influence than body text. We collected the titles of online news articles and divided them into a learning data set and an evaluation data set. We extracted the morphemes that show significant polarity in the learning data and used them to perform sentiment analysis on the titles of all articles. To reflect the characteristics of news articles, we added article search volume and article output volume as variables in the prediction algorithm. The resulting prediction success rate was 70.6%, clearly different from that of the other analytical methods used for comparison. The derived predictions can help organizations and businesses preparing an event to determine the expected demand.
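The pipeline summarized in the abstract (learn morpheme polarities from labeled titles, then score unseen titles) can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption for illustration only: the whitespace `tokenize` function stands in for a real morphological analyzer, the polarity formula and the `min_count` threshold are hypothetical choices, the toy English titles are invented, and the search-volume and article-output variables are left out. It is not the authors' algorithm.

```python
from collections import Counter


def tokenize(title):
    # Hypothetical stand-in for a morphological analyzer:
    # lowercase the title and split on whitespace.
    return title.lower().split()


def learn_polarity(labeled_titles, min_count=2):
    """Learn a token -> polarity map from (title, label) pairs.

    label is +1 (event succeeded) or -1 (event underperformed).
    Polarity is (pos - neg) / (pos + neg), in [-1, 1]; tokens seen in
    fewer than min_count titles are dropped as unreliable.
    """
    pos, neg = Counter(), Counter()
    for title, label in labeled_titles:
        target = pos if label > 0 else neg
        target.update(set(tokenize(title)))  # count each token once per title
    polarity = {}
    for tok in set(pos) | set(neg):
        total = pos[tok] + neg[tok]
        if total >= min_count:
            polarity[tok] = (pos[tok] - neg[tok]) / total
    return polarity


def score_title(title, polarity):
    # Sum the learned polarities; unknown tokens contribute 0.
    return sum(polarity.get(tok, 0.0) for tok in tokenize(title))


def predict_event(titles, polarity):
    # Predict +1 if the summed title sentiment is positive, else -1.
    total = sum(score_title(t, polarity) for t in titles)
    return 1 if total > 0 else -1


# Toy learning data (invented for illustration).
train = [
    ("festival draws record crowd", 1),
    ("record crowd expected at expo", 1),
    ("festival hit by safety concerns", -1),
    ("safety concerns cloud city marathon", -1),
]
polarity = learn_polarity(train)
prediction = predict_event(["expo expects record crowd"], polarity)
```

In this sketch a token such as "festival", which appears equally often in both classes, receives a polarity of 0 and so does not sway the prediction. A fuller model would add further predictors, such as the search volume and article output the abstract mentions.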
