Topic-Network based Topic Shift Detection on Twitter

A Study on Network-Based Topic Shift Tracking Using Twitter Data

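  • 진설아 (Graduate School, Department of Library and Information Science, Yonsei University)
  • 허고은 (Graduate School, Department of Library and Information Science, Yonsei University)
  • 정유경 (Graduate School, Department of Library and Information Science, Yonsei University)
  • 송민 (Department of Library and Information Science, Yonsei University)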
  • Received : 2013.02.15
  • Accepted : 2013.03.25
  • Published : 2013.03.30

Abstract

This study identified topic shifts and their patterns over time by analyzing a large volume of Twitter data, a medium characterized by high accessibility and brevity. First, we extracted keywords related to a specific product and used co-word analysis to represent them as a topic network, whose nodes and edges allow an intuitive understanding of the keywords associated with each topic. We then conducted a temporal analysis of term co-occurrence, as well as topic modeling, to validate the results of the network analysis. In addition, a comparison of topic shifts on Twitter with the corresponding retrieval results from newspapers confirms that Twitter responds immediately to news media and spreads negative issues quickly. Our findings suggest that companies could use the proposed technique to identify negative public opinion as quickly as possible and to support timely decision making and effective responses to their customers.

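This study analyzed Twitter data, a medium that produces a vast amount of text owing to its high accessibility and brevity, to identify when topics shift and what patterns the shifts follow. We first extracted keywords related to a specific product name and then used co-word analysis to represent them as a network in which nodes and edges make topics and their associated keywords intuitively understandable. To validate the results of the network analysis, we then conducted a frequency-based time-series analysis and LDA topic modeling. In addition, a comparison of topic shifts on Twitter with news article search results confirmed that Twitter responds immediately to press news and spreads negative issues quickly. Companies are therefore expected to be able to use this method as a tool for quickly identifying negative public opinion and for making prompt decisions and responses.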
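
The co-word step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes tweets have already been tokenized into keyword lists and tagged with a period label, and the function names (`coword_edges`, `edges_by_period`, `new_edges`) and the toy data are hypothetical. Edges that appear in one period but not in the previous one are treated here as a crude signal of a topic shift.

```python
from collections import Counter
from itertools import combinations


def coword_edges(docs, min_count=1):
    """Count how often each unordered keyword pair co-occurs within a tweet."""
    counts = Counter()
    for tokens in docs:
        # Use distinct keywords so a pair is counted once per tweet.
        for pair in combinations(sorted(set(tokens)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


def edges_by_period(dated_docs, min_count=1):
    """Build one co-word edge table per time period."""
    buckets = {}
    for period, tokens in dated_docs:
        buckets.setdefault(period, []).append(tokens)
    return {p: coword_edges(d, min_count) for p, d in sorted(buckets.items())}


def new_edges(prev, curr):
    """Edges present in the current period but absent from the previous one."""
    return sorted(set(curr) - set(prev))


# Hypothetical toy data: (period label, keywords extracted from one tweet).
tweets = [
    ("2012-09", ["phone", "launch", "camera"]),
    ("2012-09", ["phone", "camera"]),
    ("2012-10", ["phone", "battery", "defect"]),
    ("2012-10", ["battery", "defect"]),
]

networks = edges_by_period(tweets)
shift = new_edges(networks["2012-09"], networks["2012-10"])
print(shift)
```

In this sketch the edge weights are raw co-occurrence counts. A real pipeline would likely add stop-word filtering and a higher `min_count` threshold before drawing the network.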

Acknowledgement

Supported by: National Research Foundation of Korea
