Keywords and Topic Analysis of Social Issues on Twitter Based on Text Mining and Topic Modeling

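  • 곽수정 (Department of Statistics and Information Science, Dongduk Women's University) ;
  • 김현희 (Department of Statistics and Information Science, Dongduk Women's University)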
  • Received : 2018.07.06
  • Accepted : 2018.08.26
  • Published : 2019.01.31

Abstract

In this study, we investigate the important keywords for social issues and the relationships among those keywords, and we analyze topics to identify the subjects of the issues. In particular, we collected Twitter data with the keyword 'metoo', which has recently attracted much attention, and performed keyword analysis and topic modeling. First, we preprocessed the Twitter data, identified important keywords, and analyzed the relatedness of the keywords. Then, topic modeling was performed to find subjects related to 'metoo'. Our experimental results show that the relatedness of keywords and the subjects of social issues on Twitter are well identified by keyword analysis and topic modeling.

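This study set out to examine how a social issue is divided into topics within SNS, where communication is active, and how its keywords are connected to one another. We treated the 'Me Too movement', which grew into a large movement as soon as the new word 'metoo' emerged, as the social issue, and focused the analysis on Twitter, where real-time communication is the most active among SNS platforms. First, using 'metoo' as the search keyword, we extracted related keywords for each date, identified the main keywords, and then performed topic modeling. Through this, we identified how the keywords surrounding the social issue changed over time, and we interpreted the various perspectives on the issue for each topic by synthesizing the keywords within that topic.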
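The keyword-analysis steps described in the abstract (scoring keywords per date and measuring how keywords relate to each other) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy tokenized tweets, the function names `tf_idf` and `cooccurrence`, the particular TF-IDF weighting, and the use of co-occurrence counts as a stand-in for keyword relatedness. It is not the authors' implementation, and the topic modeling step is not shown.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical, already-tokenized tweets grouped by collection date.
tweets_by_date = {
    "20180320": [
        ["metoo", "support", "victim"],
        ["metoo", "movement", "support"],
        ["metoo", "victim", "testimony"],
    ],
    "20180820": [
        ["metoo", "trial", "verdict"],
        ["metoo", "verdict", "protest"],
    ],
}


def tf_idf(documents):
    """Return {term: score}, summing tf * idf over all documents."""
    n_docs = len(documents)
    # Document frequency: number of tweets that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = Counter()
    for doc in documents:
        counts = Counter(doc)
        for term, count in counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[term])
            scores[term] += tf * idf
    return dict(scores)


def cooccurrence(documents):
    """Count how often each unordered pair of terms shares a tweet."""
    pairs = Counter()
    for doc in documents:
        for a, b in combinations(sorted(set(doc)), 2):
            pairs[(a, b)] += 1
    return pairs


for date, docs in tweets_by_date.items():
    scores = tf_idf(docs)
    top = sorted(scores, key=scores.get, reverse=True)[:3]
    print(date, "top TF-IDF terms:", top)
    print(date, "most frequent pair:", cooccurrence(docs).most_common(1))
```

With this weighting, a term that appears in every tweet, such as the search keyword itself, receives a score of zero, so the ranking highlights the words that distinguish tweets within a given date.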

Fig. 1. Research Summary

Fig. 2. <20180320> TF-IDF Word Cloud

Fig. 3. <20180820> TF-IDF Word Cloud

Fig. 4. <20180320> Topic Modeling – Topic 10

Fig. 5. <20180320> Topic Modeling – Topic 20

Fig. 6. <20180820> Topic Modeling – Topic 10

Fig. 7. <20180820> Topic Modeling – Topic 20

Table 1. <20180320> Topic Modeling – Topic 10

Table 2. <20180820> Topic Modeling – Topic 10
