Fake News Detection for Korean News Using Text Mining and Machine Learning Techniques

텍스트 마이닝과 기계 학습을 이용한 국내 가짜뉴스 예측

  • Yun, Tae-Uk (Graduate School of Business IT, Kookmin University) ;
  • Ahn, Hyunchul (Graduate School of Business IT, Kookmin University)
  • Received : 2018.01.29
  • Accepted : 2018.03.15
  • Published : 2018.03.31

Abstract

Fake news is defined as news articles that are intentionally and verifiably false and could mislead readers. The spread of fake news may provoke anxiety, chaos, fear, or irrational decisions among the public. Thus, detecting fake news and preventing its spread has become a very important issue in our society. However, because of the huge amount of fake news produced every day, it is almost impossible for humans to identify it manually. In this context, researchers have tried over the past years to develop automated fake news detection methods using artificial intelligence techniques. Unfortunately, no prior study has proposed an automated fake news detection method for Korean news. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news content to be analyzed is converted to quantified values using various text mining techniques (topic modeling, TF-IDF, and so on). In the second step, classifiers are trained using the values produced in the first step. As classifiers, machine learning techniques such as multiple discriminant analysis, case-based reasoning, artificial neural networks, and support vector machines can be applied. To validate the effectiveness of the proposed method, we collected 200 Korean news articles from Seoul National University's FactCheck (http://factcheck.snu.ac.kr), which provides detailed analysis reports from about 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important as well as which classifiers are effective in detecting Korean fake news.
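The two-step structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes whitespace tokenization (Korean text would normally require a morphological analyzer), uses a smoothed TF-IDF weighting chosen here for illustration, and stands in a cosine-similarity nearest-neighbor lookup for case-based reasoning. The function names and the toy English headlines are hypothetical and are not drawn from the FactCheck dataset.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization; real Korean text needs morphological analysis.
    return text.lower().split()


def fit_idf(docs):
    # Step 1a: learn smoothed inverse document frequencies from the training texts.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(doc, idf):
    # Step 1b: convert one text to an L2-normalized sparse TF-IDF vector.
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    # Both vectors are already normalized, so the dot product is the cosine similarity.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def predict(doc, train_vecs, train_labels, idf, k=1):
    # Step 2: label a new text by majority vote among its k most similar stored cases.
    query = tfidf_vector(doc, idf)
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, invented purely for illustration.
train_docs = [
    "shocking miracle cure doctors hate revealed",
    "secret miracle pill melts fat overnight",
    "city council approves new budget for schools",
    "central bank keeps interest rate unchanged",
]
train_labels = ["fake", "fake", "real", "real"]

idf = fit_idf(train_docs)
train_vecs = [tfidf_vector(d, idf) for d in train_docs]

print(predict("miracle cure revealed by secret doctors", train_vecs, train_labels, idf))
print(predict("bank approves budget rate", train_vecs, train_labels, idf))
```

In this sketch the feature extractor and the classifier are decoupled, so the nearest-neighbor step could be swapped for any of the other classifiers named in the abstract without changing how the texts are quantified.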
