Animal Infectious Diseases Prevention through Big Data and Deep Learning

Blocking the Spread of Animal Infectious Diseases Using Big Data and Deep Learning

  • Kim, Sung Hyun (Big Data Project Team, Department of Big Data, National Information Society Agency) ;
  • Choi, Joon Ki (Platform Business Planning Office, BigData Business Unit, KT) ;
  • Kim, Jae Seok (Platform Business Planning Office, BigData Business Unit, KT) ;
  • Jang, Ah Reum (Platform Business Planning Office, BigData Business Unit, KT) ;
  • Lee, Jae Ho (Platform Business Planning Office, BigData Business Unit, KT) ;
  • Cha, Kyung Jin (Department of Business Administration, Kangwon National University) ;
  • Lee, Sang Won (Department of Computer & Engineering, Wonkwang University)
  • Received : 2018.10.18
  • Accepted : 2018.12.17
  • Published : 2018.12.31

Abstract

Animal infectious diseases such as avian influenza and foot-and-mouth disease occur almost every year and cause huge economic and social damage to the country. To prevent them, the quarantine authorities have invested considerable human and material resources, but outbreaks have continued. Avian influenza is known to have first appeared in 1878, and it became a national issue because of its high lethality. Foot-and-mouth disease is considered the most critical animal infectious disease internationally. In countries where it has not spread, foot-and-mouth disease is regarded as an economic or political disease, because it restricts international trade by complicating the import of processed and unprocessed livestock, and because quarantine is costly. In a society where the whole nation is connected as a single zone of daily life, the spread of infectious disease cannot be prevented completely. Hence, outbreaks need to be detected and acted on before the disease spreads. For both human and animal infectious diseases, once a case is confirmed, an epidemiological investigation of the confirmed case is carried out and measures to prevent the spread are taken according to its results. The foundation of epidemiological investigation is determining where one has been and whom one has met. From a data perspective, it can be defined as collecting and analyzing geographic data and relationship data in order to infer the cause and location of an outbreak and to predict future infection. Recently, attempts have been made to develop prediction models for infectious disease using Big Data and deep learning technology, but studies that build practically usable models, and case reports on them, remain scarce.
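The data view of epidemiological investigation described above (where a carrier has been, and what it has come into contact with) can be illustrated with a small sketch. This is not the authors' system: the visit log, the identifiers, and the function `exposed_facilities` are all hypothetical, and the propagation rule (a vehicle becomes a carrier once it visits an exposed facility and exposes every facility it visits on a later day) is a simplifying assumption.

```python
# Hypothetical visit log: (vehicle_id, facility_id, day). All values are made up.
VISITS = [
    ("truck_A", "farm_1", 1),
    ("truck_A", "farm_2", 2),
    ("truck_B", "farm_2", 3),
    ("truck_B", "farm_3", 4),
    ("truck_C", "farm_4", 2),
]


def exposed_facilities(visits, source, start_day):
    """Return {facility: earliest exposure day} reachable from `source`.

    Assumed rule (illustrative only): a vehicle becomes a carrier when it
    visits a facility that is already exposed; any facility it visits on a
    strictly later day is then marked as exposed from that day.
    """
    exposed = {source: start_day}   # facility -> day it became exposed
    carrier_since = {}              # vehicle  -> day it became a carrier

    # Process visits in chronological order so exposure only flows forward in time.
    for vehicle, facility, day in sorted(visits, key=lambda v: v[2]):
        # A carrier vehicle exposes a facility it visits after becoming a carrier.
        if vehicle in carrier_since and day > carrier_since[vehicle]:
            exposed.setdefault(facility, day)
        # A vehicle visiting an already exposed facility becomes a carrier.
        if facility in exposed and day >= exposed[facility]:
            carrier_since.setdefault(vehicle, day)
    return exposed


if __name__ == "__main__":
    print(exposed_facilities(VISITS, "farm_1", 0))
```

In this toy log, `farm_4` is never reached because `truck_C` has no contact with an exposed facility. A field system would also need geographic distance, vehicle type, and incubation periods, none of which are modeled here.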
KT and the Ministry of Science and ICT have been carrying out big data projects since 2014, as part of national R&D projects, to analyze and predict the routes of livestock-related vehicles. To prevent animal infectious diseases, the researchers first developed a prediction model based on regression analysis of vehicle movement data. A more accurate prediction model was then constructed using machine learning algorithms such as Logistic Regression, Lasso, Support Vector Machine and Random Forest. In particular, the 2017 prediction model added the diffusion risk to facilities, and its performance was improved by exploring a variety of modeling hyper-parameters. The confusion matrix and ROC curve show that the model constructed in 2017 is superior to the machine learning model. The 2017 model differs from the 2016 model in that it also uses visit information for facilities such as feed factories and slaughterhouses, and poultry information that was previously limited to chickens and ducks but now extends to geese and quail. In addition, in 2017 an explanation of the results was added to help the authorities make decisions and to establish a basis for persuading stakeholders. This study reports an animal infectious disease prevention system built on Big Data about hazardous vehicle movement, farms and the environment. Its significance is that it describes how a Big Data prediction model used in the field has evolved; the model is expected to become more complete if the form of the virus is taken into account. This will contribute to data utilization and analysis model development in related fields. In addition, we expect that the system constructed in this study will enable more proactive and effective disease prevention.
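As a minimal sketch of the kind of comparison the confusion matrix and ROC curve support, the following computes both measures for two sets of risk scores. The labels and scores are invented for illustration and are not the study's data or results; `confusion_matrix` and `roc_auc` are hypothetical helper names written here in plain Python rather than taken from any library.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def roc_auc(y_true, scores):
    """Area under the ROC curve.

    Computed as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    counting ties as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up outbreak labels (1 = outbreak) and risk scores from two hypothetical models.
Y_TRUE = [1, 1, 0, 0, 1, 0, 0, 0]
SCORES_A = [0.9, 0.4, 0.5, 0.2, 0.7, 0.1, 0.3, 0.6]
SCORES_B = [0.9, 0.8, 0.5, 0.2, 0.7, 0.1, 0.3, 0.4]

if __name__ == "__main__":
    for name, scores in (("model A", SCORES_A), ("model B", SCORES_B)):
        preds = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption
        print(name, confusion_matrix(Y_TRUE, preds), round(roc_auc(Y_TRUE, scores), 3))
```

The confusion matrix depends on the chosen threshold, while the AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.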

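Animal infectious diseases such as avian influenza and foot-and-mouth disease occur almost every year and cause enormous economic and social losses to the country. To prevent them, the quarantine authorities have devoted a variety of human and material efforts, but infectious diseases have continued to occur. Recently, attempts to develop prediction models for infectious diseases using big data and deep learning have begun, but model-building studies and case reports that can be used in practice are not yet active. Since 2014, KT and the Ministry of Science and ICT have been carrying out a big data project, as part of a national R&D program, to analyze and predict the movement routes of livestock-related vehicles. To prevent animal infectious diseases, the researchers initially developed a prediction model based on regression analysis of vehicle movement data. They subsequently built a more accurate prediction model using machine learning. In particular, the 2017 prediction model added the diffusion risk for facilities and improved performance by considering a variety of modeling hyper-parameters. The confusion matrix and ROC curve confirmed that the model built in 2017 outperforms the machine learning model. In 2017, an explanation of the results was also added, which helps the quarantine authorities make decisions and provides a basis for persuading stakeholders. This study is a case study of building an animal infectious disease prevention system using big data. The key model variable values, the resulting actual prediction performance, and the system construction process described in detail can contribute to the continued use of big data and the development of analysis models in the field of infectious disease prevention. We also expect that the system built in this study will enable more proactive and effective disease control.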

Figures

Figure 1. Current Status of World FMD (2013)

Figure 2. Current Status of Worldwide Human Damage Caused by HPAI (2003 ~ 2013)

Figure 3. Propagation Path of Infectious Disease (animal vs. person)

Figure 4. KAHIS System Concept

Figure 5. Data Analysis Process

Figure 6. 2015 Prediction Model

Figure 7. ROC Curve for Machine Learning Model

Figure 8. Average Error Curve for Neural Net

Figure 9. Risk Comment Screen Shot

Tables

Table 1. Animal Infection Disease Occurrence

Table 2. Data Status and Variable list in 2016 Model

Table 3. 2016 Diffusion Risk Modeling Status and Results

Table 4. AI Outbreak Count and Statistics

Table 5. Machine Learning Model Result

Table 6. Confusion Matrix

Table 7. Summary of Model Evaluation

