Evaporative demand drought index forecasting in Busan-Ulsan-Gyeongnam region using machine learning methods

  • Lee, Okjeong (Water Resources Management Research Center, K-water Research Institute) ;
  • Won, Jeongeun (Division of Earth Environmental System Science (Major of Environmental Engineering). Pukyong National University) ;
  • Seo, Jiyu (Division of Earth Environmental System Science (Major of Environmental Engineering). Pukyong National University) ;
  • Kim, Sangdan (Department of Environmental Engineering, Pukyong National University)
  • Received : 2021.05.25
  • Accepted : 2021.06.14
  • Published : 2021.08.31

Abstract

Drought is a major natural disaster that causes serious social and economic losses. Local drought forecasts can provide important information for drought preparedness. In this study, we propose a new machine learning model that predicts drought using historical drought indices and meteorological data observed at 10 sites from 1981 to 2020 in Busan-Ulsan-Gyeongnam, the southeastern part of the Korean Peninsula. Random Forest, XGBoost, and LightGBM models, with hyper-parameters tuned by Bayesian optimization, were constructed to predict the evaporative demand drought index on a 6-month time scale one month ahead. Model performance was compared between single-site models and a regional model. In addition, we examined whether performance could be improved by fine-tuning the regional model with data from an individual site.
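The difference between a single-site model and a regional model lies largely in how the training set is assembled. The following is a minimal sketch of that data-preparation step only, not the authors' implementation: the function names, the number of lagged months (`n_lags`), and the toy values are all hypothetical, and the sketch uses lagged index values alone, whereas the study also uses meteorological data.

```python
from typing import Dict, List, Tuple

Samples = Tuple[List[List[float]], List[float]]


def make_supervised(series: List[float], n_lags: int = 3, lead: int = 1) -> Samples:
    """Turn a monthly index series into (lagged inputs, target `lead` months ahead).

    Each input row holds the index at months t-n_lags+1 .. t; the target is the
    index at month t+lead. n_lags=3 is an illustrative assumption.
    """
    X, y = [], []
    for t in range(n_lags - 1, len(series) - lead):
        X.append(series[t - n_lags + 1 : t + 1])
        y.append(series[t + lead])
    return X, y


def make_regional(site_series: Dict[str, List[float]], n_lags: int = 3, lead: int = 1) -> Samples:
    """Pool the supervised samples of every site into one regional training set."""
    X_all, y_all = [], []
    for name in sorted(site_series):  # sorted for a reproducible sample order
        X, y = make_supervised(site_series[name], n_lags, lead)
        X_all.extend(X)
        y_all.extend(y)
    return X_all, y_all


# Toy index values for two hypothetical sites (not real observations).
toy = {
    "site_a": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "site_b": [5.0, 4.0, 3.0, 2.0, 1.0, 0.0],
}
X_single, y_single = make_supervised(toy["site_a"])
X_region, y_region = make_regional(toy)
```

A single-site model would be trained on `X_single, y_single`, and a regional model on the pooled `X_region, y_region`. Either set can then be passed to a tree-based regressor such as Random Forest, XGBoost, or LightGBM.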

Acknowledgement

This research was supported by the National Research Foundation of Korea with funding from the Korean government (Ministry of Science and ICT) (NRF-2019R1A2C1003114).
