The Comparison of Peach Price and Trading Volume Prediction Model Using Machine Learning Technique

Comparison of Prediction Models for Peach Auction Price and Trading Volume Using Machine Learning

  • Kim, Mihye (Department of Statistics, Division of Mathematics and Institute of Basic Science, Daegu University) ;
  • Hong, Sungmin (Department of Statistics, Division of Mathematics and Institute of Basic Science, Daegu University) ;
  • Yoon, Sanghoo (Division of Mathematics and Institute of Basic Science, Daegu University)
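  • Kim, Mihye (Division of Mathematics and Big Data, College of Science and Life Convergence, Daegu University) ;
  • Hong, Sungmin (Division of Mathematics and Big Data, College of Science and Life Convergence, Daegu University) ;
  • Yoon, Sanghoo (Division of Mathematics and Big Data, College of Science and Life Convergence, Daegu University)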
  • Received : 2018.11.20
  • Accepted : 2018.12.20
  • Published : 2018.12.31

Abstract

Fruit is known to be more sensitive to weather than other crops. To help farmers create higher value, a wholesale price model that accounts for weather is therefore needed. Peaches, which are produced under relatively limited conditions, were chosen as the subject of study. The data, provided by okdab 4.0, cover 2015 to 2017. The meteorological data used in the analysis were generated by weighting by cultivation area, and among the weather data from one to seven days earlier, the variables with high correlation were selected. Random forest, gradient boosting machine, and XGBoost were used for the analysis. XGBoost showed the best performance in terms of RMSE and correlation. Price was predicted comparatively well, but the accuracy of the trading volume prediction was not high. The top three weather variables affecting the prediction of peach trading volume were minimum temperature, average maximum temperature, and precipitation.
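The weather-feature construction described in this abstract (area weighting, then choosing among lags of one to seven days by correlation) can be illustrated with a minimal sketch. This is not the authors' code: the region names, the toy values, and the helper names `area_weighted`, `pearson`, and `best_lag` are all assumptions made for illustration, and the abstract does not state the exact selection rule, so the largest absolute Pearson correlation is used here as one plausible reading.

```python
import math


def area_weighted(series_by_region, area_by_region):
    """Combine regional daily series into one series, weighted by cultivation area."""
    total_area = sum(area_by_region.values())
    n_days = len(next(iter(series_by_region.values())))
    return [
        sum(series_by_region[r][d] * area_by_region[r] for r in area_by_region) / total_area
        for d in range(n_days)
    ]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def best_lag(weather, target, max_lag=7):
    """Return (lag, r) with the largest |r| between weather[t - lag] and target[t]."""
    best = None
    for lag in range(1, max_lag + 1):
        r = pearson(weather[:-lag], target[lag:])
        if best is None or abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Toy data (hypothetical): daily temperature in two growing regions and their areas.
temps = {
    "region_a": [20, 22, 19, 25, 27, 24, 21, 18, 23, 26, 28, 22, 20, 25, 29, 24],
    "region_b": [18, 21, 20, 23, 24, 25, 19, 17, 22, 27, 26, 21, 19, 23, 28, 22],
}
areas = {"region_a": 300.0, "region_b": 100.0}

weighted_temp = area_weighted(temps, areas)

# Synthetic price that depends on the weighted temperature three days earlier.
price = [60.0] * 3 + [100.0 - 2.0 * w for w in weighted_temp[:-3]]

lag, r = best_lag(weighted_temp, price)
print(lag, round(r, 3))
```

Because the synthetic price is built from the temperature three days earlier, the search recovers a lag of 3 by construction. With real auction data the correlations would be far weaker and the selection correspondingly less clear-cut.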

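Because fruit is affected by weather more than other crops, a crop model that accounts for weather needs to be developed so that farmers can create high added value. In this study, peaches, which are produced under relatively limited conditions among fruits, were selected as the subject, and we used data on peaches traded in Daegu from 2015 to 2017, provided by okdab 4.0. The meteorological data used in the analysis were generated by applying weights for cultivation area, and among the weather data from 1 to 7 days earlier, highly correlated variables were used. The analysis methods were random forest, gradient boosting machine, and XGBoost, all of which are machine learning methods. XGBoost showed the best performance. The auction price could be predicted relatively well, but the accuracy of the trading volume prediction was not very high. The top three weather variables affecting the prediction of peach trading volume were minimum temperature, average maximum temperature, and precipitation.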
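The models are compared by RMSE and correlation between observed and predicted values. A minimal sketch of that comparison follows. The model names `model_a` and `model_b`, the toy numbers, and the helper `rank_models` are hypothetical and are not results from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_models(actual, predictions):
    """Score each model by (RMSE, correlation) and sort by RMSE, lowest first."""
    scores = {
        name: (rmse(actual, pred), pearson(actual, pred))
        for name, pred in predictions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1][0])


# Toy held-out values (hypothetical), standing in for observed auction prices.
actual = [10.0, 12.0, 14.0, 16.0]
predictions = {
    "model_a": [11.0, 11.0, 15.0, 15.0],  # off by 1 on every day
    "model_b": [10.0, 13.0, 14.0, 16.0],  # off by 1 on a single day
}

ranked = rank_models(actual, predictions)
for name, (err, corr) in ranked:
    print(name, round(err, 3), round(corr, 3))
```

Sorting by RMSE alone is a simplification. When RMSE and correlation disagree, the ranking rule would have to be stated explicitly, and the abstract does not say how such cases were handled.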

Acknowledgement

Supported by : Daegu University
