Development of Product Recommender System using Collaborative Filtering and Stacking Model

  • Park, Sung-Jong (Department of Information and Statistics, Yonsei University) ;
  • Kim, Young-Min (Department of Bigdata Engineering, Soonchunhyang University) ;
  • Ahn, Jae-Joon (Department of Information and Statistics, Yonsei University)
  • Received : 2019.05.16
  • Accepted : 2019.06.20
  • Published : 2019.06.28

Abstract

People constantly strive to make better choices. Recommender systems were developed for this reason and have continued to evolve since the early 1990s. Among the available techniques, collaborative filtering has performed well in this field, and with the emergence of machine learning, research on machine-learning-based recommender systems has been actively conducted. This study builds a recommender system from real customer purchase data by combining collaborative filtering with a machine-learning-based stacking model, stacking being one of the ensemble methods. The recommendation performance of the proposed model is compared with that of existing collaborative filtering and machine-learning-based recommender systems, and the results confirm that the stacking model improves recommendation performance. The proposed model is expected to help individuals and firms make better choices when recommending products.
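
As a rough illustration of the stacking idea named in the abstract, the sketch below builds a "new data set" from the out-of-fold predictions of two base learners and trains a meta-learner on it. Everything here is an assumption made for illustration: the toy purchase features, the two base learners (k-nearest neighbors and nearest centroid), the lookup-table meta-learner, and all function names are hypothetical and are not taken from the paper, whose actual classifiers are not stated in this excerpt.

```python
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Base learner 1: majority label among the k nearest training rows."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def centroid_predict(train_X, train_y, x):
    """Base learner 2: label of the closest class centroid."""
    def squared_distance_to_centroid(label):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        return sum((c - v) ** 2 for c, v in zip(centroid, x))
    return min(sorted(set(train_y)), key=squared_distance_to_centroid)


def out_of_fold_rows(X, y, learners, n_folds=3):
    """Predict each row with base learners that never saw it during fitting."""
    rows = [None] * len(X)
    for fold in range(n_folds):
        kept = [i for i in range(len(X)) if i % n_folds != fold]
        fit_X, fit_y = [X[i] for i in kept], [y[i] for i in kept]
        for i in range(fold, len(X), n_folds):
            rows[i] = tuple(learn(fit_X, fit_y, X[i]) for learn in learners)
    return rows


def fit_meta_table(meta_rows, y):
    """Meta-learner: most frequent true label per pattern of base predictions."""
    counts = {}
    for row, label in zip(meta_rows, y):
        counts.setdefault(row, Counter())[label] += 1
    return {row: c.most_common(1)[0][0] for row, c in counts.items()}


def stacked_predict(X, y, learners, table, x, fallback=0):
    """Run the base learners on x, then let the meta table make the final call."""
    row = tuple(learn(X, y, x) for learn in learners)
    return table.get(row, fallback)


# Hypothetical toy data: two numeric customer features, label 1 = purchased.
X = [(1, 1), (1, 2), (2, 1), (2, 2), (1, 3), (2, 3),
     (8, 8), (8, 9), (9, 8), (9, 9), (8, 7), (9, 7)]
y = [0] * 6 + [1] * 6
learners = [knn_predict, centroid_predict]

meta_rows = out_of_fold_rows(X, y, learners)   # the stacked "new data set"
meta_table = fit_meta_table(meta_rows, y)
likely_buyer = stacked_predict(X, y, learners, meta_table, (8.5, 8.0))
unlikely_buyer = stacked_predict(X, y, learners, meta_table, (1.5, 1.5))
```

The base learners in this sketch are instance-based, so "fitting" simply means restricting them to the rows outside the held-out fold. With trainable classifiers the same structure applies, but each base model would be fitted once per fold and once more on the full training set before prediction.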

Fig. 1. Example of new data set generated by stacking

Fig. 2. Diagram of general stacking process

Fig. 3. Flowchart of proposed stacking architecture

Fig. 4. Misclassification rate (%) according to the number of neighbors in each window

Table 1. Example of confusion matrix

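The misclassification rates reported in the tables that follow can be read off a confusion matrix such as the one Table 1 refers to. The sketch below assumes a standard binary confusion matrix; the function name and the counts are hypothetical and are not values from the paper.

```python
def misclassification_rate(tp, fp, fn, tn):
    """Percentage of cases whose predicted class differs from the actual class."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return 100.0 * (fp + fn) / total


# Hypothetical counts: 40 true positives, 10 false positives,
# 5 false negatives, 45 true negatives.
rate = misclassification_rate(tp=40, fp=10, fn=5, tn=45)
```

Under these assumed counts, 15 of 100 cases are misclassified, so the rate is 15%.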

Table 2. Training period and test period assigned to each window

Table 3. Misclassification rate (%) of collaborative filtering when the K value is under 20

Table 4. Misclassification rates (%) of the proposed stacking model

Table 5. Misclassification rates (%) of machine learning classifiers without the stacking model
