Comparison of Models for Stock Price Prediction Based on Keyword Search Volume According to the Social Acceptance of Artificial Intelligence

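  • Yujung Cho (Department of Big Data Analytics, Kyung Hee University)
  • Kwonsang Sohn (School of Management, Kyung Hee University)
  • Ohbyung Kwon (School of Management, Kyung Hee University)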
  • Received : 2020.12.30
  • Accepted : 2021.03.08
  • Published : 2021.03.31

Abstract

Investor attention and the dissemination of stock-related information have recently been recognized as significant factors explaining stock returns and trading volume. Meanwhile, for companies that develop, distribute, or use innovative new technologies such as artificial intelligence, future stock returns and volatility are difficult to predict accurately because of macro-environmental and market uncertainty. This uncertainty is regarded as an obstacle to the activation and spread of artificial intelligence technology, so research is needed to mitigate it. The purpose of this study is therefore to propose a machine learning model that predicts the volatility of a company's stock price, using the internet search volume of artificial intelligence-related technology keywords as a measure of investor interest. To this end, we predict the stock market with VAR (Vector Autoregression) and the deep neural network LSTM (Long Short-Term Memory), and compare stock price prediction performance based on keyword search volume across the stages of the technology's social acceptance. We also analyze sub-technologies of artificial intelligence to examine how the search volume of detailed technology keywords changes across acceptance stages and how interest in a specific technology affects stock market forecasts. The keywords selected were artificial intelligence, deep learning, and machine learning. We collected the weekly internet search volume of these keywords over the five years from January 1, 2015, to December 31, 2019, together with stock price and trading volume data of KOSDAQ-listed companies. The results show that keyword search volume for artificial intelligence technology increased as the social acceptance of the technology progressed.
In particular, starting with the AlphaGo shock, search volume increased both for artificial intelligence itself and for detailed technologies such as machine learning and deep learning. In stock price prediction based on keyword search volume for each social acceptance stage classified in this study, prediction accuracy was highest in the awareness stage, and the acceptance stage with the best prediction performance differed by keyword. Therefore, when constructing a stock price prediction model using technology keywords, it is necessary to consider both the social acceptance of the technology and its sub-technology classification. The results of this study provide the following implications. First, to predict the return on investment for companies based on innovative technology, it is most important to capture the awareness stage, in which public interest in the technology rapidly increases. Second, the fact that keyword search volume and prediction accuracy vary with the social acceptance of a technology should be considered when developing decision support systems for investment, such as the big data-based robo-advisors recently introduced in the financial sector.

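Investor attention and the influence of stock-related information dissemination have recently come to the fore as major factors explaining stock returns and trading volume. Companies that develop, distribute, or seek to use innovative new technologies such as artificial intelligence also face the problem that their future stock returns and stock volatility are hard to predict because of the macro environment and market uncertainty. This is regarded as an obstacle to the activation of artificial intelligence. The purpose of this study is therefore to propose a machine learning model that predicts a company's stock price volatility, using the internet search volume of artificial intelligence-related technology keywords as a measure of investor interest. To this end, we predicted the stock market with the deep neural network LSTM (Long Short-Term Memory) and vector autoregression (Vector Autoregression) and, by comparing stock price prediction performance using keyword search volume across the stages of the technology's social acceptance, built a stock price prediction model that supports the prediction of corporate investment returns and investors' strategy decisions. We also analyzed the detailed sub-technologies of artificial intelligence to examine how the search volume of sub-technology keywords changes across acceptance stages and how interest in sub-technologies affects stock market prediction. For this purpose, we selected the keywords artificial intelligence, deep learning, and machine learning, and collected five years of weekly internet search volume data, from January 1, 2015, to December 31, 2019, together with stock price and trading volume data of KOSDAQ-listed companies. The analysis showed that keyword search volume for artificial intelligence technology increased as the social acceptance stage progressed, that stock price prediction based on technology keywords was most accurate in the awareness stage, and that the acceptance stage with the best prediction performance differed by keyword. Building a stock price prediction model with technology keywords therefore requires considering the sub-technology classification of the technology. The results suggest that, to predict corporate investment returns based on innovative technology, it is important to capture the awareness stage, in which public interest in the technology surges. They also suggest that, when developing investment decision support systems such as the big data-based robo-advisors recently introduced in the financial sector, the accuracy of prediction models can be improved by segmenting the social acceptance of the technology and using the changes in keyword search volume.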
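
The stage-wise comparison described in the abstract can be illustrated with a small sketch. This is not the authors' model: instead of a full VAR or an LSTM, it fits only a single lagged regression of the return on last week's return and last week's keyword search volume (the return equation of a VAR(1)), and scores it separately on each acceptance stage. The function names, the stage index ranges passed in, and the 70/30 train/holdout split are all assumptions made for illustration.

```python
from math import sqrt


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_lagged_ols(returns, search):
    """Fit r_t = c + a * r_{t-1} + b * s_{t-1} by ordinary least squares.

    Returns the coefficients [c, a, b].
    """
    rows = [[1.0, returns[t - 1], search[t - 1]] for t in range(1, len(returns))]
    targets = returns[1:]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    return solve_linear(xtx, xty)


def rmse(coef, returns, search):
    """Root mean squared one-step-ahead error of the fitted equation."""
    errors = [
        returns[t] - (coef[0] + coef[1] * returns[t - 1] + coef[2] * search[t - 1])
        for t in range(1, len(returns))
    ]
    return sqrt(sum(e * e for e in errors) / len(errors))


def rmse_by_stage(returns, search, stages, train_frac=0.7):
    """Fit and score the regression separately within each stage.

    `stages` maps a (hypothetical) stage label to a (start, end) index range
    over the weekly series; the split fraction is an assumption.
    """
    scores = {}
    for name, (start, end) in stages.items():
        r, s = returns[start:end], search[start:end]
        cut = int(len(r) * train_frac)
        coef = fit_lagged_ols(r[:cut], s[:cut])
        # Start the holdout one week early so its first target has a lag.
        scores[name] = rmse(coef, r[cut - 1:], s[cut - 1:])
    return scores
```

Calling `rmse_by_stage(weekly_returns, weekly_search_volume, {"awareness": (0, 60), "later": (60, 260)})`, with placeholder series and stage boundaries, returns one holdout error per stage. Comparing those errors across stages and across keywords mirrors the structure of the comparison the study reports, though with a far simpler predictor.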
