Time-Series Data Prediction using Hidden Markov Model and Similarity Search for CRM

  • Published : 2009.05.31

Abstract

Predicting time-series data is a long-standing research problem, and many methods have been proposed in the literature. This paper proposes a method that uses a Hidden Markov Model (HMM) and likelihood to measure similarity among time-series data and to determine the future direction of the data. A query sequence is first modeled with an HMM. The model is then evaluated against subsequences extracted from previously recorded time series, and the subsequence with the highest likelihood under the model is taken as the most similar one. The values that immediately follow the chosen subsequence are used to predict the next phase of the query sequence. Experiments with different parameter settings were conducted to assess the validity of the method, using the KOSPI index as test data.

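The problem of time-series prediction has long been studied by many researchers, and many prediction methods have been proposed. This paper proposes a method that predicts the future direction of time-series data through similarity search based on a Hidden Markov Model and likelihood. The method searches previously recorded time-series data for a portion similar to the query sequence and uses the values that follow that portion to predict the series. First, a Hidden Markov Model is built for the given query sequence. Subsequences of a fixed length are then extracted sequentially from the time-series data, and the likelihood of each extracted subsequence under the model is computed. Among the extracted subsequences, the one with the highest likelihood is chosen as the similar sequence, and the values that follow it are extracted to estimate the values following the query sequence. In the experiments, the predicted values showed considerable similarity to the actual values. The validity of the proposed method is verified by experiments on the Korea Composite Stock Price Index (KOSPI).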
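
The procedure described in the abstract (model the query with an HMM, slide a fixed-length window over the recorded series, score each window by likelihood, and read off what follows the best match) can be sketched roughly as below. This is only an illustration under several assumptions that the abstract does not state: prices are first symbolized into DOWN/FLAT/UP moves, the HMM is a small discrete model trained with Baum-Welch, and the state count, tolerance `tol`, smoothing constant `eps`, and all function names are hypothetical choices made for the example.

```python
import math
import random

DOWN, FLAT, UP = 0, 1, 2  # hypothetical symbol alphabet for price moves


def to_symbols(prices, tol=0.001):
    """Map consecutive prices to DOWN/FLAT/UP by relative change (tol is an assumption)."""
    out = []
    for prev, cur in zip(prices, prices[1:]):
        change = (cur - prev) / prev
        out.append(UP if change > tol else DOWN if change < -tol else FLAT)
    return out


def forward(obs, pi, A, B):
    """Scaled forward pass; returns (scaled alphas, scale factors, log-likelihood).
    Assumes every observed symbol has nonzero probability in at least one reachable state."""
    n = len(pi)
    alphas, scales, prev = [], [], None
    for t, o in enumerate(obs):
        if t == 0:
            a = [pi[i] * B[i][o] for i in range(n)]
        else:
            a = [sum(prev[i] * A[i][j] for i in range(n)) * B[j][o] for j in range(n)]
        c = sum(a)
        prev = [x / c for x in a]
        alphas.append(prev)
        scales.append(c)
    return alphas, scales, sum(math.log(c) for c in scales)


def backward(obs, A, B, scales):
    """Backward pass using the same scale factors as the forward pass."""
    n, T = len(A), len(obs)
    betas = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        betas[t] = [sum(A[i][j] * B[j][obs[t + 1]] * betas[t + 1][j] for j in range(n))
                    / scales[t + 1] for i in range(n)]
    return betas


def _normalize(row):
    s = sum(row)
    return [x / s for x in row]


def fit_hmm(obs, n_states=2, n_symbols=3, iters=20, eps=1e-3, seed=0):
    """Baum-Welch on one symbol sequence; eps smoothing keeps unseen symbols possible."""
    rng = random.Random(seed)
    rand_row = lambda k: _normalize([rng.random() + 0.5 for _ in range(k)])
    pi = rand_row(n_states)
    A = [rand_row(n_states) for _ in range(n_states)]
    B = [rand_row(n_symbols) for _ in range(n_states)]
    T, S = len(obs), range(n_states)
    for _ in range(iters):
        alphas, scales, _ll = forward(obs, pi, A, B)
        betas = backward(obs, A, B, scales)
        gamma = [[alphas[t][i] * betas[t][i] for i in S] for t in range(T)]
        new_A = [_normalize([eps + sum(alphas[t][i] * A[i][j] * B[j][obs[t + 1]]
                                       * betas[t + 1][j] / scales[t + 1]
                                       for t in range(T - 1)) for j in S]) for i in S]
        new_B = [_normalize([eps + sum(gamma[t][i] for t in range(T) if obs[t] == k)
                             for k in range(n_symbols)]) for i in S]
        pi, A, B = _normalize(gamma[0]), new_A, new_B
    return pi, A, B


def predict_next(history, query, horizon=1):
    """Return (symbols following the best-matching window, its start index, its log-likelihood)."""
    m = len(query)
    if len(history) < m + horizon:
        raise ValueError("history is too short for this query length and horizon")
    pi, A, B = fit_hmm(query)
    best_start, best_ll = -1, -math.inf
    for s in range(len(history) - m - horizon + 1):
        ll = forward(history[s:s + m], pi, A, B)[2]
        if ll > best_ll:
            best_start, best_ll = s, ll
    end = best_start + m
    return history[end:end + horizon], best_start, best_ll
```

A call such as `predict_next(to_symbols(past_prices), to_symbols(recent_prices), horizon=5)` would return the five moves that followed the most likely historical window, where `past_prices` and `recent_prices` are placeholder lists of index values. Note that a likelihood score favors windows the model finds probable, which is not always the window closest to the query point by point, so the match should be read as "similar under the model" rather than as a nearest-neighbor distance.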
