Improving SVM Classification by Constructing Ensemble


  • Hong-Mo Je (Dept. of Computer Engineering, POSTECH)
  • Sung-Yang Bang (Dept. of Computer Engineering, POSTECH)
  • Published: 2003.04.01

Abstract

A support vector machine (SVM) is expected, in theory, to provide good generalization performance, but the performance of an actually implemented SVM often falls far short of the theoretical level. This is largely because the implementation relies on an approximated algorithm, forced by the high time and space complexity of the exact one. To overcome this limitation, we propose an ensemble of SVMs constructed with Bagging (bootstrap aggregating) and Boosting. In the Bagging stage, each individual SVM is trained independently on training samples drawn at random via the bootstrap technique. In the Boosting stage, each individual SVM is trained on samples chosen according to a probability distribution; the distribution is updated from the errors of the classifiers trained so far, and the process is iterated. After the training stage, the individual SVMs are aggregated into a collective decision in several ways, such as majority voting, LSE (least squares estimation)-based weighting, and double-layer hierarchical combining. Simulation results for IRIS data classification, handwritten digit recognition, and face detection show that the proposed SVM ensembles greatly outperform a single SVM in terms of classification accuracy.
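The sketch below illustrates the Bagging stage and two of the aggregation rules just described. It is a minimal illustration, not the paper's implementation: scikit-learn's SVC stands in for the authors' SVM solver, and IRIS is reduced to a binary problem relabeled to {-1, +1} for clarity. The Boosting variant would differ only in drawing each member's training set from an error-driven distribution rather than a uniform bootstrap, and the double-layer hierarchical combiner would replace the two rules below with a second-stage SVM trained on the members' outputs.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Binary slice of the IRIS data, relabeled to {-1, +1}.
X, y = load_iris(return_X_y=True)
X, y = X[y < 2], np.where(y[y < 2] == 0, -1, 1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Bagging stage: each member SVM is trained independently on its own
# bootstrap resample of the training set.
members = []
for _ in range(11):  # an odd member count avoids voting ties
    idx = rng.integers(0, len(X_tr), size=len(X_tr))
    members.append(SVC(kernel="rbf").fit(X_tr[idx], y_tr[idx]))

# Aggregation rule 1: majority voting over the members' hard decisions.
votes = np.stack([m.predict(X_te) for m in members])
y_vote = np.sign(votes.sum(axis=0))

# Aggregation rule 2: LSE-based weighting -- solve min_w ||F w - y||^2 on
# the training outputs F (one column per member), then threshold F_test w.
F_tr = np.stack([m.predict(X_tr) for m in members], axis=1).astype(float)
w, *_ = np.linalg.lstsq(F_tr, y_tr.astype(float), rcond=None)
F_te = np.stack([m.predict(X_te) for m in members], axis=1).astype(float)
y_lse = np.sign(F_te @ w)

print("majority voting accuracy:", (y_vote == y_te).mean())
print("LSE weighting accuracy  :", (y_lse == y_te).mean())
```

On these two easily separable IRIS classes both rules behave similarly; the abstract reports that the ensemble's advantage over a single SVM shows up across the IRIS, handwritten-digit, and face-detection experiments.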


