Evaluations of predicted models fitted for data mining - comparisons of classification accuracy and training time for 4 algorithms

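Evaluation of predictive models fitted with data mining techniques: a comparison of the misclassification rates and training times of four classification models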

  • Lee, Sang-Bock (Department of Information Statistics, Catholic University of Daegu)
  • Lee, Sang-Bock (Division of Applied Sciences, Catholic University of Daegu)
  • Published : 2001.10.30

Abstract

CHAID, logistic regression, bagging trees, and bagging logistic regression are compared on HMEQ, an artificial data set supplied with SAS, in terms of classification accuracy and training time. Bagging trees achieve the lowest error rate, although their run time is longer than that of the other models. Logistic regression has the shortest run time among the models considered, but no model is uniformly efficient on both criteria.

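Misclassification rates and training times were computed, by sample size, for four predictive classification models: CHAID (a decision tree model), logistic regression, and a bagging model built on each of the two. The efficiency of these algorithms was then evaluated by comparing the models in a simulation study. The bagging decision tree model had a low misclassification rate, but its training time was the longest of the four.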
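The kind of comparison described in the abstract, misclassification rate against training time for a single model and its bagged version, can be sketched as follows. This is only an illustration under stated assumptions: it uses synthetic data rather than HMEQ, a one-split decision stump rather than CHAID, and plain Python rather than SAS. All function names (`make_data`, `fit_stump`, `fit_bagging`, `evaluate`) are hypothetical, and the numbers it prints are not the paper's results.

```python
import random
import time
from collections import Counter


def make_data(n, rng):
    """Synthetic binary data (an assumption): y = 1 when x0 + x1 > 1, with 10% label noise."""
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        y = int(x[0] + x[1] > 1.0)
        if rng.random() < 0.1:
            y = 1 - y
        rows.append((x, y))
    return rows


def fit_stump(rows):
    """Pick the (feature, threshold, polarity) split with the fewest training errors."""
    best = None
    for f in range(2):
        for t in sorted({x[f] for x, _ in rows}):
            for pol in (0, 1):
                errs = sum((int(x[f] > t) ^ pol) != y for x, y in rows)
                if best is None or errs < best[0]:
                    best = (errs, f, t, pol)
    _, f, t, pol = best
    return lambda x: int(x[f] > t) ^ pol


def fit_bagging(rows, n_models, rng):
    """Bagging: fit one stump per bootstrap resample and predict by majority vote."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]  # draw with replacement
        models.append(fit_stump(sample))

    def predict(x):
        votes = Counter(m(x) for m in models)
        return votes.most_common(1)[0][0]

    return predict


def error_rate(model, rows):
    """Fraction of rows whose predicted label differs from the true label."""
    return sum(model(x) != y for x, y in rows) / len(rows)


def evaluate(name, fit, train, test):
    """Time the fit on the training set, then score on the held-out test set."""
    start = time.perf_counter()
    model = fit(train)
    elapsed = time.perf_counter() - start
    return {"model": name, "error": error_rate(model, test), "seconds": elapsed}


rng = random.Random(0)
train, test = make_data(200, rng), make_data(500, rng)
results = [
    evaluate("single stump", fit_stump, train, test),
    # An odd number of models avoids tied votes.
    evaluate("bagged stumps (25)", lambda r: fit_bagging(r, 25, rng), train, test),
]
for r in results:
    print(f"{r['model']:<20} error={r['error']:.3f} train_time={r['seconds']:.4f}s")
```

Because bagging refits the base learner once per bootstrap sample, its training time grows roughly in proportion to the number of resamples, which is the accuracy versus run time trade-off the abstract reports.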
