Feature Selection by Genetic Algorithm and Information Theory

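  • 조재훈 (School of Electrical, Electronic and Computer Engineering, Chungbuk National University)
  • 이대종 (BK21 Chungbuk Information Technology Project Group, Chungbuk National University)
  • 송창규 (BK21 Chungbuk Information Technology Project Group, Chungbuk National University)
  • 김용삼 (School of Electrical, Electronic and Computer Engineering, Chungbuk National University)
  • 전명근 (School of Electrical, Electronic and Computer Engineering, Chungbuk National University)
  • Published: 2008.02.25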

Abstract

Feature selection is an important technique for improving classifier performance in pattern classification. In particular, when the data have a large number of features or variables, classifier accuracy can be improved by selecting a relevant feature subset and discarding irrelevant, redundant, or noisy features. In this paper we propose a feature selection method that combines a genetic algorithm with mutual information from information theory. Experimental results show that the proposed method achieves better performance on pattern recognition problems than conventional methods.
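The abstract does not give the paper's chromosome encoding, fitness function, or genetic operators, so the following is only a minimal sketch of how a genetic algorithm and mutual information might be combined for feature selection. Everything in it is an assumption made for illustration: the bit-mask encoding, the relevance-minus-redundancy fitness with weight `beta`, truncation selection, one-point crossover, and the names `mutual_information`, `fitness`, and `select_features`. It also assumes discrete-valued features and at least two feature columns.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def fitness(mask, columns, labels, beta=0.5):
    """Hypothetical criterion: mean relevance minus beta * mean pairwise redundancy."""
    chosen = [i for i, bit in enumerate(mask) if bit]
    if not chosen:
        return float("-inf")  # an empty subset is never acceptable
    relevance = sum(mutual_information(columns[i], labels) for i in chosen) / len(chosen)
    pairs = [(i, j) for k, i in enumerate(chosen) for j in chosen[k + 1:]]
    if not pairs:
        return relevance
    redundancy = sum(mutual_information(columns[i], columns[j]) for i, j in pairs) / len(pairs)
    return relevance - beta * redundancy


def select_features(columns, labels, pop_size=20, generations=30,
                    mutation_rate=0.1, seed=0):
    """Evolve bit masks over the feature columns; return the selected indices."""
    rng = random.Random(seed)
    n = len(columns)  # assumed to be at least 2

    def score(mask):
        return fitness(mask, columns, labels)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        population = parents + children
    best = max(population, key=score)
    return [i for i, bit in enumerate(best) if bit]


if __name__ == "__main__":
    labels = [0, 1] * 8
    columns = [
        list(labels),      # feature 0: fully informative about the label
        [0, 0, 1, 1] * 4,  # feature 1: statistically independent of the label
        [0] * 16,          # feature 2: constant, carries no information
    ]
    print(select_features(columns, labels))
```

In this toy data only feature 0 shares information with the label, so a fitness of this form rewards masks that keep it and drop the other two. Continuous features would need discretization or a different mutual information estimator before a sketch like this could be applied.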
