Breast Cancer Classification using Deep Learning-based Ensemble

  • Choi, Do-Yeon (Department of Statistics, Busan National University) ;
  • Jeong, Kwang-Mo (Department of Statistics, Busan National University) ;
  • Lim, Dong Hoon (Department of Information Statistics and RINS, Gyeongsang National University)
  • Received : 2018.02.25
  • Accepted : 2018.05.24
  • Published : 2018.05.31

Abstract

Objectives: We propose a deep learning-based ensemble to improve breast cancer classification and compare it with six existing models, including a deep neural network, on two UCI datasets.

Methods: We propose a deep learning-based stacking ensemble method. We first applied five classification methods individually: k-nearest neighbor, decision trees, support vector machines, discriminant analysis, and logistic regression analysis. We then applied a deep learning model to the predictions derived from these methods using 5-fold cross-validation. We compared the proposed deep learning-based ensemble with these methods on two UCI datasets in terms of classification accuracy, ROC curves, and c-statistics.

Results: Experimental results on the two UCI datasets showed that the proposed deep learning-based ensemble outperformed the single k-nearest neighbor, decision tree, support vector machine, discriminant analysis, and logistic regression models, as well as the deep neural network, in terms of various performance measures.

Conclusions: We proposed a deep learning-based ensemble for improving breast cancer classification. The ensemble outperformed the existing single models in all applications in terms of various performance measures.
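The stacking procedure described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, takes the Wisconsin Diagnostic Breast Cancer data bundled with that library as a stand-in for the UCI data, and uses a small multilayer perceptron as the deep learning meta-learner. The hyperparameters, the 70/30 train/test split, and all variable names are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled Wisconsin Diagnostic Breast Cancer data
# stands in for the UCI data; the abstract does not name the two datasets.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The five base classifiers named in the abstract. Scaling is added for the
# distance- and margin-based models (an illustrative choice).
base_learners = [
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ("lda", LinearDiscriminantAnalysis()),
    ("logit", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
]

# Assumption: a small MLP plays the role of the deep learning meta-learner.
meta_learner = MLPClassifier(
    hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0
)

# cv=5 makes the meta-learner train on out-of-fold predictions from
# 5-fold cross-validation of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=meta_learner,
    cv=5,
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

# Evaluate with classification accuracy and the c-statistic (area under the ROC curve).
proba = stack.predict_proba(X_test)[:, 1]
accuracy = accuracy_score(y_test, stack.predict(X_test))
c_statistic = roc_auc_score(y_test, proba)
print(f"accuracy={accuracy:.3f}, c-statistic={c_statistic:.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for data the base models have already seen, is what keeps the second stage from simply learning which base model overfits the most.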

Acknowledgement

Supported by: National Research Foundation of Korea (NRF)

Cited by

  1. Classification algorithm for liver lesions in ultrasound images using ensemble deep learning, vol.20, pp.4, 2020, https://doi.org/10.7236/jiibc.2020.20.4.101