Prediction & Assessment of Change Prone Classes Using Statistical & Machine Learning Techniques

  • Malhotra, Ruchika (Dept. of Software Engineering, Delhi Technological University)
  • Jangra, Ravi (Dept. of Software Engineering, Delhi Technological University)
  • Received : 2013.11.21
  • Accepted : 2014.10.28
  • Published : 2017.08.31

Abstract

Software has become an inseparable part of our lives. To meet ever-growing customer demands, it has to evolve rapidly and absorb a large number of changes. In this paper, we study the relationship between object-oriented metrics and the change proneness of a class. Prediction models based on this relationship can help identify the change-prone classes of a software system, so that testing effort can be focused on those classes to yield better-quality software. Previous research has mostly used statistical methods to predict change-prone classes; machine learning methods have rarely been used for this purpose. In our study, we evaluate the performance of ten machine learning methods and compare it with that of the statistical method. The evaluation is based on two open source software systems developed in Java. We also validated the developed prediction models on a data set from another software system in the same domain (3D modelling). The performance of the prediction models was evaluated using receiver operating characteristic (ROC) analysis. The results indicate that the machine learning methods perform on par with the statistical method in predicting change-prone classes. A further analysis showed that models constructed for one software system can also be used to predict the change proneness of classes in another software system in the same domain. This study should help developers perform effective regression testing at low cost and effort. It should also help developers produce designs with fewer change-prone classes and hence better maintainability.
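As an illustration of the receiver operating characteristic analysis mentioned above, the following is a minimal sketch of how the area under the ROC curve (AUC) can be computed for a change-proneness classifier. It is not the authors' implementation, and the abstract does not say which tool computed the curves. The function name `roc_auc`, the labels, and the scores are hypothetical and chosen only for illustration. The sketch uses the pairwise-comparison form of AUC: the fraction of (change-prone, not change-prone) class pairs in which the change-prone class receives the higher predicted score, with ties counted as half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a change-prone class, 0 otherwise (hypothetical data).
    scores: predicted probability of change proneness for each class.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical predictions for four classes of a system.
    labels = [1, 0, 1, 0]
    scores = [0.9, 0.1, 0.4, 0.6]
    print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is what makes the measure convenient for comparing models built with different learning methods on the same data set.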
