Optimal Threshold from ROC and CAP Curves

  • Hong, Chong-Sun (Department of Statistics, Sungkyunkwan University)
  • Choi, Jin-Soo (Research Institute of Applied Statistics, Sungkyunkwan University)
  • Published: 2009.10.31

Abstract

Receiver Operating Characteristic (ROC) and Cumulative Accuracy Profile (CAP) curves are two methods used to assess the discriminatory power of credit-rating approaches. The point of optimal classification accuracy on an ROC curve and the point of maximal profit on a CAP curve can be found using iso-performance tangent lines, which are based on the standard notion of accuracy. In this paper, we propose an alternative accuracy measure called the true rate. Using this rate, one can obtain alternative optimal threshold points on both ROC and CAP curves. In most real populations of borrowers, defaults are far fewer than non-defaults, and in such cases the true rate may be more efficient than the accuracy rate in terms of cost functions. Moreover, it is shown that the alternative scores for optimal classification accuracy and for maximal profit are identical, and that this single score coincides with the score corresponding to the Kolmogorov-Smirnov statistic used to test the homogeneity of the distribution functions of the defaults and non-defaults.
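The link between the true rate and the Kolmogorov-Smirnov statistic can be illustrated with a minimal sketch. The abstract does not define the true rate, so the code below assumes it is the sum of the true positive rate and the true negative rate; the function names, the sample scores, and the convention that a lower score means a riskier borrower are all hypothetical. Under that assumption, the true rate at a cutoff equals one plus the gap between the two empirical distribution functions, so the cutoff that maximizes one also maximizes the other.

```python
from bisect import bisect_right


def ks_threshold(default_scores, nondefault_scores):
    """Return (score, gap) maximizing F_D(s) - F_N(s) over observed scores.

    Assumption: lower scores are riskier, so a borrower is classified as a
    default when score <= s. F_D and F_N are the empirical CDFs of the
    default and non-default scores. The signed gap is used here; it matches
    the Kolmogorov-Smirnov statistic when defaults tend to score lower.
    """
    d = sorted(default_scores)
    n = sorted(nondefault_scores)
    best_score, best_gap = None, float("-inf")
    for s in sorted(set(d) | set(n)):
        f_d = bisect_right(d, s) / len(d)  # share of defaults flagged (TPR)
        f_n = bisect_right(n, s) / len(n)  # share of non-defaults flagged (FPR)
        if f_d - f_n > best_gap:
            best_score, best_gap = s, f_d - f_n
    return best_score, best_gap


def true_rate(default_scores, nondefault_scores, s):
    """Assumed definition: true positive rate plus true negative rate at cutoff s."""
    tpr = sum(x <= s for x in default_scores) / len(default_scores)
    tnr = sum(x > s for x in nondefault_scores) / len(nondefault_scores)
    return tpr + tnr


# Hypothetical scores: few defaults, more non-defaults.
defaults = [1, 2, 2, 3, 5]
nondefaults = [3, 4, 5, 6, 6, 7, 8, 9]

score, ks = ks_threshold(defaults, nondefaults)
print(score, ks, true_rate(defaults, nondefaults, score))
```

Because the true rate in this sketch weights the two classes equally regardless of their sizes, a small number of defaults does not get swamped by the non-defaults, which is the situation the abstract highlights.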
