Optimal Thresholds from Non-Normal Mixture

Hong, Chong-Sun;Joo, Jae-Seon;

doi:10.5351/KJAS.2010.23.5.943

The Korean Journal of Applied Statistics (응용통계연구)

Volume 23 Issue 5 / Pages 943-953 / 2010 / 1225-066X (pISSN) / 2383-5818 (eISSN)

The Korean Statistical Society (한국통계학회)

비정규 혼합분포에서의 최적분류점

Hong, Chong-Sun (Department of Statistics, Sungkyunkwan University) ;
Joo, Jae-Seon (Statistics and Panel Center, Korean Women's Development Institute)

홍종선 (성균관대학교 통계학과) ;
주재선 (한국여성정책연구원 통계패널센터)

Received : 2010.06
Accepted : 2010.08
Published : 2010.10.31

https://doi.org/10.5351/KJAS.2010.23.5.943

Abstract

There are many methods for estimating an optimal threshold from a mixture distribution of the score random variable used in credit evaluation. Most of the existing research is based on the assumption of normal distributions. In this paper, we extend this work to non-normal distributions such as the Weibull, logistic and gamma distributions, estimating optimal thresholds by a hypothesis testing method and by methods that maximize the total accuracy and the true rate. The type I and type II errors are obtained for each threshold and their sums are compared. Finally, we discuss the efficiency of the methods and draw conclusions for non-normal distributions.

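In credit evaluation research, optimal thresholds have been estimated from the relationship between the probability density functions of a mixture distribution defined by the score random variable and the parameter space of normal and default states, and the corresponding error sums have been compared, under the assumption of normal distributions. This study extends that work to the non-normal Weibull, logistic and gamma distributions. Optimal thresholds are obtained by a method based on hypothesis testing and by methods that maximize the total accuracy and the true rate, and the sums of type I and type II errors corresponding to each optimal threshold are compared in order to discuss the efficiency of the methods.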
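To make the idea of a threshold chosen from a non-normal mixture concrete, the sketch below may help. It is not the authors' procedure: the two Weibull components, their parameters, the mixing proportion, the rule "score at or below the threshold is classified as default", and the grid search are all assumptions chosen for illustration. It finds one threshold that minimizes the sum of the two misclassification rates and another that maximizes the total accuracy.

```python
import math


def weibull_cdf(x, shape, scale):
    """CDF of a Weibull(shape, scale) distribution; zero for x <= 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


# Hypothetical (shape, scale) parameters: defaulted borrowers tend to score lower.
DEFAULT = (2.0, 3.0)
NORMAL = (4.0, 7.0)
P_DEFAULT = 0.2  # assumed mixing proportion of the default state


def error_rates(c):
    """Assumed rule: a score <= c is classified as default."""
    missed_default = 1.0 - weibull_cdf(c, *DEFAULT)  # default classified as normal
    false_default = weibull_cdf(c, *NORMAL)          # normal classified as default
    return missed_default, false_default


def total_accuracy(c):
    """Share of the whole mixture that is classified correctly at threshold c."""
    missed_default, false_default = error_rates(c)
    return P_DEFAULT * (1.0 - missed_default) + (1.0 - P_DEFAULT) * (1.0 - false_default)


# Grid search over candidate thresholds in (0, 12].
grid = [i / 1000 for i in range(1, 12001)]
c_sum = min(grid, key=lambda c: sum(error_rates(c)))  # minimizes the sum of errors
c_acc = max(grid, key=total_accuracy)                 # maximizes total accuracy

print("threshold minimizing the error sum:", c_sum, "error sum:", sum(error_rates(c_sum)))
print("threshold maximizing total accuracy:", c_acc, "accuracy:", total_accuracy(c_acc))
```

Under these assumed parameters the two criteria give different thresholds, because total accuracy weights each error by the mixing proportion while the error sum weights them equally. The threshold minimizing the error sum sits where the two component densities are equal.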

Cited by

Alternative Optimal Threshold Criteria: MFR vol.27, pp.5, 2014, https://doi.org/10.5351/KJAS.2014.27.5.773
Alternative accuracy for multiple ROC analysis vol.25, pp.6, 2014, https://doi.org/10.7465/jkdi.2014.25.6.1521
Optimal thresholds criteria for ROC surfaces vol.24, pp.6, 2013, https://doi.org/10.7465/jkdi.2013.24.6.1489
Bivariate ROC Curve and Optimal Classification Function vol.19, pp.4, 2012, https://doi.org/10.5351/CKSS.2012.19.4.629
Statistical Fingerprint Recognition Matching Method with an Optimal Threshold and Confidence Interval vol.25, pp.6, 2012, https://doi.org/10.5351/KJAS.2012.25.6.1027
Two optimal threshold criteria for ROC analysis vol.26, pp.1, 2015, https://doi.org/10.7465/jkdi.2015.26.1.255
