Multivariate empirical distribution functions and descriptive methods

  • Hong, Chong Sun (홍종선; Department of Statistics, Sungkyunkwan University)
  • Park, Jun (박준; Department of Statistics, Sungkyunkwan University)
  • Park, Yong Ho (박용호; Department of Statistics, Sungkyunkwan University)
  • Received : 2016.11.23
  • Accepted : 2017.01.16
  • Published : 2017.01.31

Abstract

This work defines the multivariate empirical distribution function (MEDF). We derive the expectation and variance of the MEDF and show that it converges to the true distribution function. Using random samples from bivariate standard normal distributions with various correlation coefficients, we obtain bivariate MEDFs and propose two graphical methods for displaying them on a two-dimensional plane. The first draws at most n stairs, in the manner of a step function; the second draws at most n curves that resemble bivariate quantile vectors. Although both plots could be drawn in three-dimensional space, the two-dimensional versions are easy to construct and are sufficient to explain the characteristics of bivariate distribution functions. By the same reasoning, trivariate empirical distribution functions can be visualized with three-dimensional quantile vectors. The proposed MEDF plots are constructed and explored with bivariate and four-variate illustrative examples.
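
As a rough illustration of the quantity discussed in the abstract, the sketch below evaluates an empirical distribution function for bivariate normal samples. It assumes the standard componentwise definition, F_n(x) = (1/n) * #{i : X_i <= x in every coordinate}, which may differ in detail from the definition given in the paper. Under this standard definition the estimator has expectation F(x) and variance F(x)(1 - F(x))/n at each fixed x. The function name `medf`, the sample size 200, and the correlation 0.5 are choices made for this example only.

```python
import numpy as np

def medf(sample, points):
    """Standard multivariate EDF of `sample` (shape n x p) at `points` (shape m x p).

    Assumption: componentwise definition, not necessarily the paper's exact one.
    """
    sample = np.asarray(sample, dtype=float)
    points = np.asarray(points, dtype=float)
    # below[j, i] is True when observation i is <= point j in every coordinate
    below = np.all(sample[None, :, :] <= points[:, None, :], axis=2)
    # Fraction of observations dominated by each evaluation point
    return below.mean(axis=1)

# Bivariate standard normal sample with an illustrative correlation of 0.5
rng = np.random.default_rng(0)
rho = 0.5
cov = np.array([[1.0, rho], [rho, 1.0]])
sample = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=200)

# Evaluate the EDF at the observations themselves
values = medf(sample, sample)
```

Evaluated at the n observations, the function returns only multiples of 1/n, which is consistent with the abstract's remark that each plot consists of at most n stairs or curves.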
