Sample-spacing Approach for the Estimation of Mutual Information

  • Huh, Moon-Yul (Dept. of Statistics, Sungkyunkwan University)
  • Cha, Woon-Ock (Dept. of Multimedia Engineering, Hansung University)
  • Published: 2008.04.30

Abstract

Mutual information (MI) is a measure of how well an explanatory variable predicts a target variable. It is used for variable ranking and for variable subset selection. This study examines the Sample-spacing approach, which estimates mutual information from data consisting of continuous explanatory variables and a categorical target variable without estimating a joint probability density function. The results of a Monte Carlo simulation and experiments with real-world data show that m = 1 is preferable when using the Sample-spacing approach.

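For illustration, below is a minimal sketch in Python of the spacing idea the abstract describes: a Vasicek-type m-spacing estimate of the entropy of a continuous variable, combined per class as I(X; Y) = H(X) - sum_k P(Y = k) H(X | Y = k) for a categorical target. This is not the authors' implementation; the function names, the tie-handling guard, and the exact (bias-uncorrected) form of the estimator are assumptions made for this sketch.

```python
import numpy as np

def spacing_entropy(x, m=1):
    """Vasicek-type m-spacing entropy estimate for a 1-D continuous sample.

    H_hat = (1/(n - m)) * sum_i log(((n + 1)/m) * (x_(i+m) - x_(i))).
    This plain form is a sketch; the paper's estimator may use bias corrections.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    spacings = x[m:] - x[:-m]               # x_(i+m) - x_(i), i = 1, ..., n - m
    spacings = np.maximum(spacings, 1e-12)  # guard against zero spacings from ties
    return np.mean(np.log((n + 1) / m * spacings))

def spacing_mi(x, y, m=1):
    """MI between continuous x and categorical y via
    I(X; Y) = H(X) - sum_k P(Y = k) * H(X | Y = k),
    with each entropy term estimated by the m-spacing estimator above."""
    x, y = np.asarray(x, dtype=float), np.asarray(y)
    h_x = spacing_entropy(x, m)
    h_x_given_y = sum((np.sum(y == k) / y.size) * spacing_entropy(x[y == k], m)
                      for k in np.unique(y))
    return h_x - h_x_given_y

# Toy check: a feature whose class means differ should score higher MI than
# pure noise (values are illustrative only, not results from the paper).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 200)
informative = rng.normal(loc=y, scale=1.0)   # class-dependent mean shift
noise = rng.normal(size=y.size)
print(spacing_mi(informative, y, m=1), spacing_mi(noise, y, m=1))
```

Each class needs at least m + 1 observations for its spacings to be defined; with m = 1, as the abstract recommends, this is rarely a constraint in practice.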

