Decombined Distributed Parallel VQ Codebook Generation Based on MapReduce

  • 이현진 (Department of Computer and Information Communication, Korea Soongsil Cyber University)
  • Received : 2014.04.28
  • Accepted : 2014.06.16
  • Published : 2014.06.30

Abstract

In the era of big data, algorithms developed for conventional IT environments either cannot be applied as-is to distributed architectures such as Hadoop or run inefficiently on them. New algorithms built on distributed frameworks such as MapReduce are therefore needed. Lloyd's algorithm, which is widely used for vector quantization (VQ), has recently been implemented with MapReduce. In this paper, we propose a decombined distributed VQ codebook generation algorithm that modifies the existing MapReduce-based distributed VQ codebook generation algorithm to produce results faster. When applied to big data, the proposed algorithm outperformed the conventional method.
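To make the baseline concrete, the following is a minimal single-process sketch of how one iteration of Lloyd's algorithm can be phrased as a map step and a reduce step. It is an illustration only, not the paper's implementation: the function names (`map_phase`, `reduce_phase`, `lloyd_iteration`, `train_codebook`), the squared Euclidean distance, and the stopping tolerance are all assumptions, and the "decombined" modification is not shown because the abstract does not specify it.

```python
from collections import defaultdict


def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], vec)),
    )


def map_phase(split, codebook):
    """Map: emit (codeword index, (vector, 1)) for every training vector."""
    for vec in split:
        yield nearest(codebook, vec), (vec, 1)


def reduce_phase(pairs, codebook):
    """Reduce: average the vectors assigned to each codeword."""
    sums, counts = {}, defaultdict(int)
    for idx, (vec, n) in pairs:
        if idx in sums:
            sums[idx] = [s + v for s, v in zip(sums[idx], vec)]
        else:
            sums[idx] = list(vec)
        counts[idx] += n
    # Keep the old codeword when a cell receives no vectors.
    return [
        [s / counts[i] for s in sums[i]] if i in sums else list(codebook[i])
        for i in range(len(codebook))
    ]


def lloyd_iteration(splits, codebook):
    """One iteration: map over every input split, then reduce by codeword."""
    pairs = [kv for split in splits for kv in map_phase(split, codebook)]
    return reduce_phase(pairs, codebook)


def train_codebook(splits, codebook, max_iter=20, tol=1e-9):
    """Repeat iterations until no codeword moves by more than tol (hypothetical rule)."""
    for _ in range(max_iter):
        new = lloyd_iteration(splits, codebook)
        shift = max(
            sum((a - b) ** 2 for a, b in zip(old, cur))
            for old, cur in zip(codebook, new)
        )
        codebook = new
        if shift <= tol:
            break
    return codebook
```

In a real Hadoop job each split would be handled by a separate mapper and the current codebook would be redistributed to the mappers before every iteration; here the splits are plain Python lists so that the data flow is easy to follow.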
