Data Level Parallelism for H.264/AVC Decoder on a Multi-Core Processor and Performance Analysis

Performance Prediction and Analysis of Data Level Parallelization for an H.264/AVC Decoder on a Multi-Core Processor

  • Published : 2009.08.25

Abstract

Much research has addressed improving H.264/AVC performance on multi-core processors, chiefly through parallelization. Parallelization methods fall into two classes: task-level and data-level. Task-level parallelization of an H.264/AVC decoder divides the decoding algorithm into pipeline stages. However, it is not suitable for complex, large bitstreams because of poor load balancing. With load balancing and performance scalability in mind, we propose a horizontal data-level parallelization method for the H.264/AVC decoder in which threads are assigned to macroblock lines. We develop a mathematical model that predicts the performance of the proposed method. To evaluate the model, we measured performance with the JM 13.2 reference software on an ARM11 MPCore Evaluation Board. Cycle-accurate measurement in the SoCDesigner Co-verification Environment showed that the model predicted the performance and performance scalability of the proposed method with relatively high accuracy.

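As multi-core processors have come into wider use, various parallelization techniques have been proposed for implementing high-performance H.264/AVC codecs in multi-core environments. Depending on how the parallelization is applied, these techniques are classified as task-level or data-level. Pipeline parallelization, a task-level technique, divides the H.264 algorithm into pipeline stages and is generally advantageous for bitstreams with a small frame size and low complexity. However, execution times differ widely between processing modules, so load balancing is poor, and the limited number of pipeline stages restricts performance scalability. As a result, the technique is not suitable for high-resolution bitstreams such as HD video. In this paper, taking load balancing and performance scalability into account, we propose a horizontal data-level parallelization technique that assigns threads in units of macroblock lines, and we predict its performance with a mathematical performance prediction model. To verify the accuracy of the prediction, we implemented the data-level parallelization technique on the JM 13.2 reference decoder in an ARM11 MPCore environment and verified its performance. Cycle-level performance measurement with SoCDesigner showed that the change in performance of the proposed technique as the number of threads increases could be predicted with relatively high accuracy.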
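
To make the idea of assigning threads to macroblock lines concrete, the following is a minimal sketch of a toy schedule, not the performance model developed in the paper, which the abstract does not reproduce. Everything in it is an assumption made for illustration: each macroblock costs one time unit, rows are dealt to threads round-robin, and a macroblock may start only after its left neighbour and the upper-right macroblock of the previous row have finished (the wavefront constraint commonly described for H.264 decoding). The function names `decode_time` and `speedup` are hypothetical.

```python
"""Toy schedule for macroblock-line (row-wise) parallel decoding.

Assumptions for illustration only, none taken from the paper:
  * every macroblock costs exactly one time unit;
  * row r is handled by thread r % num_threads (round-robin);
  * macroblock (r, c) may start only after (r, c-1) and (r-1, c+1)
    have finished (a wavefront-style dependency).
"""


def decode_time(mb_rows, mb_cols, num_threads):
    """Return the time units needed to decode one frame under the toy rules."""
    finish = [[0] * mb_cols for _ in range(mb_rows)]
    thread_free = [0] * num_threads  # time at which each thread becomes idle

    for r in range(mb_rows):
        t = r % num_threads
        clock = thread_free[t]  # thread must finish its previous row first
        for c in range(mb_cols):
            ready = clock  # left neighbour is done, same thread ran it
            if r > 0:
                # Wait for the upper-right macroblock (clamped at the last column).
                upper_right = min(c + 1, mb_cols - 1)
                ready = max(ready, finish[r - 1][upper_right])
            finish[r][c] = ready + 1
            clock = finish[r][c]
        thread_free[t] = clock

    return max(thread_free)


def speedup(mb_rows, mb_cols, num_threads):
    """Speedup of the toy schedule relative to a single thread."""
    return decode_time(mb_rows, mb_cols, 1) / decode_time(mb_rows, mb_cols, num_threads)


if __name__ == "__main__":
    # A CIF frame (352x288 pixels) holds 22 x 18 macroblocks of 16x16 pixels.
    for threads in (1, 2, 4, 8):
        print(threads, "thread(s):", round(speedup(18, 22, threads), 2))
```

Under these assumptions the speedup flattens once the row-to-row dependency, rather than the thread count, limits progress. That is one reason a prediction model is useful before committing to a given number of cores. Real macroblocks have unequal decoding costs, so measured results will differ from this sketch.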
