Compression and Performance Evaluation of CNN Models on Embedded Board

  • Moon, Hyeon-Cheol (School of Electronics and Information Engineering, Korea Aerospace University) ;
  • Lee, Ho-Young (School of Electronics and Information Engineering, Korea Aerospace University) ;
  • Kim, Jae-Gon (School of Electronics and Information Engineering, Korea Aerospace University)
  • Received : 2020.01.15
  • Accepted : 2020.03.04
  • Published : 2020.03.30

Abstract

Recently, deep neural networks such as CNNs have shown excellent performance in various fields such as image classification, object recognition, and visual quality enhancement. However, as the model size and computational complexity of deep learning models for most applications increase, it becomes hard to apply neural networks in IoT and mobile environments. Therefore, neural network compression algorithms that reduce model size while preserving performance have been studied. In this paper, we apply a few compression methods to CNN models and evaluate their performance in an embedded environment. To evaluate performance, the classification performance and inference time of the original and compressed CNN models are measured on images captured by a camera, on an embedded board equipped with the QCS605, a customized AI chip. The CNN models MobileNetV2, ResNet50, and VGG-16 are compressed by applying pruning and matrix decomposition. The experimental results show that, at a classification performance loss of less than 2% compared to the original models, the compressed models achieve not only a model size reduction of 1.3 to 11.2 times but also an inference time reduction of 1.2 to 2.21 times and a memory reduction of 1.2 to 3.8 times on the embedded board.

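CNN-based neural networks show excellent performance in various fields such as image classification, object recognition, and image quality enhancement. However, as the complexity and computational cost of deep learning models grow in many applications, applying them to IoT devices and mobile environments is limited. Accordingly, neural network compression techniques that reduce model size while maintaining the performance of existing deep learning models are being studied. In this paper, we compress original CNN models with neural network compression techniques and verify the performance of the compressed models in an embedded system environment. For this verification, we compare and analyze the classification performance and inference time of the original and compressed CNN models on images captured by a camera, on an embedded board with the QCS605, a customized chip with AI support. We apply pruning and matrix decomposition to the image classification CNN models MobileNetV2, ResNet50, and VGG-16, and the experimental results confirm that, at a loss of less than 2% relative to the classification performance of the original models, the compressed models reduce model size by 1.3 to 11.2 times and reduce inference time and memory consumption on the board by 1.2 to 2.1 times and 1.2 to 3.8 times, respectively.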
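The abstract names pruning and matrix decomposition as the compression methods but does not describe their implementation. The following is only a minimal sketch of the two general ideas, not the authors' pipeline: `magnitude_prune` and `low_rank_param_ratio` are hypothetical helper names, the toy weight matrix is made up, and the 4096 x 4096 layer with rank 256 is an illustrative assumption rather than a configuration from the paper.

```python
def magnitude_prune(weights, sparsity):
    """Zero out roughly the smallest-magnitude fraction `sparsity` of the weights.

    Assumption: simple global magnitude pruning on a 2-D list of floats.
    Ties at the threshold are all pruned, so slightly more than the
    requested fraction may be removed.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)          # number of weights to remove
    if k == 0:
        return [row[:] for row in weights]
    threshold = flat[k - 1]                # largest magnitude that gets pruned
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def low_rank_param_ratio(rows, cols, rank):
    """Parameter-count ratio of a dense rows x cols matrix to two rank-`rank` factors.

    A dense matrix W (rows x cols) is replaced by A (rows x rank) and
    B (rank x cols), so storage drops from rows*cols to rank*(rows+cols).
    """
    dense = rows * cols
    factored = rank * (rows + cols)
    return dense / factored


# Toy 3x3 weight matrix (made-up values, for illustration only).
toy = [[0.9, -0.05, 0.3],
       [0.01, -0.7, 0.02],
       [0.4, 0.03, -0.6]]

pruned = magnitude_prune(toy, sparsity=0.5)
nonzero = sum(1 for row in pruned for w in row if w != 0.0)
print(nonzero)                                   # weights that survive pruning
print(low_rank_param_ratio(4096, 4096, 256))     # size reduction factor of the factorization
```

The parameter-count ratio is only an upper bound on the practical gain: measured model size, inference time, and memory reductions such as those reported above also depend on the storage format and on how the target runtime executes sparse or factored layers.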
