Instagram image classification with Deep Learning

딥러닝을 이용한 인스타그램 이미지 분류

  • Jeong, Nokwon (Department of Computer Science and Information Engineering, Korea National University of Transportation)
  • Cho, Soosun (Department of Computer Science and Information Engineering, Korea National University of Transportation)
  • Received : 2017.06.05
  • Accepted : 2017.07.27
  • Published : 2017.10.31

Abstract

In this paper we present the results of two experiments on classifying Instagram images, along with the lessons learned from them. The experiments evaluate how well Convolutional Neural Networks (CNNs) classify real social network images such as those on Instagram. We used AlexNet and ResNet, which achieved the best results in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012 and 2015, respectively. The test set consists of 240 Instagram images in 12 pre-defined categories. We also fine-tuned the Inception V3 model and compared the results. For the four cases of AlexNet, ResNet, Inception V3, and fine-tuned Inception V3, the Top-1 error rates were 49.58%, 40.42%, 30.42%, and 5.00%, and the Top-5 error rates were 35.42%, 25.00%, 20.83%, and 0.00%, respectively.

This paper reports experiments conducted to determine how effectively the convolutional neural networks of deep learning classify images from a real social network, and presents the results and the lessons learned from them. For this purpose, we used the AlexNet and ResNet models, which won the 2012 and 2015 editions of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), respectively. The test set used for evaluation consists of images collected from Instagram: 240 images in total across 12 categories. In addition, we fine-tuned the Inception V3 model and compared the results. For the four models (AlexNet, ResNet, Inception V3, and fine-tuned Inception V3), the Top-1 error rates were 49.58%, 40.42%, 30.42%, and 5.00%, and the Top-5 error rates were 35.42%, 25.00%, 20.83%, and 0.00%, respectively.
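
The Top-1 and Top-5 error rates quoted in the abstract can be illustrated with a minimal sketch. It assumes each model outputs one score per category for every image. The function name `top_k_error` and the toy scores are hypothetical and are not taken from the paper. The last lines only check the arithmetic: with a 240-image test set, a Top-1 error of 49.58% is consistent with 119 misclassified images, and 5.00% with 12.

```python
def top_k_error(scores, labels, k):
    """Return the top-k error rate in percent.

    scores: list of per-image score lists, one score per category.
    labels: list of true category indices, one per image.
    An image counts as a miss when its true category is not among
    the k highest-scoring categories.
    """
    misses = 0
    for image_scores, label in zip(scores, labels):
        # Rank category indices by descending score.
        ranked = sorted(range(len(image_scores)),
                        key=lambda c: image_scores[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return 100.0 * misses / len(labels)


# Toy data (hypothetical): 4 images, 3 categories.
scores = [
    [0.7, 0.2, 0.1],  # true category 0: hit at k=1
    [0.3, 0.5, 0.2],  # true category 0: miss at k=1, hit at k=2
    [0.1, 0.3, 0.6],  # true category 0: miss at k=1 and k=2
    [0.2, 0.1, 0.7],  # true category 2: hit at k=1
]
labels = [0, 0, 0, 2]

top1 = top_k_error(scores, labels, 1)
top2 = top_k_error(scores, labels, 2)
print(top1, top2)

# Arithmetic check against the abstract's 240-image test set.
alexnet_top1 = round(100 * 119 / 240, 2)    # 119 misses out of 240
finetuned_top1 = round(100 * 12 / 240, 2)   # 12 misses out of 240
print(alexnet_top1, finetuned_top1)
```

Because the test set has only 240 images, each misclassified image shifts the error rate by about 0.42 percentage points, which is worth keeping in mind when comparing the four models.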
