Automatic Tagging for Social Images using Convolution Neural Networks

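  • 장현웅 (Department of Computer and Information Engineering, Korea National University of Transportation)
  • 조수선 (Department of Computer and Information Engineering, Korea National University of Transportation)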
  • Received : 2015.05.11
  • Accepted : 2015.10.28
  • Published : 2016.01.15

Abstract

As the Internet develops rapidly, a huge amount of image data collected from smartphones, digital cameras, and dashboard cameras is being shared through social media sites. These sites generally rely on tag information attached to images. As sharing multimedia has become easier and the volume of shared content has grown explosively, tagging every image has become a burden for users, and image retrieval is likely to be less accurate when tags are missing or wrong. In this paper, we propose a method that automatically extracts tags from social images using their content. A CNN (Convolutional Neural Network) is trained on the large set of labeled images provided by ImageNet and is then used to extract labels from Instagram images. The extracted labels are used for automatic image tagging. Experimental results show that retrieval using these automatically generated tags is more accurate than Instagram's existing search.
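
The step from classifier output to tags can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class names, the raw scores, and the `top_k` and `min_prob` parameters are all hypothetical, and the abstract does not say how many labels are kept per image. In a real system the scores would come from the output layer of a CNN trained on ImageNet.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def labels_to_tags(scores, class_names, top_k=3, min_prob=0.1):
    """Return up to top_k class names whose probability is at least min_prob."""
    probs = softmax(scores)
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return [name for name, prob in ranked[:top_k] if prob >= min_prob]

# Hypothetical scores for one image (assumption: four classes only,
# whereas an ImageNet-trained classifier would output many more).
class_names = ["beagle", "tabby cat", "pizza", "sports car"]
scores = [4.0, 1.0, 3.5, 0.2]

tags = labels_to_tags(scores, class_names)
print(tags)
```

Combining a rank cutoff with a probability threshold is one possible design choice: it keeps low-confidence labels from being attached to an image as tags, which matters when the tags are later used for retrieval.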

Acknowledgement

Supported by: National Research Foundation of Korea
