A Method of Eye and Lip Region Detection using Faster R-CNN in Face Image

  • Received : 2018.03.02
  • Accepted : 2018.08.20
  • Published : 2018.08.28

Abstract

In biometric security fields such as face and iris recognition, extracting facial features such as the eyes and lips is an essential step. In this paper, we study a method of detecting eye and lip regions in face images using Faster R-CNN. Faster R-CNN is a deep-learning-based object detection method that is well known to outperform conventional feature-based methods. In the proposed method, feature maps are extracted from the face image by applying convolution, rectified linear unit (ReLU) activation, and max pooling in sequence. A region proposal network (RPN) is trained on the feature map to generate region proposals, and the eye and lip detectors are then trained using the region proposals together with the feature map. To evaluate the performance of the proposed method, we experimented with 800 face images of Korean men and women, using 480 images for training and 320 for testing. Computer simulations showed that, after 50 training epochs, the average precision of eye and lip region detection was 97.7% and 91.0%, respectively.

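The sketch below illustrates the kind of pipeline the abstract describes, using PyTorch/torchvision's off-the-shelf Faster R-CNN rather than the authors' own implementation. The paper's backbone is a simple convolution/ReLU/max-pooling network, whereas this sketch substitutes a pretrained ResNet-50 FPN backbone for convenience; the class setup (eye, lip), optimizer settings, and data-loader variables are illustrative assumptions, not values from the paper.

# A minimal sketch (assumed, not the authors' code) of eye/lip detection
# with Faster R-CNN in PyTorch/torchvision.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 3  # background + eye + lip

# Backbone CNN produces the shared feature map; the RPN proposes candidate
# regions; the ROI head classifies each proposal and refines its box.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

def train_one_epoch(model, loader):
    # One pass over the training images (e.g. a 480-image training set);
    # the returned loss dictionary combines RPN and detection-head terms.
    model.train()
    for images, targets in loader:  # targets: dicts with 'boxes' and 'labels'
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        losses = model(images, targets)
        loss = sum(losses.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Inference on a test image: returns boxes, labels, and confidence scores
# for the detected eye and lip regions.
# model.eval()
# with torch.no_grad():
#     detections = model([test_image_tensor.to(device)])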
