3D Facial Landmark Tracking and Facial Expression Recognition

  • Medioni, Gerard (Computer Vision Lab, Institute for Robotics and Intelligent Systems, University of Southern California)
  • Choi, Jongmoo (Computer Vision Lab, Institute for Robotics and Intelligent Systems, University of Southern California)
  • Labeau, Matthieu (Computer Vision Lab, Institute for Robotics and Intelligent Systems, University of Southern California)
  • Leksut, Jatuporn Toy (Computer Vision Lab, Institute for Robotics and Intelligent Systems, University of Southern California)
  • Meng, Lingchao (Computer Vision Lab, Institute for Robotics and Intelligent Systems, University of Southern California)
  • Received : 2013.04.03
  • Accepted : 2013.05.10
  • Published : 2013.09.30

Abstract

In this paper, we address the challenging computer vision problem of obtaining a reliable facial expression analysis from a naturally interacting person. We propose a system that combines a 3D generic face model, 3D head tracking, and a 2D tracker to track facial landmarks and recognize expressions. First, we extract facial landmarks from a neutral frontal face, and then we deform a 3D generic face to fit the input face. Next, we use our real-time 3D head tracking module to track a person's head in 3D and predict facial landmark positions in 2D using the projection from the updated 3D face model. Finally, we use the tracked 2D landmarks to update the 3D landmarks. This integrated tracking loop enables efficient tracking of the non-rigid parts of a face in the presence of large 3D head motion. We conducted experiments on facial expression recognition using both frame-based and sequence-based approaches. Our method achieves a 75.9% recognition rate on 8 subjects with 7 key expressions. Our approach is a considerable step toward new applications in human-computer interaction, behavioral science, robotics, and games.
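
The prediction step described above, in which landmarks of the updated 3D face model are projected into the image under the tracked head pose, can be illustrated with a minimal sketch. It assumes a standard pinhole camera and a rigid head pose (a rotation and a translation); the abstract does not specify the camera model, so the function names, focal length, principal point, and landmark coordinates below are hypothetical and chosen only for illustration.

```python
import math


def rotation_y(angle_rad):
    """Return a 3x3 rotation matrix (nested lists) for a yaw about the y-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def project_landmarks(landmarks_3d, rotation, translation, focal, center):
    """Project 3D model landmarks into the image under a rigid head pose.

    Each landmark X is moved to camera coordinates as R * X + t and then
    projected with a pinhole model: u = f * x / z + cx, v = f * y / z + cy.
    """
    cx, cy = center
    projected = []
    for X in landmarks_3d:
        # Rigid transform: one output coordinate per row of the rotation matrix.
        x, y, z = (
            sum(rotation[i][j] * X[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        if z <= 0:
            raise ValueError("landmark is behind the camera")
        projected.append((focal * x / z + cx, focal * y / z + cy))
    return projected


# Hypothetical landmarks in model units (assumed values, not from the paper).
landmarks = [(0.0, 0.0, 0.0), (3.0, -3.0, -2.0), (-3.0, -3.0, -2.0)]
pose_R = rotation_y(math.radians(20))   # assumed head yaw of 20 degrees
pose_t = (0.0, 0.0, 60.0)               # assumed head position in front of the camera
predicted_2d = project_landmarks(
    landmarks, pose_R, pose_t, focal=500.0, center=(320.0, 240.0)
)
```

In a loop like the one the abstract outlines, positions such as `predicted_2d` would serve as starting points for the 2D tracker, whose refined output is then used to update the 3D landmarks.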

References

  1. J. A. Russell, "Emotion, core affect, and psychological construction," Cognition & Emotion, vol. 23, no. 7, pp. 1259-1283, 2009. https://doi.org/10.1080/02699930902809375
  2. W. K. Liao, D. Fidaleo, and G. Medioni, "Robust: real-time 3D face tracking from a monocular view," EURASIP Journal on Image and Video Processing, vol. 2010, article no. 5, 2010.
  3. J. Choi, G. Medioni, Y. Lin, L. Silva, O. Regina, M. Pamplona, and T. C. Faltemier, "3D face reconstruction using a single or multiple views," in Proceedings of the 20th International Conference on Pattern Recognition, Istanbul, Turkey, pp. 3959-3962, 2010.
  4. J. Choi, Y. Dumortier, S. I. Choi, M. B. Ahmad, and G. Medioni, "Real-time 3-D face tracking and modeling from a webcam," in Proceedings of the IEEE Workshop on Applications of Computer Vision, Breckenridge, CO, pp. 33-40, 2012.
  5. P. Viola and M. J. Jones, "Robust real-time face detection," International Journal of Computer Vision, vol. 57, no. 2, pp. 137-154, 2004.
  6. OpenCV: Open Source Computer Vision [Internet], Available: http://opencv.org/.
  7. T. F. Cootes and C. J. Taylor, "A mixture model for representing shape variation," in Proceedings of the 8th British Machine Vision Conference, Essex, UK, 1997.
  8. T. F. Cootes, G. J. Edwards, and C. J. Taylor, "Active appearance models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 681-685, 2001. https://doi.org/10.1109/34.927467
  9. S. Milborrow and F. Nicolls, "Locating facial features with an extended active shape model," in Proceedings of the 10th European Conference on Computer Vision, Marseille, France, pp. 504-513, 2008.
  10. G. Medioni, J. Choi, C. H. Kuo, and D. Fidaleo, "Identifying noncooperative subjects at a distance using face images and inferred three-dimensional face models," IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans, vol. 39, no. 1, pp. 12-24, 2009. https://doi.org/10.1109/TSMCA.2008.2007979
  11. T. F. Cootes, C. J. Taylor, D. H. Cooper, and J. Graham, "Active shape models: their training and application," Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38-59, 1995. https://doi.org/10.1006/cviu.1995.1004
  12. K. Pearson, "On lines and planes of closest fit to systems of points in space," Philosophical Magazine, vol. 2, no. 6, pp. 559-572, 1901. https://doi.org/10.1080/14786440109462720
  13. J. Xiao, S. Baker, I. Matthews, and T. Kanade, "Real-time combined 2D+3D active appearance models," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington, DC, pp. 535-542, 2004.
  14. V. Blanz and T. Vetter, "Face recognition based on fitting a 3D morphable model," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 9, pp. 1063-1074, 2003. https://doi.org/10.1109/TPAMI.2003.1227983
  15. L. Gu and T. Kanade, "3D alignment of face in a single image," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, NY, pp. 1305-1312, 2006.
  16. D. Cristinacce and T. F. Cootes, "Feature detection and tracking with constrained local models," in Proceedings of the 17th British Machine Vision Conference, Edinburgh, UK, pp. 929-938, 2006.
  17. Z. Zhu and Q. Ji, "Robust real-time face pose and facial expression recovery," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, NY, pp. 681-688, 2006.
  18. C. Vogler, Z. Li, A. Kanaujia, S. Goldenstein, and D. Metaxas, "The best of both worlds: combining 3D deformable models with active shape models," in Proceedings of the 11th IEEE International Conference on Computer Vision, Rio de Janeiro, Brazil, 2007.
  19. S. Taheri, P. Turaga, and R. Chellappa, "Towards view-invariant expression analysis using analytic shape manifolds," in Proceedings of the IEEE International Conference on Automatic Face & Gesture Recognition and Workshops, Santa Barbara, CA, pp. 306-313, 2011.
  20. W. K. Liao and G. Medioni, "3D face tracking and expression inference from a 2D sequence using manifold learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, 2008.
  21. S. T. Roweis and L. K. Saul, "Nonlinear dimensionality reduction by locally linear embedding," Science, vol. 290, no. 5500, pp. 2323-2326, 2000. https://doi.org/10.1126/science.290.5500.2323
  22. B. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision (DARPA)," in Proceedings of the DARPA Image Understanding Workshop, Washington, DC, pp. 121-130, 1981.
  23. C. S. Myers and L. R. Rabiner, "A comparative study of several dynamic time-warping algorithms for connected word recognition," Bell System Technical Journal, vol. 60, no. 7, pp. 1389-1409, 1981. https://doi.org/10.1002/j.1538-7305.1981.tb00272.x
  24. H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 26, no. 1, pp. 43-49, 1978. https://doi.org/10.1109/TASSP.1978.1163055
