Dynamic Hand Gesture Recognition Using CNN Model and FMM Neural Networks

CNN 모델과 FMM 신경망을 이용한 동적 수신호 인식 기법

  • Kim, Ho-Joon (School of Computer Science and Electrical Engineering, Handong University)
  • 김호준 (한동대학교 전산전자공학부)
  • Received : 2010.05.10
  • Accepted : 2010.06.02
  • Published : 2010.06.30

Abstract

In this paper, we present a hybrid neural network model for dynamic hand gesture recognition. The model consists of two modules: a feature extraction module and a pattern classification module. For feature extraction, we propose a modified convolutional neural network (CNN), a pattern recognition model. For pattern classification, we introduce a weighted fuzzy min-max (WFMM) neural network. The data representation proposed in this research is a spatiotemporal template based on the motion information of the target object. To minimize the influence of spatial and temporal variation in the feature points, we extend the receptive field of the CNN model to a three-dimensional structure. We discuss the learning capability of the WFMM neural network, in which a weight is added to represent the frequency factor in the training pattern set. The model can overcome the performance degradation that may be caused by the hyperbox contraction process of conventional FMM neural networks. We discuss the validity of the proposed models using experimental results on human action recognition and on dynamic hand gesture recognition for remote control of electric home appliances.
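The idea of a spatiotemporal template read through a three-dimensional receptive field can be illustrated with a minimal sketch. The abstract does not specify how the template is built or how the modified CNN is laid out, so everything here is an assumption for illustration: the function names `motion_template` and `conv3d_valid`, the use of thresholded frame differences as the motion information, and the uniform 2x2x2 kernel.

```python
def motion_template(frames, threshold=0.1):
    """Stack thresholded frame differences into a (T-1) x H x W binary volume.

    Assumption: frame differencing stands in for the paper's motion information.
    """
    volume = []
    for prev, curr in zip(frames, frames[1:]):
        volume.append([[1.0 if abs(c - p) > threshold else 0.0
                        for p, c in zip(prev_row, curr_row)]
                       for prev_row, curr_row in zip(prev, curr)])
    return volume


def conv3d_valid(volume, kernel):
    """Valid-mode 3D correlation.

    Each output unit reads a kt x kh x kw block, i.e. a receptive field
    that spans time as well as the two spatial axes.
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                s = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            s += kernel[dt][dy][dx] * volume[t + dt][y + dy][x + dx]
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out


# Toy input: a single bright pixel moving left to right along the top row.
frames = [
    [[1, 0, 0], [0, 0, 0], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [0, 0, 0]],
]
template = motion_template(frames)

# Hypothetical averaging kernel covering 2 time steps and a 2x2 spatial patch.
kernel = [[[0.125, 0.125], [0.125, 0.125]],
          [[0.125, 0.125], [0.125, 0.125]]]
response = conv3d_valid(template, kernel)
```

Because each output unit pools over neighboring positions and neighboring time steps, a small shift of the motion in space or time changes the response only gradually, which is the kind of tolerance to spatial and temporal variation the abstract attributes to the three-dimensional receptive field.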

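This study proposes a hybrid neural network model as a methodology for effectively recognizing dynamic hand gesture patterns from video. The proposed model consists of a feature extraction module and a pattern classification module, for which we introduce a CNN model with a modified structure and a WFMM model, respectively. We also introduce a data representation with a spatiotemporal template structure based on the motion information of the target object. We first present a CNN model extended to a three-dimensional receptive field structure in order to compensate for the effects of temporal and spatial variation of feature points in hand gesture pattern data. We then introduce a weighted FMM neural network model for the pattern classification stage and describe the structure and operating characteristics of the network. We also show that the proposed model can mitigate the distortion of the learning effect that arises during the contraction of overlapping hyperboxes in conventional FMM neural networks. As an application, we examine the validity of the proposed theory through experimental results on a simplified hand gesture recognition problem that assumes remote control of home appliances.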
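As a rough illustration of the classification stage, the sketch below computes a hyperbox membership value in the style of Simpson's fuzzy min-max classifier and adds an optional per-feature weight. The abstract does not give the WFMM membership formula, so the weighting scheme, the function name `hyperbox_membership`, and the parameter names (`v` for the hyperbox min point, `w` for the max point, `gamma` for the sensitivity, `weights` for the feature weights) are assumptions for illustration only. With uniform weights the function reduces to the conventional FMM membership.

```python
def hyperbox_membership(x, v, w, gamma=4.0, weights=None):
    """Membership of pattern x in the hyperbox with min point v and max point w.

    All coordinates are expected in [0, 1]. `weights` is a hypothetical
    per-feature relevance factor; None means uniform weights (plain FMM).
    """
    n = len(x)
    if weights is None:
        weights = [1.0] * n
    total = 0.0
    for xi, vi, wi, fi in zip(x, v, w, weights):
        # Penalty for exceeding the max point along this feature.
        above = max(0.0, 1.0 - max(0.0, gamma * min(1.0, xi - wi)))
        # Penalty for falling below the min point along this feature.
        below = max(0.0, 1.0 - max(0.0, gamma * min(1.0, vi - xi)))
        total += fi * (above + below)
    return total / (2.0 * sum(weights))


v = [0.4, 0.4]  # hyperbox min point
w = [0.6, 0.6]  # hyperbox max point

inside = hyperbox_membership([0.5, 0.5], v, w)
outside_uniform = hyperbox_membership([0.7, 0.5], v, w)
outside_heavy = hyperbox_membership([0.7, 0.5], v, w, weights=[3.0, 1.0])
outside_light = hyperbox_membership([0.7, 0.5], v, w, weights=[1.0, 3.0])
```

In this sketch a pattern inside the hyperbox has full membership, and a deviation along a heavily weighted feature lowers the membership more than the same deviation along a lightly weighted one. How the actual WFMM model derives its weights from feature frequency in the training set is described in the paper body, not here.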
