Efficient Text Localization using MLP-based Texture Classification

Efficient Text Extraction Using Neural Network-Based Texture Analysis

  • Published: 2002.04.01

Abstract

We present a new method for localizing text in images using a multi-layer perceptron (MLP) and a multiple continuously adaptive mean shift (MultiCAMShift) algorithm. An automatically constructed MLP-based texture classifier generates a text probability image for various types of images without an explicit feature extraction step. The MultiCAMShift algorithm, which operates on the text probability image produced by the MLP, places bounding boxes efficiently without analyzing the texture properties of the entire image.
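The classifier stage can be pictured with a minimal Python sketch. It is not the authors' implementation: the 3x3 window, the four hidden units, the tanh and sigmoid activations, and the random placeholder weights are all assumptions made for illustration, and the names `mlp_score` and `text_probability_image` are hypothetical. The one point it is meant to show is that raw pixel values from a small window are fed straight into the MLP, with no hand-crafted texture features in between.

```python
import math
import random

def mlp_score(pixels, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer MLP on raw pixel values.

    Returns a sigmoid output in (0, 1), read here as a text probability.
    """
    hidden = [math.tanh(sum(w * p for w, p in zip(row, pixels)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    z = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-z))

def text_probability_image(image, params, half=1):
    """Slide a (2*half+1) x (2*half+1) window over a grayscale image.

    Each window's raw pixels go directly to the MLP; border pixels
    that cannot hold a full window are left at 0.0.
    """
    rows, cols = len(image), len(image[0])
    prob = [[0.0] * cols for _ in range(rows)]
    for y in range(half, rows - half):
        for x in range(half, cols - half):
            window = [image[y + dy][x + dx]
                      for dy in range(-half, half + 1)
                      for dx in range(-half, half + 1)]
            prob[y][x] = mlp_score(window, *params)
    return prob

# Placeholder (untrained) weights: 9 inputs -> 4 hidden units -> 1 output.
rng = random.Random(0)
n_in, n_hidden = 9, 4
params = (
    [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
    [0.0] * n_hidden,
    [rng.uniform(-1, 1) for _ in range(n_hidden)],
    0.0,
)
image = [[rng.random() for _ in range(8)] for _ in range(6)]  # values in [0, 1]
prob = text_probability_image(image, params)
```

With trained weights in place of the random ones, `prob` would play the role of the text probability image on which the localization step operates.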

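This paper proposes a texture-based method for extracting text from images using an MLP and the MultiCAMShift algorithm. The MLP-based texture analyzer effectively generates a text probability image for input images captured under diverse conditions, without a separate feature extraction step; the MultiCAMShift algorithm, run on that text probability image, extracts text regions efficiently using only local search.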
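The local search on the text probability image can be sketched as a single CAMShift-style window in Python. This is an illustration under stated assumptions, not the paper's MultiCAMShift: it tracks one window rather than many, and the size-update rule (half-size set to `scale` times the standard deviation of the probability mass under the window), the default `scale=2.0`, and the name `camshift_window` are all hypothetical choices. The sketch shows why the search is local: each iteration reads only the pixels under the current window, then moves the window to their centroid and resizes it.

```python
import math

def camshift_window(prob, cx, cy, half_w, half_h,
                    scale=2.0, max_iter=50, tol=1e-3):
    """Mean shift with size adaptation on a probability image.

    Returns (cx, cy, half_w, half_h), or None if the window
    covers no probability mass.
    """
    rows, cols = len(prob), len(prob[0])
    for _ in range(max_iter):
        # Clip the current window to the image.
        x0 = max(0, int(round(cx - half_w)))
        x1 = min(cols - 1, int(round(cx + half_w)))
        y0 = max(0, int(round(cy - half_h)))
        y1 = min(rows - 1, int(round(cy + half_h)))
        # Zeroth, first, and second moments under the window only.
        m00 = m10 = m01 = m20 = m02 = 0.0
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                p = prob[y][x]
                m00 += p
                m10 += p * x
                m01 += p * y
                m20 += p * x * x
                m02 += p * y * y
        if m00 <= 0.0:
            return None
        # Mean shift step: move the centre to the centroid.
        new_cx, new_cy = m10 / m00, m01 / m00
        # Size adaptation (assumed rule): scale the per-axis spread.
        var_x = max(0.0, m20 / m00 - new_cx ** 2)
        var_y = max(0.0, m02 / m00 - new_cy ** 2)
        new_hw = max(1.0, scale * math.sqrt(var_x))
        new_hh = max(1.0, scale * math.sqrt(var_y))
        change = max(abs(new_cx - cx), abs(new_cy - cy),
                     abs(new_hw - half_w), abs(new_hh - half_h))
        cx, cy, half_w, half_h = new_cx, new_cy, new_hw, new_hh
        if change < tol:
            break
    return cx, cy, half_w, half_h

# Synthetic probability image with one rectangular "text" region.
prob = [[0.0] * 30 for _ in range(20)]
for y in range(8, 12):
    for x in range(10, 22):
        prob[y][x] = 1.0

result = camshift_window(prob, cx=12.0, cy=9.0, half_w=3.0, half_h=2.0)
```

Starting from a small window that overlaps only part of the region, the window drifts toward the region's centre and grows until it covers it. A multi-window variant would seed several such windows across the image and discard those that return `None`.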
