Integrated Method for Text Detection in Natural Scene Images

  • Zheng, Yang (School of Automation and Electrical Engineering, University of Science and Technology Beijing)
  • Liu, Jie (Institute of Automation, Chinese Academy of Sciences)
  • Liu, Heping (School of Automation and Electrical Engineering, University of Science and Technology Beijing)
  • Li, Qing (School of Automation and Electrical Engineering, University of Science and Technology Beijing)
  • Li, Gen (Institute of Automation, Chinese Academy of Sciences)
  • Received: 2016.05.07
  • Accepted: 2016.10.12
  • Published: 2016.11.30

Abstract

In this paper, we present a novel image operator for extracting textual information from natural scene images. First, we propose a refiner called the Stroke Color Extension, which extends the widely used Stroke Width Transform by incorporating the color information of strokes and significantly improves intra-character connection and non-character removal. Second, a character classifier is trained on gradient features. The classifier eliminates non-character components while retaining a large number of characters. Third, an extractor called the Character Color Transform combines the color information of characters with geometric features to recover potential characters that were not correctly extracted in the previous steps. Fourth, a Convolutional Neural Network model verifies the text candidates, improving text detection performance. The proposed technique is tested on two public datasets, ICDAR 2011 and ICDAR 2013. The experimental results show that our approach achieves state-of-the-art performance.
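The abstract does not say which gradient features the character classifier uses. As a minimal illustration only, the sketch below computes one common gradient feature, a magnitude-weighted histogram of gradient orientations over a grayscale patch. The function name `gradient_orientation_histogram`, the 8-bin layout, and the central-difference gradient are assumptions made for this example and are not taken from the paper.

```python
import math


def gradient_orientation_histogram(patch, bins=8):
    """Return an L1-normalised, magnitude-weighted histogram of gradient
    directions for a 2-D grayscale patch given as a list of rows.

    Hypothetical helper for illustration; not the feature used in the paper.
    """
    height, width = len(patch), len(patch[0])
    hist = [0.0] * bins
    # Only interior pixels are visited so every pixel has four neighbours.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region, no direction to record
            # Map the direction from [-pi, pi] onto a bin index in [0, bins).
            angle = math.atan2(gy, gx) % (2 * math.pi)
            index = int(angle / (2 * math.pi) * bins) % bins
            hist[index] += magnitude
    total = sum(hist)
    return [value / total for value in hist] if total > 0 else hist


if __name__ == "__main__":
    # A vertical edge: dark columns on the left, bright columns on the right.
    edge_patch = [[0, 0, 255, 255, 255] for _ in range(5)]
    print(gradient_orientation_histogram(edge_patch))
```

For the vertical-edge patch every non-zero gradient points in the positive x direction, so all of the weight falls in the first bin. A vector like this could be passed to any standard classifier to separate character from non-character components, which is the role the abstract assigns to the second stage.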
