Text extraction in images using simplify color and edges pattern analysis

색상 단순화와 윤곽선 패턴 분석을 통한 이미지에서의 글자추출

  • Yang, Jae-Ho (Dept. of Plasma Bio Display, Kwangwoon University) ;
  • Park, Young-Soo (Ingenium College of Liberal Arts, Kwangwoon University) ;
  • Lee, Sang-Hun (Ingenium College of Liberal Arts, Kwangwoon University)
  • 양재호 (광운대학교 대학원 플라즈마 바이오 디스플레이학과) ;
  • 박영수 (광운대학교 인제니움학부대학) ;
  • 이상훈 (광운대학교 인제니움학부대학)
  • Received : 2017.06.30
  • Accepted : 2017.08.20
  • Published : 2017.08.28

Abstract

In this paper, we propose a text extraction method that combines color simplification with pattern analysis of contours to detect text in images effectively. Edge-based text extraction algorithms perform well on images with simple backgrounds but poorly on images with complex backgrounds. To minimize non-text regions in complex backgrounds, the proposed method first simplifies the colors of the image using K-means clustering as a preprocessing step. Because color simplification blurs object boundaries, a high-pass filter is then applied to enhance them. Next, the edges of objects are detected using the difference between morphological dilation and erosion. The contour information of each detected region (height, width, and area) is measured, and pattern analysis applies conditions to this information to identify candidate text regions and remove unnecessary non-text regions (pictures, background). Finally, labeling is used to show the result in which the unnecessary regions have been removed and the characters in the candidate text regions are extracted.
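The steps described in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the function names, the darkest-to-brightest K-means initialization, the edge threshold of 50, and the size and aspect-ratio limits in `is_text_candidate` are all assumptions chosen for a toy image, since the abstract does not state the actual conditions. The high-pass boundary enhancement step is left out for brevity.

```python
def kmeans_quantize(img, k=2, iters=10):
    """Color simplification: map each RGB pixel to the nearest of k centers."""
    pixels = [p for row in img for p in row]
    uniq = sorted(set(pixels), key=sum)
    # Assumption: deterministic init, spread from darkest to brightest color
    # (needs k >= 2 and at least k distinct colors).
    centers = [uniq[i * (len(uniq) - 1) // (k - 1)] for i in range(k)]

    def nearest(p):
        return min(range(k),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in pixels:
            groups[nearest(p)].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [tuple(sum(ch) / len(g) for ch in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return [[centers[nearest(p)] for p in row] for row in img]


def morph_gradient(gray):
    """Edge detection: dilation minus erosion over a 3x3 window (clamped at borders)."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            win = [gray[j][i]
                   for j in range(max(0, y - 1), min(h, y + 2))
                   for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = max(win) - min(win)
    return out


def label_regions(mask):
    """Label 8-connected regions; return (top, left, height, width, area) for each."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            stack, ys, xs = [(y, x)], [], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                ys.append(cy)
                xs.append(cx)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            regions.append((min(ys), min(xs), max(ys) - min(ys) + 1,
                            max(xs) - min(xs) + 1, len(ys)))
    return regions


def is_text_candidate(region, min_side=3, max_side=12, max_aspect=3.0):
    """Hypothetical pattern conditions on the height and width of a region."""
    _, _, height, width, _ = region
    short, long_ = min(height, width), max(height, width)
    return short >= min_side and long_ <= max_side and long_ / short <= max_aspect


# Toy image: light, slightly varying background with two dark blobs.
H, W = 20, 30
img = [[(200 + (x + y) % 3, 200, 200) for x in range(W)] for y in range(H)]
for y in range(3, 8):
    for x in range(3, 7):
        img[y][x] = (20, 20, 20)   # small glyph-like blob
for y in range(11, 19):
    for x in range(10, 28):
        img[y][x] = (30, 30, 30)   # large picture-like blob

simplified = kmeans_quantize(img, k=2)
gray = [[sum(p) / 3 for p in row] for row in simplified]
edges = morph_gradient(gray)
mask = [[v > 50 for v in row] for row in edges]   # assumed edge threshold
regions = label_regions(mask)
candidates = [r for r in regions if is_text_candidate(r)]
print(candidates)
```

On this toy image the small blob's contour passes the assumed size conditions while the large blob's contour is rejected, which mirrors how the method discards picture and background regions. Real images would need thresholds tuned to the expected character size.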
