Document Image Layout Analysis Using Image Filters and Constrained Conditions

  • Jang, Dae-Geun (장대근; Telecommunication Network Research Lab, Dept. of Electronics Electrical Computer, Graduate School of Kyungpook National University) ;
  • Hwang, Chan-Sik (황찬식; Telecommunication Network Research Lab, Dept. of Electronics Electrical Computer, Kyungpook National University)
  • Published : 2002.06.01

Abstract

Document image layout analysis comprises two processes: segmenting a document image into detailed regions, and classifying the segmented regions as text, picture, table, and so on. Region classification relies on the size of a region, the density of black pixels, and the complexity of the pixel distribution. For pictures, however, these measures range so widely that it is difficult to set a classification threshold between pictures and other region types. As a result, pictures have a higher classification error rate than other region types. In this paper, we propose a document image layout analysis method that classifies picture and text regions better than previous methods, including commercial software. In the picture and text classification, a median filter is used to reduce the influence of region size, black-pixel density, and pixel-distribution complexity. Furthermore, classification errors are corrected using a region expanding filter and constrained conditions.
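
The abstract does not spell out how the median filter is applied, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a binary region (1 = black pixel), a 3 x 3 window, and a hypothetical decision rule: thin text strokes are mostly erased by median filtering, while solid picture areas mostly survive. The function names, the white-border handling, and the 0.5 threshold are all assumptions made for illustration.

```python
from statistics import median_low


def median_filter(img, k=3):
    """Apply a k x k median filter (k odd) to a 2D list of 0/1 pixels.

    Pixels outside the image are treated as white (0); this border
    handling is an assumption, not something the abstract specifies.
    """
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                for yy in range(y - r, y + r + 1)
                for xx in range(x - r, x + r + 1)
            ]
            out[y][x] = median_low(window)
    return out


def black_ratio_after_filter(region, k=3):
    """Fraction of the original black-pixel count that remains after filtering."""
    before = sum(map(sum, region))
    if before == 0:
        return 0.0
    after = sum(map(sum, median_filter(region, k)))
    return after / before


def classify_region(region, k=3, threshold=0.5):
    """Hypothetical rule: regions whose black pixels survive are pictures."""
    ratio = black_ratio_after_filter(region, k)
    return "picture" if ratio >= threshold else "text"


# Toy regions: a one-pixel-wide vertical stroke and a solid 7 x 7 block.
stroke = [[1 if x == 3 else 0 for x in range(7)] for _ in range(7)]
block = [[1] * 7 for _ in range(7)]

print(classify_region(stroke))  # thin stroke is erased by the filter
print(classify_region(block))   # solid area mostly survives
```

The surviving-pixel ratio is used here because it is a relative measure, which is one way a filter-based feature could depend less on region size and raw black-pixel density. The region expanding filter and the constrained conditions mentioned in the abstract are not sketched, since the abstract gives no detail on them.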
