Performance Improvement of Optical Character Recognition for Parts Book Using Pre-processing of Modified VGG Model

Shin, Hee-Ran; Lee, Sang-Hyeop; Park, Jang-Sik; Song, Jong-Kwan

doi:10.13067/JKIECS.2019.14.2.433

The Journal of the Korea institute of electronic communication sciences (한국전자통신학회논문지)

Volume 14 Issue 2 / Pages 433-438 / 2019 / 1975-8170 (pISSN)

Korea Institute of Electronic Communication Science (한국전자통신학회)


변형 VGG 모델의 전처리를 이용한 부품도면 문자 인식 성능 개선

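Shin, Hee-Ran (신희란; Department of Electronic Engineering, Kyungsung University) ;
Lee, Sang-Hyeop (이상협; Department of Electronic Engineering, Kyungsung University) ;
Park, Jang-Sik (박장식; Department of Electronic Engineering, Kyungsung University) ;
Song, Jong-Kwan (송종관; Department of Electronic Engineering, Kyungsung University)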

Received : 2019.03.07
Accepted : 2019.04.15
Published : 2019.04.30

https://doi.org/10.13067/JKIECS.2019.14.2.433


Abstract

This paper proposes a method for improving deep-learning-based recognition of numbers and characters in parts drawings through image preprocessing. The proposed character recognition system consists of an image preprocessing stage and a 7-layer deep learning model. Mathematical morphological filtering is used as preprocessing to remove the lines and shapes that cause false recognition of numbers and characters in parts drawings. Further, the deep learning model used is a 7-layer model rather than the VGG-16 model. With the proposed OCR method, the character recognition rate is 92.57% and the precision is 92.82%.

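This paper proposes input-image preprocessing and a deep learning model for recognizing numbers in machinery service parts drawings. Mathematical morphological filtering is applied as preprocessing to reduce the false detections and misrecognitions caused by leader lines and shapes when recognizing numbers in service parts drawings. For number recognition, a 7-layer VGG model, obtained by reducing and modifying the VGG-16 model, is applied to improve recognition performance. In experiments on number recognition in service parts drawings, the proposed method achieved a recognition rate of 95.57% and a precision of 92.82%, a marked improvement over the conventional method.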
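The morphological preprocessing described in the abstract can be illustrated with a minimal pure-Python sketch of erosion, dilation, and their composition (opening) on a binary image. This is not the authors' implementation: the 3x3 square structuring element, the erosion-then-dilation order, the assumption that unwanted lines are thinner than character strokes, and all function names are illustrative assumptions.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink (foreground), 0 = background


def erode(img: Image, k: int = 3) -> Image:
    """Keep a pixel only if its whole k x k neighbourhood is ink.

    Pixels outside the image are treated as background.
    """
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ))
    return out


def dilate(img: Image, k: int = 3) -> Image:
    """Set a pixel if any pixel in its k x k neighbourhood is ink."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ))
    return out


def opening(img: Image, k: int = 3) -> Image:
    """Erosion followed by dilation: removes structures thinner than k pixels."""
    return dilate(erode(img, k), k)


# Hypothetical 7 x 9 test image: a 1-pixel-thick horizontal line (row 1)
# and a 3 x 3 block standing in for a thicker character stroke (rows 3-5).
img = [[0] * 9 for _ in range(7)]
img[1] = [1] * 9
for y in range(3, 6):
    for x in range(2, 5):
        img[y][x] = 1

opened = opening(img)
for row in opened:
    print("".join("#" if v else "." for v in row))
```

In this toy image the thin line disappears under opening while the thicker block survives. Real drawings would need a structuring element chosen to match the actual line and stroke widths.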

Keywords

Fig. 1 Structure of mini-VGG model

Fig. 2 The proposed OCR structure

Fig. 3 Results of mathematical morphology: (a) erosion, (b) dilation

Fig. 4 Results of OCR and pre-processing: (a) original image, (b) result of OCR, (c) result of mathematical morphology filtering

Fig. 5 Result of the proposed OCR

Table 1. Mini VGG architecture

Table 2. Performance comparison of OCR of service parts book
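The comparison in Table 2 is reported in terms of recognition rate and precision, neither of which is defined on this page. The sketch below shows one common way such figures are computed, under the assumption that recognition rate corresponds to recall (correctly recognized characters over all ground-truth characters) and precision to correctly recognized characters over all characters the OCR system output. The function names and the counts are hypothetical and are not taken from the paper.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of recognized characters that are correct."""
    total = true_pos + false_pos
    return true_pos / total if total else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of ground-truth characters that were recognized correctly.

    Assumed here to correspond to the 'recognition rate'.
    """
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


# Hypothetical counts for illustration only (not results from the paper).
tp, fp, fn = 90, 10, 30
print(f"precision        = {precision(tp, fp):.2%}")
print(f"recognition rate = {recall(tp, fn):.2%}")
```

Under this reading, false detections caused by lines and shapes lower precision, while characters missed or misread lower the recognition rate.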

