Automatic Dataset Generation of Object Detection and Instance Segmentation using Mask R-CNN

Jo, HyunJun; Kim, Dawit; Song, Jae-Bok

doi:10.7746/jkros.2019.14.1.031

The Journal of Korea Robotics Society (로봇학회논문지)

Volume 14, Issue 1, Pages 31-39, 2019
pISSN 1975-6291, eISSN 2287-3961

Korea Robotics Society (한국로봇학회)

Mask R-CNN을 이용한 물체인식 및 개체분할의 학습 데이터셋 자동 생성

Jo, HyunJun (Korea University) ;
Kim, Dawit (Korea University) ;
Song, Jae-Bok (Mechanical Engineering, Korea University)

Received: 2018.12.10
Accepted: 2019.01.15
Published: 2019.02.28

Abstract

Robots commonly rely on object detection and instance segmentation algorithms based on artificial neural networks (ANNs) to recognize objects, but creating datasets for these algorithms is costly because every image must be labeled manually. To lower the labeling cost, this paper proposes a scheme that automatically generates training images and labels them for specific objects. The scheme uses an instance segmentation algorithm trained to output the masks of unknown objects, so that object masks can be obtained in a simple environment. These masks are used to extract the RGB images of the objects, and a human supervisor then labels the class of each object. The extracted object images are synthesized with various background images to create new images, which are labeled automatically using the masks and the previously entered object classes. Human intervention is further reduced by using a robot arm to collect the object images. Experiments show that instance segmentation trained on the generated dataset performs on par with instance segmentation trained on a real dataset, and that the time required to generate the dataset is significantly reduced.
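
The synthesis and automatic labeling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `composite`, the placement arguments `top` and `left`, and the layout of the label dictionary are all hypothetical, and the sketch assumes a non-empty boolean object mask that fits entirely inside the background image.

```python
import numpy as np


def composite(background, obj_rgb, obj_mask, top, left, class_id):
    """Paste a masked object onto a background and derive its labels.

    Hypothetical helper: assumes obj_mask is a non-empty boolean array of
    shape (h, w), obj_rgb has shape (h, w, 3), and the object fits inside
    the background at the offset (top, left).
    """
    h, w = obj_mask.shape
    out = background.copy()

    # Basic slicing returns a view, so writing into `region` updates `out`.
    region = out[top:top + h, left:left + w]
    # Copy only the object pixels; background pixels stay untouched.
    region[obj_mask] = obj_rgb[obj_mask]

    # The instance mask of the new image is the object mask at its offset.
    full_mask = np.zeros(background.shape[:2], dtype=bool)
    full_mask[top:top + h, left:left + w] = obj_mask

    # The bounding box follows directly from the mask: (x0, y0, x1, y1),
    # with exclusive upper bounds.
    ys, xs = np.nonzero(full_mask)
    bbox = (int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1)

    label = {"class_id": class_id, "bbox": bbox, "mask": full_mask}
    return out, label


# Toy example: a plus-shaped white object on a black 8x8 background.
bg = np.zeros((8, 8, 3), dtype=np.uint8)
obj = np.full((3, 3, 3), 255, dtype=np.uint8)
mask = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=bool)
image, label = composite(bg, obj, mask, top=2, left=4, class_id=1)
```

Because the class is entered once per object and the mask travels with the object image, every synthesized image receives its class, bounding box, and instance mask without further manual labeling, which is the cost saving the abstract refers to.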
