New Approach to Optimize the Size of Convolution Mask in Convolutional Neural Networks

  • Kwak, Young-Tae (Dept. of IT and Engineering, Chonbuk National University)
  • Received : 2015.11.03
  • Accepted : 2015.12.30
  • Published : 2016.01.30

Abstract

A convolutional neural network (CNN) consists of several pairs of convolution and subsampling layers, so it has more hidden layers than a multi-layer perceptron. As the number of layers grows, the size of the convolution mask ultimately determines the total number of weights in the CNN, because each mask is shared across the input images. The mask size is also an important learning factor that can make or break CNN training. This paper therefore proposes a method for choosing the convolution mask size and the number of layers so that a CNN learns successfully. In face recognition experiments with a large number of training examples, we found that the best convolution mask sizes are 5 by 5 and 7 by 7, regardless of the number of layers. In addition, the CNN with two pairs of convolution and subsampling layers performed best, much as a multi-layer perceptron with two hidden layers does.
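
To illustrate how the mask size drives the weight count, the following is a minimal sketch, not the paper's implementation. It assumes square masks, full connectivity between feature maps, one bias per output map, and it ignores subsampling and output-layer parameters. The function names and the feature-map counts `[1, 4, 8]` are hypothetical and chosen only for illustration.

```python
def conv_weight_count(mask_size, in_maps, out_maps, bias=True):
    """Trainable parameters in one convolution layer.

    Assumes every output map is connected to every input map and that
    each connection uses its own shared mask_size x mask_size mask.
    """
    weights = mask_size * mask_size * in_maps * out_maps
    return weights + (out_maps if bias else 0)


def cnn_conv_weights(mask_size, maps_per_layer):
    """Total convolution parameters for a stack of convolution layers.

    maps_per_layer lists the number of maps at each stage, starting with
    the input channels, e.g. [1, 4, 8] for two convolution layers.
    """
    total = 0
    for in_maps, out_maps in zip(maps_per_layer, maps_per_layer[1:]):
        total += conv_weight_count(mask_size, in_maps, out_maps)
    return total


if __name__ == "__main__":
    maps = [1, 4, 8]  # hypothetical: grayscale input, then 4 and 8 feature maps
    for k in (3, 5, 7, 9):
        print(f"{k}x{k} mask: {cnn_conv_weights(k, maps)} convolution parameters")
```

Under these assumptions the parameter count grows with the square of the mask side, independent of the input image size, which is why the mask size dominates the total number of weights.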
