Comparison of Weight Initialization Techniques for Deep Neural Networks

  • Kang, Min-Jae (Department of Electronic Engineering, Jeju National University)
  • Kim, Ho-Chan (Department of Electronic Engineering, Jeju National University)
  • Received : 2019.10.08
  • Accepted : 2019.11.08
  • Published : 2019.12.31

Abstract

Neural networks have been reborn as deep learning thanks to big data, improved processors, and modifications to training methods. Earlier neural networks often initialized their weights naively and used poorly suited non-linear activation functions. Weight initialization is a significant factor in both the final quality of a network and its convergence rate. This paper discusses different approaches to weight initialization. Experiments on the MNIST dataset compare these approaches to identify the technique that achieves higher accuracy in a relatively short training time.
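
The abstract does not name the initialization techniques it compares. As an illustration only, the following minimal sketch assumes two widely used schemes, Glorot (Xavier) normal and He normal initialization, and contrasts them with a fixed unit-variance Gaussian. It is not the paper's MNIST experiment: it pushes random inputs through a stack of layers and reports how the activation scale behaves. The function names (`init_weights`, `forward_std`, `relu`), the scheme labels, and the layer sizes are all hypothetical.

```python
import math
import random


def relu(v):
    """Rectified linear unit."""
    return max(0.0, v)


def init_weights(fan_in, fan_out, scheme, rng):
    """Return a fan_in x fan_out matrix drawn from a zero-mean Gaussian."""
    if scheme == "naive":
        std = 1.0                                  # fixed scale, ignores layer width
    elif scheme == "glorot":
        std = math.sqrt(2.0 / (fan_in + fan_out))  # Glorot (Xavier) normal
    elif scheme == "he":
        std = math.sqrt(2.0 / fan_in)              # He normal, intended for ReLU
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


def forward_std(scheme, activation, width=32, depth=8, batch=16, seed=0):
    """Push random inputs through `depth` layers; return the std of the final activations."""
    rng = random.Random(seed)
    x = [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(batch)]
    for _ in range(depth):
        w = init_weights(width, width, scheme, rng)
        x = [[activation(sum(row[i] * w[i][j] for i in range(width)))
              for j in range(width)] for row in x]
    flat = [v for row in x for v in row]
    mean = sum(flat) / len(flat)
    return math.sqrt(sum((v - mean) ** 2 for v in flat) / len(flat))
```

Comparing `forward_std("naive", relu)` with `forward_std("he", relu)` shows the effect the abstract alludes to: with a fixed unit-variance Gaussian the activations grow rapidly with depth, while a width-aware scale keeps them at a moderate magnitude, which is one reason initialization affects convergence.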
