Informatics and Applications
2020, Volume 14, Issue 2, pp 58-65
ORDERING THE SET OF NEURAL NETWORK PARAMETERS
- A. V. Grabovoy
- O. Yu. Bakhteev
- V. V. Strijov
Abstract
This paper investigates a method for imposing an order on the set of model parameters. Both linear models and neural networks are considered. The set is ordered using the covariance matrix of the parameter gradients. It is proposed to use this order to freeze model parameters during the optimization procedure: after a few iterations of the optimization algorithm, most of the model parameters can be frozen without significant loss of model quality, which reduces the dimensionality of the optimization problem. The method is analyzed in a computational experiment on real data, where the proposed order is compared with a random order on the set of model parameters.
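The abstract describes the procedure only at a high level, so below is a minimal sketch of one plausible reading, assuming a PyTorch model. The ranking statistic used here (per-parameter gradient variance, i.e., the diagonal of the gradient covariance matrix), the number of warm-up batches, and the keep_fraction parameter are illustrative assumptions, not the authors' exact formulation.

```python
import torch

def gradient_variance(model, loss_fn, loader, n_batches=10):
    """Estimate the per-parameter gradient variance (the diagonal of the
    gradient covariance matrix) over a few warm-up mini-batches."""
    total, total_sq, count = 0.0, 0.0, 0
    for x, y in loader:
        if count == n_batches:
            break
        model.zero_grad()
        loss_fn(model(x), y).backward()
        # Flatten all parameter gradients into one vector.
        g = torch.cat([p.grad.detach().flatten() for p in model.parameters()])
        total, total_sq, count = total + g, total_sq + g ** 2, count + 1
    mean = total / count
    return total_sq / count - mean ** 2

def freeze_mask(model, loss_fn, loader, keep_fraction=0.1):
    """Order parameter coordinates by gradient variance and keep only the
    top fraction active; all other coordinates are frozen (mask entry 0)."""
    var = gradient_variance(model, loss_fn, loader)
    k = max(1, int(keep_fraction * var.numel()))
    mask = torch.zeros_like(var)
    mask[torch.topk(var, k).indices] = 1.0
    return mask
```

In use, the returned mask would be multiplied into the flattened gradient (or split back into per-parameter slices) before each optimizer step, so that frozen coordinates keep their current values while the remaining low-dimensional subset continues to be optimized.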
About this article
Cover Date
2020-06-30
DOI
10.14357/19922264200208
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
sample approximation; linear model; neural network; model selection; error function
Authors
A. V. Grabovoy, O. Yu. Bakhteev, and V. V. Strijov
Author Affiliations
Moscow Institute of Physics and Technology, 9 Institutskiy Per., Dolgoprudny, Moscow Region 141700, Russian Federation
A. A. Dorodnicyn Computing Center, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 40 Vavilov Str., Moscow 119333, Russian Federation