Informatics and Applications
2019, Volume 13, Issue 2, pp 62-70
ESTIMATION OF THE RELEVANCE OF THE NEURAL NETWORK PARAMETERS
- A. V. Grabovoy
- O. Yu. Bakhteev
- V. V. Strijov
Abstract
The paper investigates a method for optimizing the structure of a neural network. It is assumed that the number of neural network parameters can be reduced without significant loss of quality and without significant increase in the variance of the loss function. The paper proposes a method for automatically estimating the relevance of parameters in order to prune a neural network. The method analyzes the covariance matrix of the posterior distribution of the model parameters and removes the least relevant and multicorrelated parameters. It uses the Belsley method to detect multicollinearity among the parameters of the neural network. The proposed method was tested on the Boston Housing data set, the Wine data set, and synthetic data.
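The two ingredients of the abstract — a per-parameter relevance score derived from the posterior distribution, and the Belsley collinearity diagnostic — can be illustrated with a minimal sketch. This is not the authors' exact algorithm: it assumes a diagonal Gaussian posterior over parameters (so relevance reduces to |posterior mean| / posterior std), and the function names and threshold are illustrative only.

```python
import numpy as np

def relevance_scores(post_mean, post_var):
    """Relevance of each parameter under a diagonal Gaussian posterior.

    A parameter whose posterior mean is small relative to its posterior
    standard deviation carries little signal and is a pruning candidate.
    """
    return np.abs(post_mean) / np.sqrt(post_var)

def belsley_diagnostics(X):
    """Belsley condition indices and variance-decomposition proportions.

    X is a matrix whose columns are the quantities checked for
    multicollinearity (here, e.g., per-parameter statistics). Columns are
    scaled to unit length, then the SVD yields condition indices
    (max singular value / each singular value) and, for each column, the
    share of its variance associated with each singular value.
    """
    Xs = X / np.linalg.norm(X, axis=0)          # column equilibration
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    cond_idx = s.max() / s                      # large => near-dependency
    phi = (Vt.T ** 2) / s ** 2                  # p x p matrix
    props = phi / phi.sum(axis=1, keepdims=True)
    return cond_idx, props

# Columns 0 and 3 are nearly identical => one large condition index.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X = np.column_stack([X, X[:, 0] + 1e-3 * rng.normal(size=100)])
cond_idx, props = belsley_diagnostics(X)
collinear = cond_idx.max() > 30                 # common rule-of-thumb cutoff
```

In the Belsley diagnostic, a condition index above roughly 30 combined with two or more columns having large variance proportions on that index signals a near-linear dependency; in the pruning setting, such a group of correlated parameters is redundant and some of its members can be removed.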
References
- Sutskever, I., O. Vinyals, and Q. Le. 2014. Sequence to sequence learning with neural networks. Adv. Neur. Inf. 2:3104-3112.
- Maclaurin, D., D. Duvenaud, and R. Adams. 2015. Gradient-based hyperparameter optimization through reversible learning. 32nd Conference (International) on Machine Learning Proceedings. Lille. 37:2113-2122.
- Luketina, J., M. Berglund, T. Raiko, and K. Greff. 2016. Scalable gradient-based tuning of continuous regularization hyperparameters. 33rd Conference (International) on Machine Learning Proceedings. New York, NY. 48:2952-2960.
- Molchanov, D., A. Ashukha, and D. Vetrov. 2017. Variational dropout sparsifies deep neural networks. 34th Conference (International) on Machine Learning Proceedings. Sydney. 70:2498-2507.
- Neal, R. M. 1995. Bayesian learning for neural networks. Toronto, ON: University of Toronto. Ph.D. Thesis. 195 p.
- LeCun, Y., J. Denker, and S. Solla. 1989. Optimal brain damage. Adv. Neur. Inf. 2:598-605.
- Graves, A. 2011. Practical variational inference for neural networks. Adv. Neur. Inf. 24:2348-2356.
- Louizos, C., K. Ullrich, and M. Welling. 2017. Bayesian compression for deep learning. Adv. Neur. Inf. 30:3288-3298.
- Neychev, R., A. Katrutsa, and V. Strijov. 2016. Robust selection of multicollinear features in forecasting. Factory Laboratory 82(3):68-74.
- Harrison, D., and D. Rubinfeld. 1978. Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 5:81-102. Available at: https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html (accessed June 4, 2019).
- Aeberhard, S. 1991. Wine Data Set. Available at: http://archive.ics.uci.edu/ml/datasets/Wine (accessed June 4, 2019).
- Bishop, C. 2006. Pattern recognition and machine learning. Berlin: Springer. 758 p.
About this article
Title
ESTIMATION OF THE RELEVANCE OF THE NEURAL NETWORK PARAMETERS
Journal
Informatics and Applications
2019, Volume 13, Issue 2, pp 62-70
Cover Date
2019-06-30
DOI
10.14357/19922264190209
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
neural network; hyperparameter optimization; Belsley method; relevance of parameters; neural network pruning
Authors
A. V. Grabovoy, O. Yu. Bakhteev, and V. V. Strijov
Author Affiliations
Moscow Institute of Physics and Technology, 9 Institutskiy Per., Dolgoprudny, Moscow Region 141700, Russian Federation
A. A. Dorodnicyn Computing Center, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 40 Vavilov Str., Moscow 119333, Russian Federation