Informatics and Applications
2021, Volume 15, Issue 1, pp 42-49
VARIATIONAL DEEP LEARNING MODEL OPTIMIZATION WITH COMPLEXITY CONTROL
- O. S. Grebenkova
- O. Yu. Bakhteev
- V. V. Strijov
Abstract
This paper investigates the problem of deep learning model optimization. The authors propose a method
to control model complexity. Model complexity is interpreted as its minimum description length: the minimum
amount of information required to transmit both the model and the dataset. The proposed method is based
on representing a deep learning model by a hypernet, i.e., a model that generates the parameters of an optimal
model. The authors propose a form of hypernet based on Bayesian inference and introduce probabilistic
assumptions about the distribution of the parameters of the deep learning model. The paper suggests maximizing
the variational evidence lower bound on the Bayesian model evidence, treating this bound as a conditional value
that depends on the required model complexity. The method is analyzed in computational experiments on the
MNIST dataset.
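The conditional bound described in the abstract can be written, under one common set of variational assumptions (a Gaussian variational distribution q(w) over model parameters and a coefficient lambda weighting the complexity term; this notation is our reading of the abstract, not the paper's own), as

\mathcal{L}(\lambda) = \mathbb{E}_{q(\mathbf{w})}\left[\log p(\mathfrak{D} \mid \mathbf{w})\right] - \lambda \, D_{\mathrm{KL}}\left(q(\mathbf{w}) \,\|\, p(\mathbf{w})\right),

where the hypernet maps lambda to the parameters of q(w). A minimal PyTorch sketch of this construction follows; the names HyperNet and negative_elbo, the architecture, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class HyperNet(nn.Module):
    """Hypothetical hypernet: maps a complexity coefficient lambda to the
    mean and log-variance of a Gaussian variational distribution q(w)
    over the target model's parameters."""

    def __init__(self, n_target_params, hidden=64):
        super().__init__()
        self.n = n_target_params
        self.body = nn.Sequential(
            nn.Linear(1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2 * n_target_params),  # outputs [mu, log_var]
        )

    def forward(self, lam):
        stats = self.body(lam.view(1, 1)).squeeze(0)
        mu, log_var = stats[:self.n], stats[self.n:]
        # Reparameterization trick: a differentiable sample w ~ q(w)
        w = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
        return w, mu, log_var

def negative_elbo(log_likelihood, mu, log_var, lam):
    """Negative conditional ELBO: -(E_q[log p(D|w)] - lambda * KL)."""
    # Closed-form KL between N(mu, diag(exp(log_var))) and N(0, I)
    kl = 0.5 * torch.sum(torch.exp(log_var) + mu**2 - 1.0 - log_var)
    return -(log_likelihood - lam * kl)

# Sketch of one training step: sample lambda, generate parameters, then
# evaluate the target model with w (e.g., via torch.func.functional_call)
# to obtain log_likelihood on a minibatch before calling negative_elbo.
hnet = HyperNet(n_target_params=7850)  # 784*10 + 10: a linear MNIST model
optimizer = torch.optim.Adam(hnet.parameters(), lr=1e-3)
lam = torch.rand(1)  # one complexity coefficient per optimization step
w, mu, log_var = hnet(lam)
```

Sampling lam anew at each step is what makes the bound conditional: a single hypernet learns a whole family of variational distributions indexed by the required complexity, instead of one model per fixed penalty weight.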
References
- Graves, A. 2011. Practical variational inference for neural networks. Advances in neural information processing systems. Eds. J. Shawe-Taylor, R. Zemel, P. Bartlett, et al. ACM. 24:2348-2356.
- Ha, D., A. M. Dai, and Q. V. Le. 2017. HyperNetworks. 29 p. Available at: https://arxiv.org/pdf/1609.09106.pdf (accessed January 25, 2021).
- Kuznetsov, M. P., A. A. Tokmakova, and V. V. Strijov. 2016. Analytic and stochastic methods of structure parameter estimation. Informatica 27(3):607-624.
- Strijov, V. V., and O. Yu. Bakhteev. 2018. Deep learning model selection of suboptimal complexity. Automat. Rem. Contr. 79(8):1474-1488.
- Saxena, S., and J. Verbeek. 2016. Convolutional neural fabrics. Advances in neural information processing systems. Eds. D. Lee, M. Sugiyama, U. Luxburg, et al. ACM. 29:4053-4061.
- Xie, S., H. Zheng, C. Liu, and L. Lin. 2019. SNAS: Stochastic neural architecture search. 17 p. Available at: https://arxiv.org/abs/1812.09926 (accessed January 25, 2021).
- Wu, B., X. Dai, P. Zhang, Y. Wang, F. Sun, Y. Wu, Y. Tian, P. Vajda, Y. Jia, and K. Keutzer. 2019. FBNet: Hardware-aware efficient convnet design via differentiable neural architecture search. IEEE/CVF Conference on Computer Vision and Pattern Recognition Proceedings. IEEE. 1:10726-10734.
- Lorraine, J., and D. Duvenaud. 2018. Stochastic hyperparameter optimization through hypernetworks. 9 p. Available at: https://arxiv.org/pdf/1802.09419.pdf (accessed January 25, 2021).
- LeCun, Y., C. Cortes, and C. Burges. 1998. The MNIST dataset of handwritten digits. Available at: http://yann.lecun.com/exdb/mnist/ (accessed January 25, 2021).
About this article
Title
VARIATIONAL DEEP LEARNING MODEL OPTIMIZATION WITH COMPLEXITY CONTROL
Journal
Informatics and Applications
2021, Volume 15, Issue 1, pp 42-49
Cover Date
2021-03-30
DOI
10.14357/19922264210106
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
variational model optimization; hypernets; deep learning; neural networks; Bayesian inference; model complexity control
Authors
O. S. Grebenkova, O. Yu. Bakhteev, and V. V. Strijov
Author Affiliations
Moscow Institute of Physics and Technology, 9 Institutskiy Per., Dolgoprudny, Moscow Region 141700, Russian Federation
Antiplagiat Co., 42-1 Bolshoy Blvd., Moscow 121205, Russian Federation
A. A. Dorodnicyn Computing Center, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 40 Vavilov Str., Moscow 119333, Russian Federation