Systems and Means of Informatics
2016, Volume 26, Issue 2, pp 4-22
SYSTEMS AND MEANS OF DEEP LEARNING FOR CLASSIFICATION PROBLEMS
- O. Yu. Bakhteev
- M. S. Popova
- V. V. Strijov
Abstract
The paper provides guidance on constructing and optimizing deep learning networks using a graphics processing unit (GPU). It proposes to use GPU instances on the Amazon Web Services cloud platform. The problem of time series classification is considered. The paper proposes a deep learning network, i.e., a multilevel superposition of models belonging to the following classes: restricted Boltzmann machines, autoencoders, and neural networks with a softmax function at the output. The proposed method was tested on a dataset containing time segments from a mobile phone accelerometer. The relation between the classification error, the size of the dataset, and the number of superposition parameters is analyzed.
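To make the model class concrete, below is a minimal NumPy sketch of a two-level superposition of the kind the abstract describes: an autoencoder level pretrained on a reconstruction objective, followed by a softmax output level fit on the encoded features. The synthetic data, layer sizes, and learning rates are illustrative assumptions, not the paper's configuration; the authors' experiments use Theano-based code on GPU instances (see the references below).

```python
# Minimal sketch (illustrative assumptions only): a two-level superposition with
# an autoencoder pretrained by reconstruction and a softmax output layer on top.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Synthetic stand-in for accelerometer time segments: n segments of length d, k classes.
n, d, h, k = 512, 20, 10, 3
X = rng.standard_normal((n, d))
y = (X @ rng.standard_normal((d, k))).argmax(axis=1)  # labels from a hidden linear rule

# Level 1: autoencoder pretraining with tied weights (decoder = W.T), squared error.
W = 0.1 * rng.standard_normal((d, h))
b, c = np.zeros(h), np.zeros(d)
lr = 0.1
for _ in range(200):
    H = sigmoid(X @ W + b)                 # encoder: hidden representation
    R = (H @ W.T + c) - X                  # reconstruction residual
    G_H = (R @ W) * H * (1.0 - H)          # gradient backpropagated into the encoder
    W -= lr * (X.T @ G_H + R.T @ H) / n    # tied-weight gradient (encoder + decoder paths)
    b -= lr * G_H.mean(axis=0)
    c -= lr * R.mean(axis=0)

# Level 2: softmax output trained by cross-entropy on the frozen level-1 features.
H = sigmoid(X @ W + b)
V = 0.1 * rng.standard_normal((h, k))
a = np.zeros(k)
Y = np.eye(k)[y]                           # one-hot labels
for _ in range(500):
    P = softmax(H @ V + a)
    G = P - Y                              # gradient of cross-entropy w.r.t. logits
    V -= lr * H.T @ G / n
    a -= lr * G.mean(axis=0)

print("train accuracy:", (softmax(H @ V + a).argmax(axis=1) == y).mean())
```

The sketch keeps only the structural idea of level-wise pretraining followed by a supervised output level; in the paper, the pretrained levels may also be restricted Boltzmann machines, and the whole superposition is optimized jointly.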
References (36)
- Cho, K. 2014. Foundations and advances in deep learning. DSc. Espoo: Aalto University. 277 p.
- Längkvist, M., L. Karlsson, and A. Loutfi. 2014. A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recogn. Lett. 42(1):11-24.
- Desell, T., S. Clachar, J. Higgins, and B. Wild. 2015. Evolving deep recurrent neural networks using ant colony optimization. Evolutionary computation in combinatorial optimization. Eds. G. Ochoa, and F. Chicano. Lecture notes in computer science ser. Springer. 9026:86-98.
- Popova, M. S., and V. V. Strijov. 2015. Postroeniye neironnykh setey glubokogo obucheniya dlya klassifikatsii vremennykh ryadov [Building superposition of deep learning neural networks for solving the problem of time series classification]. Sistemy i Sredstva Informatiki - Systems and Means of Informatics 25(3):60-77.
- Wager, S., S. Wang, and P. Liang. 2013. Dropout training as adaptive regularization. Adv. Neur. In. 26:351-359.
- Wan, L., M. Zeiler, S. Zhang, Y. LeCun, and R. Fergus. 2013. Regularization of neural networks using dropconnect. 30th Conference (International) on Machine Learning Proceedings. 1058-1066.
- Srivastava, N., G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. 2014. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15:1929-1958.
- Gal, Y., and Z. Ghahramani. 2015. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. arXiv preprint arXiv:1506.02142. Available at: http://arxiv.org/abs/1506.02142 (accessed November 25, 2015).
- Goodfellow, I.J., Q.V. Le, A.M. Saxe, H. Lee, and A. Y. Ng. 2009. Measuring invariances in deep networks. Adv. Neur. In. 22:646-654.
- Szegedy, C., W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. 2014. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199. Available at: http://arxiv.org/abs/1312.6199 (accessed November 25, 2015).
- Raiko, T., H. Valpola, and Y. LeCun. 2012. Deep learning made easier by linear transformations in perceptrons. J. Mach. Learn. Res. 22:924-932.
- Bengio, Y., E. Laufer, G. Alain, and J. Yosinski. 2014. Deep generative stochastic networks trainable by backprop. 31st Conference (International) on Machine Learning Proceedings. 226-234.
- Ioffe, S., and C. Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167. Available at: http://arxiv.org/abs/1502.03167 (accessed November 25, 2015).
- Li, Z., C. Chang, F. Liang, T. S. Huang, C. Cao, and J.R. Smith. 2013. Learning locally-adaptive decision functions for person verification. IEEE Conference on Computer Vision and Pattern Recognition Proceedings. 3610-3617.
- Sutskever, I., G. Hinton, and G. Taylor. 2009. The recurrent temporal restricted Boltzmann machine. Adv. Neur. In. 21:1601-1608.
- Fischer, A., and C. Igel. 2014. Training restricted Boltzmann machines: An introduction. Pattern Recogn. 47:25-39.
- Socher, R., E.H. Huang, J. Pennington, A.Y. Ng, and C. D. Manning. 2011. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. Adv. Neur. In. 24:801-809.
- Shu, M., and A. Fyshe. 2013. Sparse autoencoders for word decoding from magnetoencephalography. 3rd NIPS Workshop on Machine Learning and Interpretation in NeuroImaging Proceedings. Available at: http://www.cs.cmu.edu/~afyshe/papers/SparseAE.pdf (accessed November 25, 2015).
- Vincent, P., H. Larochelle, I. Lajoie, Y. Bengio, and P. Manzagol. 2010. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11:3371-3408.
- Kwapisz, J.R., G. M. Weiss, and S. Moore. 2010. Activity recognition using cell phone accelerometers. SIGKDD Explorations 12(2):74-82.
- Bergstra, J., O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio. 2010. Theano: A CPU and GPU math expression compiler. Conference on Python for Scientific Computing Proceedings. 3-11.
- Bastien, F., P. Lamblin, R. Pascanu, J. Bergstra, I. Goodfellow, A. Bergeron, N. Bouchard, D. Warde-Farley, and Y. Bengio. 2012. Theano: New features and speed improvements. Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop. Available at: http://arxiv.org/pdf/1211.5590v1.pdf (accessed November 25, 2015).
- Goodfellow, I. J., D. Warde-Farley, P. Lamblin, V. Dumoulin, M. Mirza, R. Pascanu, J. Bergstra, F. Bastien, and Y. Bengio. 2013. Pylearn2: A machine learning research library. arXiv preprint arXiv:1308.4214. Available at: http://arxiv.org/abs/1308.4214 (accessed November 25, 2015).
- Dieleman, S., J. Schluter, C. Raffel, et al. 2015. Lasagne: First release. August 13, 2015. Available at: http://dx.doi.org/10.5281/zenodo.27878 (accessed November 25, 2015).
- Nickolls, J., I. Buck, M. Garland, and K. Skadron. 2008. Scalable parallel programming with CUDA. ACM Queue 6(2):40-53.
- Stone, J. E., D. Gohara, and G. Shi. 2010. OpenCL: A parallel programming standard for heterogeneous computing systems. J. IEEE Design Test 12(10):66-73.
- Erhan, D., Y. Bengio, A. Courville, P. Manzagol, and P. Vincent. 2010. Why does unsupervised pre-training help deep learning? J. Mach. Learn. Res. 11:625-660.
- Duan, K., S. S. Keerthi, W. Chu, S. K. Shevade, and A.N. Poo. 2003. Multicategory classification by soft-max combination of binary classifiers. 4th Workshop (International) on Multiple Classifier Systems. 125-134.
- Cho, K., T. Raiko, and A. Ilin. 2013. Gaussian-Bernoulli deep Boltzmann machine. 2013 Joint Conference (International) on Neural Networks. 1-7.
- Hinton, G. E., S. Osindero, and Y. Teh. 2006. A fast learning algorithm for deep belief nets. Neural Comput. 18:1527-1554.
- AWS Management Console. Available at: https://us-west-2.console.aws.amazon.com/console/ (accessed November 25, 2015).
- Stack overflow question: Do you get charged for a 'stopped' instance on EC2? Available at: http://stackoverflow.com/questions/2549035/do-you-get-charged-for-a-stopped-instance-on-ec2 (accessed November 25, 2015).
- Bakhteev, O. Yu. 2016. Deep learning software. Available at: https://svn.code.sf.net/p/mlalgorithms/code/Group074/Bakhteev2015TheanoCuda/code/ (accessed November 25, 2015).
- Popova, M. S. 2015. Deep learning software. Available at: https://svn.code.sf.net/p/mlalgorithms/code/Group174/Popova2015DeepLearning/ (accessed November 25, 2015).
- Bishop, C. M. 2006. Pattern recognition and machine learning. New York, NY: Springer-Verlag. 738 p.
- Cassioli, A., D. D. Lorenzo, M. Locatelli, F. Schoen, and M. Sciandrone. 2012. Machine learning for global optimization. Comput. Optim. Appl. 51(1):279-303.
About this article
Title
SYSTEMS AND MEANS OF DEEP LEARNING FOR CLASSIFICATION PROBLEMS
Journal
Systems and Means of Informatics
Volume 26, Issue 2, pp 4-22
Cover Date
2016-05-30
DOI
10.14357/08696527160201
Print ISSN
0869-6527
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
time series classification; deep learning; model superposition; autoencoder; restricted Boltzmann machine; cloud service
Authors
O. Yu. Bakhteev, M. S. Popova, and V. V. Strijov
Author Affiliations
Moscow Institute of Physics and Technology, 9 Institutskiy Per., Dolgoprudny, Moscow Region 141700, Russian Federation
A. A. Dorodnicyn Computing Center, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 40 Vavilov Str., Moscow 119333, Russian Federation