Informatics and Applications
2018, Volume 12, Issue 4, pp 63-69
OPTIMAL RECURRENT NEURAL NETWORK MODEL IN PARAPHRASE DETECTION
- A.N. Smerdov
- O.Y. Bakhteev
- V.V. Strijov
Abstract
This paper addresses the problem of selecting an optimal recurrent neural network. The evidence lower bound of the neural network is taken as the optimality criterion for selection. Variational inference methods are investigated as a means of approximating the posterior distribution of the network parameters. As a particular case, the parameters are assumed to follow a normal distribution, and several types of covariance matrix are investigated. The authors propose pruning the parameters whose probability density at zero is highest in order to increase the marginal likelihood of the model. As an illustrative example, a computational experiment on multiclass classification was carried out on the SemEval 2015 dataset.
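The pruning idea in the abstract can be illustrated with a minimal sketch. It assumes the simplest covariance case, a diagonal matrix, so each parameter has its own approximate posterior N(mu_i, sigma_i^2), and it assumes that a fixed fraction of parameters is removed. The function names, the example values, and the fixed-fraction rule are hypothetical and are not taken from the paper.

```python
import math

def density_at_zero(mu, sigma):
    """Value at w = 0 of the normal density N(w | mu, sigma^2)."""
    return math.exp(-0.5 * (mu / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def prune_by_density_at_zero(mus, sigmas, fraction):
    """Return a keep-mask (True = keep) that drops the given fraction of
    parameters whose density at zero is highest.

    Assumption: independent (diagonal-covariance) normal posteriors.
    """
    n_prune = int(len(mus) * fraction)
    scores = [density_at_zero(m, s) for m, s in zip(mus, sigmas)]
    # Indices sorted from highest to lowest density at zero.
    order = sorted(range(len(mus)), key=lambda i: scores[i], reverse=True)
    pruned = set(order[:n_prune])
    return [i not in pruned for i in range(len(mus))]

# Hypothetical posterior means and standard deviations for four weights.
mus = [0.01, 1.5, -0.02, -2.0]
sigmas = [0.1, 0.1, 0.1, 0.5]
mask = prune_by_density_at_zero(mus, sigmas, fraction=0.5)
print(mask)  # [False, True, False, True]
```

In this toy example the two weights whose means lie close to zero relative to their standard deviations are pruned, while the weights whose posteriors place almost no mass near zero are kept.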
About this article
Cover Date
2018-12-30
DOI
10.14357/19922264180409
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
deep learning; recurrent neural network; neural network pruning; variational approach
Authors
A.N. Smerdov, O.Y. Bakhteev, and V.V. Strijov
Author Affiliations
Moscow Institute of Physics and Technology, 9 Institutskiy Per., Dolgoprudny, Moscow Region 141700, Russian Federation
A. A. Dorodnicyn Computing Center, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 40 Vavilov Str., Moscow 119333, Russian Federation