Sutskever, I., O. Vinyals, and Q. V. Le. 2014. Sequence to sequence learning with neural networks. Adv. Neur. In. 27:3104-3112. Available at: https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf (accessed December 29, 2017).
Bishop, C. M. 2006. Pattern recognition and machine learning. Springer. 758 p.
Kuznetsov, M. P., A. A. Tokmakova, and V. V. Strijov. 2016. Analytic and stochastic methods of structure parameter estimation. Informatica 27(3):607-624.
Popova, M. S., and V. V. Strijov. 2015. Vybor optimal'noy modeli klassifikatsii fizicheskoy aktivnosti po izmereniyam akselerometra [Selection of optimal physical activity classification model using measurements of accelerometer]. Informatika i ee Primeneniya - Inform. Appl. 9(1):76-86.
Sanborn, A., and J. Skryzalin. 2015. Deep learning for semantic similarity. Deep learning for natural language processing. Stanford, CA: Stanford University. CS224d:1-7. Available at: https://cs224d.stanford.edu/reports/SanbornAdrian.pdf (accessed December 29, 2017).
Pennington, J., R. Socher, and C. D. Manning. 2014. GloVe: Global vectors for word representation. Conference on Empirical Methods in Natural Language Processing Proceedings. 12:1532-1543. Available at: https://nlp.stanford.edu/pubs/glove.pdf (accessed December 29, 2017).
Rong, X. 2014. Word2vec parameter learning explained. Arxiv. Available at: https://arxiv.org/abs/1411.2738 (accessed December 29, 2017).
Shi, T., and Z. Liu. 2014. Linking GloVe with word2vec. Arxiv. Available at: http://arxiv.org/abs/1411.5595 (accessed December 29, 2017).
Zolotov, V., and D. Kung. 2017. Analysis and optimization of fastText linear text classifier. Arxiv. Available at: https://arxiv.org/ftp/arxiv/papers/1702/1702.05531.pdf (accessed December 29, 2017).
Graves, A. 2011. Practical variational inference for neural networks. Adv. Neur. In. 24:2348-2356. Available at: http://papers.nips.cc/paper/4329-practical-variational-inference-for-neural-networks.pdf (accessed December 29, 2017).
Le Cun, Y., J. S. Denker, and S. A. Solla. 1989. Optimal brain damage. Adv. Neur. In. 2:598-605. Available at: https://papers.nips.cc/paper/250-optimal-brain-damage.pdf (accessed December 29, 2017).
Hassibi, B., D. G. Stork, and G. J. Wolff. 1992. Optimal brain surgeon and general network pruning. Neural Comput. 4:1-8.
Dataset of sentences with different types of similarity. Available at: http://alt.qcri.org/semeval2015/task2/index.php?id=data-and-tools (accessed December 29, 2017).
GloVe Python library. Available at: https://github.com/stanfordnlp/GloVe (accessed December 29, 2017).
Smerdov, A. N. Computational experiment code. Available at: https://sourceforge.net/p/mlalgorithms/code/HEAD/tree/Group474/Smerdov2017Paraphrase/code/ (accessed December 29, 2017).

Title

OPTIMAL RECURRENT NEURAL NETWORK MODEL IN PARAPHRASE DETECTION

Journal

Informatics and Applications
2018, Volume 12, Issue 4, pp 63-69

Cover Date

2018-12-30

DOI

10.14357/19922264180409

Print ISSN

1992-2264

Publisher

Institute of Informatics Problems, Russian Academy of Sciences

Key words

deep learning; recurrent neural network; neural network pruning; variational approach

Authors

A.N. Smerdov, O.Y. Bakhteev, and V.V. Strijov

Author Affiliations

Moscow Institute of Physics and Technology, 9 Institutskiy Per., Dolgoprudny, Moscow Region 141700, Russian Federation

A. A. Dorodnicyn Computing Center, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 40 Vavilov Str., Moscow 119333, Russian Federation
