Systems and Means of Informatics
2018, Volume 28, Issue 2, pp 145-153
INTELLECTUAL ANALYSIS OF DATA ON THE BASIS OF STANFORD CoreNLP FOR POS TAGGING OF TEXTS IN THE RUSSIAN LANGUAGE
- O. V. Andreeva
- M. B. Bagirov
- A. A. Dankina
- T. O. Fedorova
- M. M. Sheveleva
Abstract
The basic principles of Stanford CoreNLP and the implementations of this library for various natural languages are discussed. Several ways for Stanford CoreNLP to interact with texts in Russian have been developed. A model that determines the parts of speech in Russian texts has been created, and the quality of its performance on Russian technical literature has been improved. Tests that demonstrate the effectiveness of the implemented changes are presented.
About this article
Cover Date
2018-05-30
DOI
10.14357/08696527180211
Print ISSN
0869-6527
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
data processing; intellectual data analysis; Stanford CoreNLP; natural language analysis; POS tagger; definition of parts of speech; morphological analysis of texts in the Russian language
Authors
O. V. Andreeva, M. B. Bagirov, A. A. Dankina, T. O. Fedorova, and M. M. Sheveleva
Author Affiliations
R. E. Alekseev Nizhny Novgorod State Technical University; 24-1 Minin Str., Nizhny Novgorod 603000, Russian Federation