Systems and Means of Informatics
2016, Volume 26, Issue 4, pp 124-137
CROSS-LINGUAL DATABASE FOR ANNOTATING LOGICAL-SEMANTIC RELATIONS IN THE TEXT
- A. A. Durnovo
- I. M. Zatsman
- E. Yu. Loshchilova
Abstract
This paper describes the problem of designing a cross-lingual database for annotating logical-semantic relations between fragments of parallel texts in two or more languages. One objective of the design is to give linguists the information and computing support they need to construct a classification scheme of logical-semantic relations that does not depend on the language of the text. The crux of the design problem is that linguists annotate logical-semantic relations using a list of rubrics that is itself formed during annotation, by means of the same cross-lingual database. By its functions, the database can be classified as a supracorpora database. Its pilot version allowed linguists to produce thousands of annotations and, at the same time, to form the list of rubrics used in the annotating process.
About this article
Cover Date
2016-11-30
DOI
10.14357/08696527160411
Print ISSN
0869-6527
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
cross-lingual databases; annotating; parallel texts; corpus linguistics; logical-semantic relations
Authors
A. A. Durnovo, I. M. Zatsman, and E. Yu. Loshchilova
Author Affiliations
Institute of Informatics Problems, Federal Research Center "Computer Science and Control", Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation