Informatics and Applications
2014, Volume 8, Issue 1, pp 89-98
INTEGRATED MODELING OF LANGUAGE STRUCTURES
FOR LINGUISTIC PROCESSORS OF KNOWLEDGE MANAGEMENT
AND MACHINE TRANSLATION SYSTEMS
Abstract
The paper addresses the integrated modeling of cognitive linguistic representations of language structures
and of mechanisms for resolving syntactic ambiguity in linguistic processors for intelligent knowledge
processing and machine translation systems. A technique has been developed and specified for constructing
a hybrid cognitive linguistic model that represents language structures and resolves their ambiguity
on the basis of logical linguistic rules and vector spaces. The method is new and builds on the current
state of the art. The following tasks have been carried out: classification methods were compared
with respect to linguistic problems; an effective method was worked out for mapping the vector of natural
language structures into an expanded attribute space for classifying new language objects and structures;
a focal sample of parallel business and scientific texts in Russian, English, and French was developed;
an expanded system of new categories was formed that enhances the representational power of the initial
grammar variant; extended semantic networks were employed for a unified representation of the matched
language structures; experiments were performed applying the vector space method to resolve the syntactic
ambiguity of key language structures; and a grammatical formalism and an algorithmic representation of
the parser were designed that take into account translation difficulties, including language transformations.
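As a rough illustration of how logical linguistic rules and a vector space can be combined to resolve syntactic ambiguity, the sketch below decides whether a prepositional phrase attaches to the verb or to the noun. It is not the parser or the model described in the paper: the function names `cosine` and `attach_pp`, the single rule for the preposition "of", and the toy vectors are all assumptions invented for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors; real systems would derive them from a corpus.
VECTORS = {
    "saw": [0.9, 0.1, 0.8],
    "man": [0.2, 0.9, 0.1],
    "telescope": [0.8, 0.2, 0.9],
}

def attach_pp(verb, noun, preposition, pp_object, vectors=VECTORS):
    """Return 'verb' or 'noun' as the attachment site of a prepositional phrase."""
    # Rule step (assumed heuristic): "of" phrases attach to the noun.
    if preposition == "of":
        return "noun"
    # Vector step: attach to the head whose vector is closer to the PP object.
    pp_vector = vectors[pp_object]
    verb_score = cosine(vectors[verb], pp_vector)
    noun_score = cosine(vectors[noun], pp_vector)
    return "verb" if verb_score >= noun_score else "noun"
```

For the ambiguous phrase "saw the man with the telescope", a call such as `attach_pp("saw", "man", "with", "telescope")` first checks the rule and, since it does not apply, falls back to the similarity comparison.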
About this article
Cover Date
2014-03-31
DOI
10.14357/19922264140109
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
relational database; context; lossless join; composite table
Authors
S. V. Zykin
Author Affiliations
Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences, 4 Acad. Koptyug Av.,
Novosibirsk 630090, Russian Federation