Informatics and Applications
2014, Volume 8, Issue 2, pp 130-144
UNIVERSAL TECHNOLOGY OF INFORMATION OBJECTS PROXIMITY ASSESSMENT
Abstract
The paper outlines a technology for determining the degree of similarity between information objects
represented as text or graphic images. Objects are formalized by probabilistic models. The structure of
each model is defined by an algebra on a minimal set of the object's graphic components, and the
quantitative characteristics of that structure are the probability distributions on the algebra. The amount
of information in an object is estimated by its entropy, and the similarity measure for information objects
is likewise based on entropy. The paper describes the method for estimating the proximity of text and
graphic objects and gives several examples of implementing the estimation algorithms. The developed
method is shown to be more efficient than the methods described in the literature. The technology for
forming images of information objects and comparing their semantic content is universal and can be
adapted to the meaningful characteristics of the objects being analyzed.
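The abstract does not state the exact form of the entropy-based similarity measure, so the following Python sketch is only an illustration of the general idea and not the author's method. It assumes that an object is reduced to a list of tokens (words for a text, component labels for a graphic object), builds an empirical probability distribution over them, and scores similarity as one minus the Jensen-Shannon divergence, a standard entropy-based quantity swapped in here for demonstration. The function names `distribution`, `entropy`, and `similarity` are hypothetical.

```python
import math
from collections import Counter


def distribution(tokens):
    """Empirical probability distribution over the tokens of one object."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}


def entropy(dist):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def similarity(tokens_a, tokens_b):
    """Illustrative similarity: 1 - Jensen-Shannon divergence (in bits).

    This is an assumed stand-in, not the measure defined in the paper.
    The result is close to 1 for identical distributions and close to 0
    for distributions with no tokens in common.
    """
    p = distribution(tokens_a)
    q = distribution(tokens_b)
    support = set(p) | set(q)
    # Equal-weight mixture of the two distributions.
    mix = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in support}
    jsd = entropy(mix) - 0.5 * (entropy(p) + entropy(q))
    return 1.0 - jsd


if __name__ == "__main__":
    text_a = "the cat sat".split()
    text_b = "the dog sat".split()
    print(round(similarity(text_a, text_b), 3))
```

With base-2 logarithms the Jensen-Shannon divergence lies between 0 and 1, which keeps the illustrative score in the same range; a different tokenization or a different entropy-based distance could be substituted without changing the overall structure.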
About this article
Cover Date
2014-03-31
DOI
10.14357/19922264140213
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
information object; text; image; probabilistic model; semantic similarity; entropy; measure of similarity
Authors
L.A. Kuznetsov
Author Affiliations
Russian Presidential Academy of National Economy and Public Administration (Lipetsk Branch), 3 Internatsional’naya
Str., Lipetskaya oblast, Lipetsk 398050, Russian Federation