Informatics and Applications
2020, Volume 14, Issue 4, pp 69-76
EXTRACTION OF CONFIDENTIALITY MARKERS FROM TEXTS UNDER CONDITIONS OF HIGH UNCERTAINTY IN SYSTEMS WITH DATA INTENSIVE USAGE
- V. I. Budzko
- V. V. Yadrintsev
- I. V. Sochenkov
- V. I. Korolev
- V. G. Belenkov
Abstract
This article addresses the formation of confidentiality markers for use in data-intensive systems when the composition and structure of the protected information cannot be determined in advance, either because data are lacking or change too rapidly, or when determining them is impractical because of the large number of entities whose information is subject to protection. An approach is proposed for forming confidentiality markers for text materials under these conditions. The article presents a semantic text analysis method that forms confidentiality markers for ensuring information security in data-intensive systems under high uncertainty about the composition and structure of the protected information. The experimental results indicate that practical implementation of the approach in data-intensive systems is promising.
About this article
Cover Date
2020-12-30
DOI
10.14357/19922264200410
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
confidentiality marker; information security; data-intensive domains; topical cluster; semantics; data leak prevention; intelligent security tasks; text classification; detection of text reuse
Authors
V. I. Budzko, V. V. Yadrintsev, I. V. Sochenkov, V. I. Korolev, and V. G. Belenkov
Author Affiliations
Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation