Informatics and Applications
2018, Volume 12, Issue 1, pp 89-94
EXPLORATORY PATENT SEARCH
- I. Sochenkov
- D. Zubarev
- I. Tikhomirov
Abstract
The paper presents an effective method for retrieving topically similar documents and proposes an exploratory patent search built on it. The method reduces the complexity and duration of patent examination by providing computer assistance for patent search and analysis. Each document is described by single lexemes together with phrases extracted by a parser. This approach prevents exponential growth of the feature space and allows effective indexing even of large text collections. Experiments show that the proposed method significantly outperforms the baseline keyword-based approach. The paper also draws conclusions about the prospects of applying the method to other problems, such as source retrieval for plagiarism detection and full-text clustering.
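The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of indexing documents by lexemes plus phrases. It rests on several assumptions that are not taken from the paper: adjacent word pairs stand in for the parser-extracted phrases, the score is a plain TF-IDF-weighted overlap, and the function names `descriptors`, `build_index`, and `similar` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def descriptors(text):
    """Return single tokens plus adjacent-token pairs.

    Assumption: word pairs are a crude stand-in for the syntactic
    phrases that a real parser would extract.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    phrases = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + phrases


def build_index(docs):
    """Build an inverted index: descriptor -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(descriptors(text)).items():
            index[term][doc_id] = tf
    return index


def similar(query_text, index, n_docs, top_k=3):
    """Rank indexed documents by TF-IDF-weighted descriptor overlap."""
    scores = defaultdict(float)
    for term, query_tf in Counter(descriptors(query_text)).items():
        postings = index.get(term)
        if not postings:
            continue  # descriptor never seen in the collection
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += query_tf * tf * idf * idf
    # Highest score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_k]


# Illustrative toy collection (invented for this sketch).
docs = {
    "a": "rotor blade for wind turbine",
    "b": "wind turbine rotor blade assembly",
    "c": "method for brewing coffee",
}
index = build_index(docs)
ranked = similar("wind turbine blade", index, n_docs=len(docs))
```

Only phrases that actually occur in the text are indexed, so in this sketch the number of descriptors grows with the length of the collection rather than with the number of possible word combinations.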
About this article
Cover Date
2018-03-30
DOI
10.14357/19922264180111
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
exploratory search; patent search; topic modeling; topically similar document retrieval; search and analytical engines
Authors
I. Sochenkov, D. Zubarev, and I. Tikhomirov
Author Affiliations
Institute for Systems Analysis, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Skolkovo Institute of Science and Technology, 3 Nobelya Str., Moscow 121205, Russian Federation
Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation