Informatics and Applications
2016, Volume 10, Issue 1, pp 2-22
DATA ACCESS CHALLENGES FOR DATA INTENSIVE RESEARCH IN RUSSIA
- L. A. Kalinichenko
- A. A. Volnova
- E. P. Gordov
- N. N. Kiselyova
- D. A. Kovaleva
- O. Yu. Malkov
- I. G. Okladnikov
- N. L. Podkolodnyy
- A. S. Pozanenko
- N. V. Ponomareva
- S. A. Stupnikov
- A. Z. Fazliev
Abstract
This survey analyzes global trends in the development of massive data collections and their supporting infrastructures in order to evaluate the opportunities for shared use of such collections in research, decision making, and problem solving in various data intensive domains (DIDs) in Russia. The representative set of DIDs selected for the survey includes astronomy, genomics and proteomics, neuroscience (human brain investigation), materials science, and Earth sciences. For each DID, the survey briefly reviews the strategic initiatives (or large projects) in the USA and Europe that aim to create big data collections and the respective infrastructures planned up to 2025. It also briefly reviews the information technology projects that develop infrastructures supporting access to and analysis of such data collections. The set of large data collections included in the survey and expected to be created soon is intended to serve as a reference point for designing and developing research infrastructures for data management and analysis that are compatible with foreign open research infrastructures. In particular, the data collections considered in the survey, the goals of their creation, and the research planned on their basis make it possible to proceed to the design and implementation of advanced components of research infrastructures, such as facilities for conceptualizing the application domains investigated in data intensive research, the respective metamodels, and components for data reuse and for reproducing programs and workflows.
About this article
Title
DATA ACCESS CHALLENGES FOR DATA INTENSIVE RESEARCH IN RUSSIA
Journal
Informatics and Applications
2016, Volume 10, Issue 1, pp 2-22
Cover Date
2016-01-30
DOI
10.14357/19922264160101
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
fourth paradigm; data intensive domains; research infrastructures; data collections; big data
Authors
L. A. Kalinichenko,
A. A. Volnova,
E. P. Gordov,
N. N. Kiselyova,
D. A. Kovaleva,
O. Yu. Malkov,
I. G. Okladnikov,
N. L. Podkolodnyy,
A. S. Pozanenko,
N. V. Ponomareva,
S. A. Stupnikov, and A. Z. Fazliev
Author Affiliations
Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation
Space Research Institute of the Russian Academy of Sciences, 84/32 Profsoyuznaya Str., Moscow 117997, Russian Federation
Siberian Center for Environmental Research and Training, Institute of Monitoring of Climatic and Ecological Systems of the Siberian Branch of the Russian Academy of Sciences, 10/3 Akademicheski Av., Tomsk 634055, Russian Federation
A. A. Baikov Institute of Metallurgy and Materials Science of the Russian Academy of Sciences, 49 Leninsky Av., GSP-1, Moscow 119991, Russian Federation
Institute of Astronomy of the Russian Academy of Sciences, 48 Pyatnitskaya Str., Moscow 119017, Russian Federation
Center for Bioinformatics, Federal Research Center Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, 10 Acad. Lavrentyeva Av., Novosibirsk 630090, Russian Federation
Research Center of Neurology, 80 Volokolamskoe Shosse, Moscow 125367, Russian Federation
Integrated Information Systems Center, Institute of Atmospheric Optics of the Siberian Branch of the Russian Academy of Sciences, 1 Acad. Zuev Sq., Tomsk 634055, Russian Federation