Systems and Means of Informatics
2019, Volume 29, Issue 3, pp 4-15
SELECTING THE DIMENSIONALITY FOR MIXTURE OF PROBABILISTIC PRINCIPAL COMPONENT ANALYZERS
Abstract
The article addresses the choice of the structural parameters of a mixture of probabilistic principal component analyzers: the number of mixture components and the latent dimension of each component. Of the approaches used in practice, only resampling methods (bootstrap and cross-validation) are actually applicable to the task of classifying data.
To choose the dimensions, a combination of known model selection methods is proposed. A mixture of probabilistic principal component analyzers can model large volumes of data with a relatively small number of free parameters, and this number can be controlled through the choice of the latent dimension of the data.
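As an illustration of how the latent dimension controls the number of free parameters, the sketch below counts parameters under the standard PPCA parameterization, in which the covariance of a d-dimensional component with q latent dimensions is W W^T + sigma^2 I and has dq + 1 - q(q-1)/2 free parameters. The function names are hypothetical and are not taken from the article; the count for the mixture assumes each component has its own mean, loading matrix, and noise variance.

```python
def ppca_covariance_params(d: int, q: int) -> int:
    """Free parameters of the PPCA covariance W W^T + sigma^2 I.

    d is the observed dimension, q the latent dimension (0 <= q < d).
    """
    if not 0 <= q < d:
        raise ValueError("latent dimension must satisfy 0 <= q < d")
    # d*q entries of W, minus q(q-1)/2 for the rotational ambiguity of W,
    # plus 1 for the isotropic noise variance sigma^2.
    return d * q - q * (q - 1) // 2 + 1


def mixture_ppca_params(d: int, latent_dims: list[int]) -> int:
    """Free parameters of a PPCA mixture with one latent dimension per component."""
    m = len(latent_dims)
    # Each component contributes a d-dimensional mean and a PPCA covariance.
    per_component = sum(d + ppca_covariance_params(d, q) for q in latent_dims)
    # m - 1 free mixing proportions (they sum to one).
    return (m - 1) + per_component


def full_gaussian_mixture_params(d: int, m: int) -> int:
    """Free parameters of a Gaussian mixture with unrestricted covariances."""
    return (m - 1) + m * (d + d * (d + 1) // 2)


if __name__ == "__main__":
    d, m = 10, 3
    for q in (1, 2, 5, 9):
        print(q, mixture_ppca_params(d, [q] * m))
    print("full", full_gaussian_mixture_params(d, m))
```

With q = d - 1 the PPCA covariance has as many free parameters as an unrestricted covariance matrix, so smaller latent dimensions are where the savings come from.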
About this article
Cover Date
2019-10-30
DOI
10.14357/08696527190301
Print ISSN
0869-6527
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
probabilistic principal component analysis (PPCA); mixtures of PPCA; model selection criterion; bootstrap; cross-validation
Authors
M. P. Krivenko
Author Affiliations
Institute of Informatics Problems, Federal Research Center "Computer Science and Control", Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation