Informatics and Applications
2018, Volume 12, Issue 3, pp 56-61
SUPERVISED LEARNING CLASSIFICATION OF DATA TAKING INTO ACCOUNT PRINCIPAL COMPONENT ANALYSIS
Abstract
The article examines supervised learning classification of data that takes into account the results of principal component analysis (PCA). Construction of a Bayesian classifier becomes possible once the covariances are expressed through the parameters of the probabilistic PCA model. The case of singular data distributions is treated separately; for this case, it is suggested to estimate the model parameters under constraints on the eigenvalues of the covariance matrices. The quality of classification is studied with respect to the actual data dimension.
It is demonstrated that, when the dimension is correctly assigned, the classifier has the lowest error probabilities. Overestimating the best value of the dimension usually degrades classification quality less than underestimating it.
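As a rough illustration of such a classifier, the Python sketch below fits one probabilistic PCA covariance per class and assigns each point to the class with the largest log posterior. It is a minimal sketch under stated assumptions, not the article's procedure: it uses the closed-form maximum-likelihood PPCA fit rather than an EM algorithm, a simple variance floor (`min_var`) stands in for the eigenvalue constraints mentioned in the abstract, and all names (`fit_ppca`, `PPCABayesClassifier`, `q`) are hypothetical.

```python
import numpy as np

def fit_ppca(X, q, min_var=1e-6):
    """Closed-form ML fit of a PPCA model with latent dimension q < d.

    Returns the mean and the covariance C = W W^T + sigma2 * I.
    The floor min_var is an assumption made for this sketch; it keeps C
    invertible when the sample covariance is singular.
    """
    d = X.shape[1]
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)          # ML sample covariance
    eigval, eigvec = np.linalg.eigh(S)              # ascending eigenvalues
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]  # reorder to descending
    # Noise variance: mean of the d - q discarded eigenvalues, floored.
    sigma2 = max(eigval[q:].mean(), min_var)
    # Scale each leading eigenvector by sqrt(lambda_i - sigma2).
    W = eigvec[:, :q] * np.sqrt(np.maximum(eigval[:q] - sigma2, 0.0))
    C = W @ W.T + sigma2 * np.eye(d)
    return mu, C

def log_gauss(X, mu, C):
    """Log density of N(mu, C) evaluated at every row of X."""
    d = X.shape[1]
    _, logdet = np.linalg.slogdet(C)
    diff = X - mu
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(C), diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

class PPCABayesClassifier:
    """Bayes classifier with one PPCA-structured Gaussian per class."""

    def __init__(self, q):
        self.q = q  # latent dimension, shared by all classes in this sketch

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.params_ = [fit_ppca(X[y == c], self.q) for c in self.classes_]
        self.log_priors_ = np.log([np.mean(y == c) for c in self.classes_])
        return self

    def predict(self, X):
        # Log posterior up to a constant: class log density + log prior.
        scores = np.column_stack([
            log_gauss(X, mu, C) + lp
            for (mu, C), lp in zip(self.params_, self.log_priors_)
        ])
        return self.classes_[scores.argmax(axis=1)]
```

Refitting with different values of `q` and comparing held-out error rates is one simple way to explore how under- or overestimating the dimension affects classification quality.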
A mixture of probabilistic principal component analyzers allows big data to be modeled with a relatively small number of free parameters, and that number can be controlled by choosing the latent dimension of the data.
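To make the parameter saving concrete, the following Python sketch counts free parameters using the standard count for a probabilistic PCA covariance in dimension d with latent dimension q, namely dq + 1 - q(q - 1)/2, against d(d + 1)/2 for an unconstrained covariance. The function names and the example values (d = 100, q = 5, three components) are illustrative assumptions, not figures from the article.

```python
def ppca_cov_params(d, q):
    """Free parameters of a PPCA covariance W W^T + sigma2 * I.

    d*q entries of W, plus 1 for sigma2, minus q(q-1)/2 for the
    rotational ambiguity of W.
    """
    return d * q + 1 - q * (q - 1) // 2

def full_cov_params(d):
    """Free parameters of an unconstrained symmetric d x d covariance."""
    return d * (d + 1) // 2

def mixture_params(d, q, k):
    """Free parameters of a k-component mixture of PPCA models.

    Each component has a mean (d) and a PPCA covariance; the mixing
    weights add k - 1 because they sum to one.
    """
    return k * (d + ppca_cov_params(d, q)) + (k - 1)

if __name__ == "__main__":
    d, q, k = 100, 5, 3  # illustrative values only
    print("PPCA covariance:", ppca_cov_params(d, q))
    print("Full covariance:", full_cov_params(d))
    print("Mixture of PPCA:", mixture_params(d, q, k))
```

With q = d - 1 the PPCA count equals the full-covariance count, so the saving comes entirely from choosing a latent dimension well below d.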
Cover Date
2018-08-30
DOI
10.14357/19922264180308
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
principal component analysis; mixtures of normal distributions; EM algorithm; supervised learning classification
Authors
M. P. Krivenko
Author Affiliations
Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation