Informatics and Applications

2018, Volume 12, Issue 3, pp 56-61

SUPERVISED LEARNING CLASSIFICATION OF DATA TAKING INTO ACCOUNT PRINCIPAL COMPONENT ANALYSIS

  • M. P. Krivenko

Abstract

The article examines supervised classification of data that takes into account the results of principal component analysis (PCA). Construction of a Bayesian classifier becomes possible once the covariances are represented through the parameters of the probabilistic PCA model. The case of singular data distributions is treated separately; for this case, it is suggested to estimate the model parameters under constraints on the eigenvalues of the covariance matrices. The quality of classification is studied with respect to the actual data dimension.
It is demonstrated that, when the dimension is correctly assigned, the classifier has the lowest error probabilities. Overestimating the best value of the dimension usually degrades the quality of classification less than underestimating it.
The mixture of probabilistic principal component analyzers allows modeling big data with a relatively small number of free parameters. This number can be controlled by choosing the latent dimension of the data.
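
As a rough illustration of the idea summarized above, the following sketch builds a Bayesian classifier in which each class covariance is expressed through probabilistic PCA parameters. It is not the article's algorithm: the closed-form maximum-likelihood fit is the standard probabilistic PCA solution, the function names and the `min_var` floor are hypothetical, and the floor is only a simple stand-in for the eigenvalue constraints mentioned for singular distributions. A single Gaussian per class is assumed rather than a mixture.

```python
import numpy as np

def fit_ppca_gaussian(X, q, min_var=1e-6):
    """Fit a Gaussian whose covariance is C = W W^T + sigma2 * I (requires q < d)."""
    d = X.shape[1]
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)        # ML sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)          # returned in ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    # Noise variance: mean of the discarded eigenvalues, floored so that
    # C stays invertible for singular data (assumption, see lead-in).
    sigma2 = max(eigvals[q:].mean(), min_var)
    lead = np.maximum(eigvals[:q], sigma2)
    W = eigvecs[:, :q] * np.sqrt(lead - sigma2)   # scale the q leading directions
    C = W @ W.T + sigma2 * np.eye(d)
    return mu, C

def log_gaussian(X, mu, C):
    """Row-wise log-density of N(mu, C)."""
    d = mu.size
    diff = X - mu
    _, logdet = np.linalg.slogdet(C)
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(C, diff.T).T)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def fit_classifier(X, y, q):
    """Estimate one PPCA-structured Gaussian and one prior per class."""
    classes = np.unique(y)
    params = [fit_ppca_gaussian(X[y == c], q) for c in classes]
    log_priors = np.log([np.mean(y == c) for c in classes])
    return classes, params, log_priors

def predict(model, X):
    """Assign each row to the class with the largest log-posterior score."""
    classes, params, log_priors = model
    scores = np.column_stack([
        log_gaussian(X, mu, C) + lp
        for (mu, C), lp in zip(params, log_priors)
    ])
    return classes[np.argmax(scores, axis=1)]
```

Here the latent dimension `q` is the quantity whose correct choice, overestimation, or underestimation the abstract discusses; refitting with several values of `q` and comparing held-out error is one way to explore that effect.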
