The focus of this thesis is the common principal component (CPC) model, the generalization of principal components to several populations. The model applies to a group of multidimensional datasets whose inner products share the same eigenvectors and are therefore simultaneously diagonalized by a common decorrelator matrix. Common principal component analysis is applied in essentially the same areas and analyses as its one-population counterpart. The generalization to multiple populations comes at the cost of being more mathematically involved, and many problems in the area remain to be solved.
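As a brief illustration of simultaneous diagonalization, the CPC hypothesis is commonly written as below. The notation is an assumption introduced here, not taken from the text: $\Sigma_i$ denotes the covariance matrix of population $i$, $\Gamma$ the common orthogonal matrix of eigenvectors, and $\Lambda_i$ the diagonal matrix of population-specific eigenvalues.

```latex
\Sigma_i = \Gamma \Lambda_i \Gamma^{\top}, \qquad
\Gamma^{\top}\Gamma = I_p, \qquad
\Lambda_i = \operatorname{diag}(\lambda_{i1}, \dots, \lambda_{ip}), \qquad
i = 1, \dots, k .
```

Under this formulation, $\Gamma^{\top}\Sigma_i\Gamma = \Lambda_i$ is diagonal for every $i$, so a single matrix decorrelates all $k$ populations while the eigenvalues remain free to differ between them.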
This thesis consists of three individual papers and an introductory chapter. In the first paper, the performance of two different estimation methods for the CPC model is compared on two real-world datasets and in a Monte Carlo simulation study. The second paper shows that the orthogonal group and the Haar measure on this group play an important role in PCA, in both single- and multi-population principal component analysis. The last paper considers using common principal component analysis as a tool for imposing restrictions on system-wise regression models. When the exogenous variables of a multi-dimensional model share common principal components, the marginal models in the system are identical up to their eigenvalues. They hence form a class of regression models situated between classical seemingly unrelated regressions, where each set of explanatory variables is unique, and multivariate regression, where all marginal models share the same set of regressors.
Common principal components (CPCs) are often estimated by maximum likelihood using the Flury–Gautschi (FG) algorithm. Krzanowski proposed a simpler estimation method based on a principal component analysis of a weighted sum of the sample covariance matrices. The two methods are compared on real-world datasets and in a Monte Carlo simulation. The real-world data are used to compare the selection of a common eigenvector model and the estimated coefficients. The simulation study investigates how the accuracy of the methods is affected by autocorrelation, the number of covariance matrices, the dimension, and the sample size, for multivariate normal and chi-square distributed data. The findings in this article support the use of Krzanowski’s method in situations where the CPC assumption is appropriate.
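The idea behind the simpler estimator described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function names `krzanowski_cpc` and `max_offdiag` are hypothetical, and the choice of weights (equal by default here) is an assumption, since the text only says that a weighted sum is used.

```python
import numpy as np


def krzanowski_cpc(covs, weights=None):
    """Estimate common eigenvectors from a weighted sum of covariance matrices.

    covs    : sequence of (p, p) symmetric sample covariance matrices
    weights : optional sequence of non-negative weights (assumption: equal
              weights when omitted; degrees of freedom are another option)
    Returns the (p, p) matrix of estimated common eigenvectors (columns) and
    a list with the implied eigenvalue estimates for each population.
    """
    covs = [np.asarray(S, dtype=float) for S in covs]
    if weights is None:
        weights = np.ones(len(covs))

    # Weighted sum of the sample covariance matrices.
    pooled = sum(w * S for w, S in zip(weights, covs))

    # Ordinary PCA of the pooled matrix; eigh returns ascending eigenvalues,
    # so reorder the columns to descending order.
    eigvals, eigvecs = np.linalg.eigh(pooled)
    order = np.argsort(eigvals)[::-1]
    B = eigvecs[:, order]

    # Population-specific eigenvalue estimates: diagonal of B' S_i B.
    lambdas = [np.diag(B.T @ S @ B) for S in covs]
    return B, lambdas


def max_offdiag(B, S):
    """Largest absolute off-diagonal entry of B' S B (0 if B diagonalizes S)."""
    M = B.T @ S @ B
    return np.max(np.abs(M - np.diag(np.diag(M))))


if __name__ == "__main__":
    # Two synthetic covariance matrices that share eigenvectors exactly.
    rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    S1 = Q @ np.diag([5.0, 3.0, 1.0]) @ Q.T
    S2 = Q @ np.diag([10.0, 4.0, 2.0]) @ Q.T

    B, lambdas = krzanowski_cpc([S1, S2])
    print("max off-diagonal, population 1:", max_offdiag(B, S1))
    print("max off-diagonal, population 2:", max_offdiag(B, S2))
```

When the CPC assumption holds exactly, as in the synthetic example, the eigenvectors of the pooled matrix diagonalize every individual matrix, provided the pooled eigenvalues are distinct. With sample covariance matrices the off-diagonal entries will generally be nonzero, and their size gives a rough indication of how well the common eigenvector model fits.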