Selection of Important Variables in Principal Component Analysis Using Measures of Multivariate Association
Abstract
As part of the exploratory analysis of multivariate data, Principal Component Analysis (PCA) is generally directed towards inspection and dimensionality reduction of the data such that most of the sample variation is preserved. One of the major aims of PCA is therefore to identify subsets of variables that contain the main features of the entire data set and may reveal interesting patterns or relationships. New selection methods based on Canonical Correlation Analysis and Euclidean distances are proposed. While the criterion (M2) based on Procrustes Analysis, found in the literature, identifies structure-bearing variables, particularly for grouped data, the proposed criteria retain those variables that preserve the maximum sample variation and carry whatever unknown multivariate structure may be present in the complete data. These methods are evaluated and compared on real data as well as on Monte Carlo simulated data.
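For readers unfamiliar with the Procrustes criterion mentioned above, the following is a minimal sketch of how an M2-type comparison between the full data and a variable subset is commonly formulated. It is not the paper's implementation. The abstract does not define the criterion, so the formula used here (the rotation-only Procrustes sum of squares between the first k principal component scores of the full data and those of the subset) is an assumption taken from the general literature. The function names `pc_scores`, `procrustes_m2` and `best_subset`, the use of mean-centred but unscaled data, and the exhaustive search are all illustrative choices. The proposed Canonical Correlation and Euclidean distance criteria are not sketched, because the abstract gives no details of them.

```python
from itertools import combinations

import numpy as np


def pc_scores(X, k):
    """Scores of the rows of X on its first k principal components.

    Assumption: variables are mean-centred but not rescaled.
    """
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]


def procrustes_m2(Y, Z):
    """Procrustes sum of squares between two n x k configurations.

    Computes trace(YY') + trace(ZZ') - 2 * sum(singular values of Z'Y),
    i.e. the residual after the best orthogonal rotation of Z onto Y.
    """
    sigma = np.linalg.svd(Z.T @ Y, compute_uv=False)
    return float((Y ** 2).sum() + (Z ** 2).sum() - 2.0 * sigma.sum())


def best_subset(X, q, k):
    """Exhaustively find the q variables whose k-dimensional PC
    configuration is closest (smallest M2) to that of the full data.

    Hypothetical helper for illustration; requires k <= q and is only
    practical for a small number of variables.
    """
    Y = pc_scores(X, k)
    best_cols, best_m2 = None, np.inf
    for cols in combinations(range(X.shape[1]), q):
        m2 = procrustes_m2(Y, pc_scores(X[:, list(cols)], k))
        if m2 < best_m2:
            best_cols, best_m2 = cols, m2
    return best_cols, best_m2
```

Calling `best_subset(X, q, k)` on an n x p data matrix `X` returns the indices of the selected variables together with their M2 value; smaller values indicate that the subset reproduces the k-dimensional principal component configuration of the complete data more closely.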
UNISWA Research Journal of Agriculture, Science and Technology Vol 3 (2) 2000: pp 22-31