The trade-off between the PLSR and PCR methods for modeling data with collinear structure

  • W.B. Yahya
  • K.O. Olorede
  • M.K. Garba
  • A.W. Banjoko
  • K.A. Dauda
Keywords: PLS Regression, Principal Component Regression, Root Mean Square Error of Prediction, Leave-One-Out Cross-Validation (LOOCV).


This paper investigates partial least squares regression (PLSR) and principal component regression (PCR) as alternative regression techniques for situations in which ordinary least squares breaks down, with emphasis on the case where the predictor variables are strongly correlated. Data sets with Gaussian, non-orthogonal predictor variables were simulated at sample sizes ranging from 20 to 1000 to examine the performance of the two methods under varying conditions. Each data set was randomly partitioned into a training set and a test set. Both PLSR and PCR models were constructed on the training sets, and their performance was evaluated on the test sets using the mean square error of prediction and other indices. At each fit, leave-one-out cross-validation was employed to enhance the efficiency and stability of the fitted models. The simulation results show that both methods yielded good regression models, but with differing degrees of accuracy: the PLSR technique predicted consistently better on the test data than the PCR method, irrespective of sample size. In terms of model parsimony, the PLSR technique also yielded efficient regression models with relatively fewer latent components than the PCR method. Data on the performance of M.Sc. graduates from the Department of Statistics, University of Ilorin, Nigeria, during the 2012 academic session were used to validate the results from the Monte Carlo studies.
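The workflow described in the abstract (simulate correlated Gaussian predictors, split into training and test sets, choose the number of components by leave-one-out cross-validation, then compare test-set prediction error) can be illustrated with a minimal Python sketch. It is not the authors' code. The correlation structure (`rho ** |i - j|`), the coefficient vector, the noise level, the 70/30 split, and all function names are assumptions made purely for illustration, and PLSR is implemented here as the single-response NIPALS variant (PLS1) using NumPy only.

```python
import numpy as np

def simulate_collinear(n, p=5, rho=0.9, noise=1.0, seed=0):
    """Gaussian predictors with correlation rho**|i-j| (assumed design)."""
    rng = np.random.default_rng(seed)
    cov = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    beta = np.linspace(1.0, 2.0, p)          # hypothetical true coefficients
    y = X @ beta + noise * rng.standard_normal(n)
    return X, y

def fit_pcr(X, y, k):
    """PCR: regress centered y on the first k principal component scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                              # loadings of the first k PCs
    gamma = np.linalg.lstsq(Xc @ V, y - y_mean, rcond=None)[0]
    coef = V @ gamma                          # back to predictor space
    return coef, y_mean - x_mean @ coef

def fit_pls1(X, y, k):
    """PLS1 via NIPALS: components chosen to covary with the response."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(k):
        w = E.T @ f
        w = w / np.linalg.norm(w)             # weight vector
        t = E @ w                             # score vector
        tt = t @ t
        p_load = E.T @ t / tt                 # X loading
        c = f @ t / tt                        # y loading
        E = E - np.outer(t, p_load)           # deflate X
        f = f - c * t                         # deflate y
        W.append(w); P.append(p_load); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

def loocv_rmse(fit, X, y, k):
    """Leave-one-out cross-validated RMSE for a k-component model."""
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef, b0 = fit(X[keep], y[keep], k)
        errs[i] = y[i] - (X[i] @ coef + b0)
    return np.sqrt(np.mean(errs ** 2))

def select_and_test(fit, X_tr, y_tr, X_te, y_te):
    """Pick k by LOOCV on the training set, then report test-set RMSEP."""
    p = X_tr.shape[1]
    cv = [loocv_rmse(fit, X_tr, y_tr, k) for k in range(1, p + 1)]
    k_best = int(np.argmin(cv)) + 1
    coef, b0 = fit(X_tr, y_tr, k_best)
    rmsep = np.sqrt(np.mean((y_te - (X_te @ coef + b0)) ** 2))
    return k_best, rmsep

X, y = simulate_collinear(n=100)
idx = np.random.default_rng(1).permutation(len(y))
train, test = idx[:70], idx[70:]              # assumed 70/30 split
results = {
    name: select_and_test(fit, X[train], y[train], X[test], y[test])
    for name, fit in [("PCR", fit_pcr), ("PLSR", fit_pls1)]
}
for name, (k, rmsep) in results.items():
    print(f"{name}: components={k}, test RMSEP={rmsep:.3f}")
```

A single run like this says little on its own. Repeating it over many seeds and over sample sizes between 20 and 1000 would be closer in spirit to the Monte Carlo design the abstract describes. One useful sanity check is that with all `p` components retained, both fits should reduce to the ordinary least squares solution.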


Journal Identifiers

eISSN: 1116-4336