Efficient Data-Driven Rule for Obtaining an Optimal Predictive Function of a Discriminant Analysis
This paper proposes a rule for optimizing a predictive discriminant function (PDF) in discriminant analysis (DA). We carried out a sequential-stepwise analysis on the predictor variables and a percentage-N-fold cross-validation on a data set obtained from students' academic records in a university system. The hit rates, P(a), obtained for the optimized PDF, Z(OPT), calibrated on training and validation sets, were compared with those of the PDF, Z, obtained using the conventional rule. The comparison showed a significant improvement in how well the PDF classifies cases into the categories of the dependent variable. The optimized PDF, Z(OPT), also produced consistently high hit rates with little variability, thereby reducing the problem of overfitting.
Keywords: optimal predictive function, overfitting, sequential-stepwise analysis, percentage-N-fold cross-validation