Improved Principal Component Analysis and Linear Discriminant Analysis for the Determination of Origin of Coffee Beans using

  • Endale Deribe Jiru Addis Ababa University
  • Berhanu Guta Wordofa Addis Ababa University
  • Mesfin Redi-Abshiro Addis Ababa University
Keywords: Chlorogenic acid and Fatty acid; Classification; Dimensionality reduction; Linear discriminant analysis; Principal component analysis


In this work, an improved Principal Component Analysis (PCA) method is used for better determination of the geographical origins of Ethiopian green coffee beans. In commercially available and widely employed PCA methods, the dataset is commonly normalized using the Z-score procedure, which reduces the influence of the spread of the data (differences in dispersion degree) on the principal components (PCs). In the improved method, a new normalization procedure is introduced with the aim of improving the spread (dispersion) of data points around the mean. The PCs computed with the improved procedure reflect the information in the original dataset significantly better, and the improved PCA retains more of the dispersion degree information of the original dataset than the Z-score-based PCA does. The improved PCA was then used to identify the most discriminating variables for the coffee samples and, based on these, a Linear Discriminant Analysis (LDA) model was developed to classify and predict samples. The recognition and prediction abilities of the improved PCA and LDA at regional level were, respectively, 95.7% and 94% (using chlorogenic acid (CGA) content), 91% and 97% (using fatty acid (FA) content), and 99% and 100% (using the combined CGA and FA contents). For comparison, Mehari et al. (2016, 2019), who applied PCA to the same dataset, reported recognition and prediction abilities at regional level of 91% and 90% (using CGA content) and 95% and 92% (using FA content), respectively. These results indicate that the newly introduced method performs better and achieves the best discrimination of the coffee beans. The combined analysis of CGA and FA concentrations is a useful tool for determining the origin of coffee beans, and we recommend that the concerned bodies use it for the characterization, classification and authentication of Ethiopian coffee beans according to their geographical origins.
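The effect of Z-score normalization described in the abstract can be illustrated with a minimal sketch. The data below are hypothetical toy values, not measurements from the study, and the helper name `z_score_columns` is an assumption introduced only for illustration. The sketch shows only the conventional Z-score step, since the abstract does not specify the improved normalization procedure: after standardization every variable has the same unit spread, so differences in dispersion between variables no longer reach the principal components.

```python
import statistics


def z_score_columns(rows):
    """Standardize each column to zero mean and unit sample standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    stdevs = [statistics.stdev(c) for c in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stdevs)]
        for row in rows
    ]


# Hypothetical toy data (NOT from the paper): 5 samples x 2 variables.
# The first variable is widely dispersed, the second barely varies.
samples = [
    [10.0, 1.00],
    [20.0, 1.01],
    [30.0, 1.02],
    [40.0, 1.03],
    [50.0, 1.04],
]

# Spread of each variable before normalization: very different magnitudes.
raw_spread = [statistics.stdev(c) for c in zip(*samples)]

# Spread after Z-score normalization: both variables end up with unit spread,
# so the original difference in dispersion is no longer visible.
scaled = z_score_columns(samples)
scaled_spread = [statistics.stdev(c) for c in zip(*scaled)]
scaled_means = [statistics.mean(c) for c in zip(*scaled)]

print("spread before:", raw_spread)
print("spread after: ", scaled_spread)
```

A PCA run on `scaled` would therefore weight both variables equally regardless of how dispersed they were originally, which is the loss of dispersion information that the improved normalization is intended to reduce.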

Author Biographies

Endale Deribe Jiru, Addis Ababa University

Department of Mathematics

Berhanu Guta Wordofa, Addis Ababa University

Department of Mathematics

Mesfin Redi-Abshiro, Addis Ababa University

Department of Chemistry

Research articles

Journal Identifiers

eISSN: 2520-7997
print ISSN: 0379-2897