
A comparative analysis of haemoglobin variants using machine learning algorithms


A. A. Okandeji
O. F. Odeyinka
A. A. Sogbesan
N. O. Ogunye

Abstract

In the medical sciences, professionals draw on their expertise and knowledge to analyze a patient's symptoms and signs in order to ascertain the origin of an illness. These indicators are interpreted against threshold values: health specialists determine the cause of the illness by comparing a patient's measurements with the range within which measurements from a healthy population would fall. Because this process is subject to inaccuracy and imprecision, diagnostic mistakes occur. This study uses machine learning to classify haemoglobin variants. The data set comprises 752 complete blood count laboratory analyses of adult patients aged eighteen and above, obtained from Lagos State University Teaching Hospital (LASUTH). Multiple machine learning methods were applied to the classification task, and five of them were examined and assessed in a comparative analysis: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and Naive Bayes (NB). Contrary to the findings of previous researchers, the SVM model showed the best classification accuracy, 94.7%, with an F1-score of 94.5%, precision of 94.8%, recall of 94.7%, specificity of 97.3%, and area under the curve (AUC) of 99.0%. Of the other models considered, the RF model gave the lowest accuracy, 87.4%. The study shows that the support vector machine algorithm outperforms the other classifiers in accuracy when predicting haemoglobin variants from haematological parameters.
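
The kind of comparison described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' pipeline. It assumes scikit-learn, default hyperparameters, a single stratified 75/25 split, weighted averaging for precision, recall, and F1, and macro-averaged one-vs-rest specificity; none of these choices is stated in the abstract. The data are synthetic (752 rows, 10 features, 3 classes, generated with make_classification) and stand in for the LASUTH records, which are not reproduced here, so the scores it produces say nothing about the reported results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the blood count data (assumption: 10 features, 3 classes).
X, y = make_classification(n_samples=752, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# The five classifier families named in the abstract, with default settings.
# KNN and SVM are distance/margin based, so their inputs are standardized.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(random_state=0),
    "NB": GaussianNB(),
}

def macro_specificity(y_true, y_pred):
    """Mean over classes of TN / (TN + FP), treating each class one-vs-rest."""
    cm = confusion_matrix(y_true, y_pred)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - (tp + fp + fn)
    return float(np.mean(tn / (tn + fp)))

def evaluate(model):
    """Fit on the training split and score on the held-out split."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    proba = model.predict_proba(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="weighted", zero_division=0)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": macro_specificity(y_test, y_pred),
        "auc": roc_auc_score(y_test, proba, multi_class="ovr"),
    }

results = {name: evaluate(model) for name, model in models.items()}

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Under weighted averaging, recall is identical to accuracy by definition, so the two columns always match in this sketch. A single split is also a simplification; repeated or cross-validated splits would give more stable rankings between classifiers.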


Journal Identifiers


eISSN: 2467-8821
print ISSN: 0331-8443