
A framework for the classification of threat in health care system in Nigeria


E.O. Abengowe
U.O. Ekong
F. Rishamma
S.B. Oyong
C.C. Chidimma

Abstract

The security of healthcare systems and the privacy of patients’ records are constantly threatened by malware as data transit the network. To curb this threat, this paper develops a framework that combines machine learning models to identify malicious applications. The models used are Random Forest, k-nearest neighbors, Naïve Bayes, Decision Tree, and Logistic Regression. The dataset is the Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) dataset, obtained from Kaggle, a public data repository. It consists of categorical and numerical features with normal and abnormal labels, and its classes are imbalanced. The Synthetic Minority Oversampling Technique (SMOTE) was used to balance the dataset. The categorical features were converted to numerical features using one-hot encoding, and all features were then scaled to the range [0, 1] using min-max normalization. Principal component analysis (PCA), a feature extraction technique, was then applied to replace the original features with a much smaller feature set that retains their main characteristics. The preprocessed data were split into a training dataset (80%) with 125,973 records and a test dataset (20%) with 22,544 records. The models were trained on the training dataset and used for prediction. Their predictions were aggregated by a soft-voting classifier, which classified the test records as normal or malware. Python was used for splitting the data and for training, prediction, and evaluation. The results indicate that the Random Forest classifier outperformed the other models, with the highest accuracy of 0.99932 and an AUC score of 1.0. The Naïve Bayes classifier, in contrast, performed poorly, with an accuracy of 0.39624 and an AUC of 0.7419; a likely cause is its assumption that features are independent of one another, which often does not hold in practice.
Implementing this framework in the healthcare sector is expected to reduce attacks on digital healthcare records, protect patients’ privacy, and encourage interoperability between healthcare providers.
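
The preprocessing and aggregation steps named in the abstract (one-hot encoding, min-max normalization, and soft voting) can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the function names (`one_hot`, `min_max_scale`, `soft_vote`), the label names, and the sample probabilities are all hypothetical and chosen only for illustration. SMOTE and PCA are omitted because they need third-party libraries.

```python
from typing import Dict, List, Sequence


def one_hot(values: Sequence[str]) -> List[List[int]]:
    """Encode a categorical column as 0/1 indicator vectors.

    Categories are ordered alphabetically (an assumption of this sketch).
    """
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Rescale one numeric column to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def soft_vote(probabilities: Dict[str, List[float]],
              labels: Sequence[str]) -> str:
    """Average each class probability over all models, return the top label."""
    n_models = len(probabilities)
    averaged = [
        sum(model_probs[i] for model_probs in probabilities.values()) / n_models
        for i in range(len(labels))
    ]
    best = max(range(len(labels)), key=lambda i: averaged[i])
    return labels[best]


# Hypothetical values for illustration only (not taken from NSL-KDD).
encoded = one_hot(["tcp", "udp", "tcp"])
scaled = min_max_scale([0.0, 5.0, 10.0])

# Hypothetical class probabilities for one record, ordered as LABELS.
LABELS = ("normal", "malware")
record_probs = {
    "random_forest": [0.10, 0.90],
    "naive_bayes": [0.55, 0.45],
    "logistic_regression": [0.55, 0.45],
}
prediction = soft_vote(record_probs, LABELS)
print(encoded, scaled, prediction)
```

In this made-up record, two of the three models lean slightly towards "normal", so a simple majority vote would pick "normal". Soft voting averages the probabilities instead, and the one confident model tips the result to "malware". This shows how soft voting lets a confident model outweigh uncertain ones.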


Journal Identifiers


eISSN: 2141-3290