

Published: Apr 11, 2024

DOI: 10.4314/cajost.v6i1.13

Keywords: Genetic Algorithm, Convolution Neural Network (CNN), IoTs, Unmet Potential Data value

Issue: Vol. 6 No. 1 (2024)

Section: Articles

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Okpako A. Ejaita

Ojie D. Voke

Abstract

Cluster analysis is regarded as one of the most important unsupervised learning tasks. It divides data into meaningful groups, known as clusters, using only the information in the data: it describes objects in terms of their relationships and captures the data's natural structure. Many traditional performance evaluation metrics for clustering algorithms treat all attributes or variables equally when measuring similarity; however, different attributes may contribute differently, because the amount of information they contain can vary greatly. The Data Value Metric (DVM) is an information-theoretic measure, based on the concept of mutual information, that has been shown to be a good metric for validating data quality and utility both in a big data ecosystem and in traditional data. Because it uses a forward selection search strategy, DVM suffers from local minima and loss of diversity in the population. Hybridizing it with a Genetic Algorithm overcomes the problem of local minima, because the evolutionary search balances exploration and exploitation of the search space. This paper proposes a hybrid model of the Genetic Algorithm and DVM as an information-theoretic metric for quantifying the quality and utility of variable selection for clustering, applicable to traditional data.
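
To make the idea of evolutionary variable selection concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the DVM formula, so the `fitness` function here is a hypothetical stand-in: it sums the mutual information between each selected discrete variable and a set of cluster labels, then subtracts an assumed per-variable penalty. The names `genetic_select`, `fitness`, `penalty`, and all parameter defaults are illustrative assumptions.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def fitness(mask, columns, labels, penalty=0.05):
    """Placeholder score, NOT the paper's DVM: summed MI minus a size penalty."""
    chosen = [col for bit, col in zip(mask, columns) if bit]
    if not chosen:
        return float("-inf")  # an empty variable subset is never acceptable
    total = sum(mutual_information(col, labels) for col in chosen)
    return total - penalty * len(chosen)


def genetic_select(columns, labels, pop_size=20, generations=30,
                   mutation_rate=0.1, seed=0):
    """Search over 0/1 masks of variables with a simple genetic algorithm."""
    rng = random.Random(seed)
    n = len(columns)

    def score(mask):
        return fitness(mask, columns, labels)

    # Start from the "keep every variable" baseline plus random masks.
    population = [[1] * n]
    population += [[rng.randint(0, 1) for _ in range(n)]
                   for _ in range(pop_size - 1)]
    best = max(population, key=score)

    for _ in range(generations):
        def tournament():
            a, b = rng.sample(population, 2)
            return a if score(a) >= score(b) else b

        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n - 1) if n > 1 else 0
            child = p1[:cut] + p2[cut:]  # one-point crossover
            # Bit-flip mutation keeps diversity in the population.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = max(population, key=score)
    return best
```

Crossover and mutation let the search jump between variable subsets that a greedy forward selection would never revisit, which is the exploration and exploitation balance the abstract refers to. Elitism guarantees that the returned mask scores at least as well as the all-variables baseline under the placeholder score.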

Journal Identifiers

eISSN: 2705-3121
print ISSN: 2705-313X
