Scientia Africana

Extension of K-Means Algorithm for clustering mixed data

F.E. Onuodu, E.O. Nwachukwu, O. Owolabi


This work proposes a new hybrid method that extends the K-means algorithm to categorical domains and mixed-type attributes. It also proposes a new dissimilarity measure that uses a relative cumulative frequency-based method for clustering objects with mixed values. The dissimilarity model could serve as a predictive tool for identifying attributes of objects in mixed datasets. The method has been implemented in Java and MATLAB. Experiments on real-world datasets show that the new hybrid algorithm is more efficient and more robust than existing algorithms in terms of accuracy and time complexity. The tool can be used in a variety of applications, such as agro-based industries, clinical datasets, and general information retrieval systems (IRS). The new method has been applied to agro-based datasets of soybean and yeast to form clusters that could help farmers manage crop pests.
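The abstract does not give the formula for the proposed relative cumulative frequency-based measure, so the sketch below does not reproduce it. It only illustrates the general idea of a mixed-data dissimilarity, using the well-known k-prototypes-style combination instead: squared Euclidean distance on numeric attributes plus a weighted count of categorical mismatches. The function names, the weight `gamma`, and the sample values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def mixed_dissimilarity(
    a_num: Sequence[float],
    a_cat: Sequence[str],
    b_num: Sequence[float],
    b_cat: Sequence[str],
    gamma: float = 1.0,
) -> float:
    """Illustrative mixed-data dissimilarity (NOT the authors' measure).

    Numeric part: squared Euclidean distance.
    Categorical part: number of mismatching attributes, weighted by gamma
    (gamma is a hypothetical tuning weight).
    """
    numeric_part = sum((x - y) ** 2 for x, y in zip(a_num, b_num))
    categorical_part = sum(1 for x, y in zip(a_cat, b_cat) if x != y)
    return numeric_part + gamma * categorical_part


def assign_to_nearest(
    obj_num: Sequence[float],
    obj_cat: Sequence[str],
    prototypes: List[Tuple[Sequence[float], Sequence[str]]],
    gamma: float = 1.0,
) -> int:
    """Return the index of the cluster prototype closest to the object."""
    distances = [
        mixed_dissimilarity(obj_num, obj_cat, p_num, p_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return min(range(len(distances)), key=distances.__getitem__)


# Hypothetical prototypes: (numeric attributes, categorical attributes).
prototypes = [([0.0], ["red"]), ([10.0], ["blue"])]
nearest = assign_to_nearest([1.0], ["red"], prototypes)
```

In a K-means-style loop, an assignment step like this would alternate with a prototype update (for example, means for numeric attributes and most frequent values for categorical ones) until the assignments stop changing.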

Key words: Mixk-meansXFon, Clustering, Mixed data.

AJOL African Journals Online