Exploring K-Means Clustering Efficiency: Accuracy and Computational Time across Multiple Datasets

Iliyas Karim Khan; Hanita Daud; Nooraini Zainuddin; Rajalingam Sokkalingam; Abdussamad Abdussamad; Abdus Samad Azad; Mudassar Iqbal; Mudasar Zafar; Atta Ullah; Musarat Elahi; Ahmad Abubakar Suleiman

doi:10.37934/araset.62.3.5769

Authors

Iliyas Karim Khan, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia
Hanita Daud, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia
Nooraini Zainuddin, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia
Rajalingam Sokkalingam, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia
Abdussamad Abdussamad, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia
Abdus Samad Azad, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia
Mudassar Iqbal, Department of Mathematical Sciences, Faculty of Basic Sciences, Balochistan University of Information Technology, Engineering and Management Sciences (BUITEMS), Quetta 87300, Pakistan
Mudasar Zafar, School of Mathematics, Actuarial and Quantitative Studies (SOMAQS), Asia Pacific University of Technology & Innovation (APU), Bukit Jalil, 57000 Kuala Lumpur, Malaysia
Atta Ullah, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia
Musarat Elahi, Shaheed Benazir Bhutto Women University Peshawar, Khyber Pakhtunkhwa 00384, Pakistan
Ahmad Abubakar Suleiman, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia

DOI:

https://doi.org/10.37934/araset.62.3.5769

Keywords:

Accuracy, efficiency, K-means clustering, algorithm, dataset

Abstract

In unsupervised machine learning, clustering is a pivotal method of data analysis. However, it faces challenges arising from the diversity of datasets: some algorithms are less effective or take longer to execute on specific types of data. The performance of each clustering algorithm depends on both the sample size of the dataset and its specific characteristics. Among these algorithms, K-means clustering is a popular choice, so it is essential to evaluate its accuracy and execution time across datasets with different sample sizes and features. This paper assesses the accuracy and efficiency of the K-means clustering algorithm on three distinct datasets, namely seed data, iris data and well log data sourced from GitHub, which differ in both size and features. The Seed dataset represents three varieties of wheat seeds, the Iris dataset contains measurements of three iris flower species, and the Well log dataset contains sonic log and gamma ray data. The aim is to analyse how accurately and efficiently the K-means algorithm performs across these datasets. The results show that the K-means algorithm achieves high accuracy and lower computational time on the Well log dataset.
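
The kind of evaluation described in the abstract, timing a K-means run and scoring its cluster assignments against known class labels, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it uses plain Lloyd's algorithm with random initialisation, synthetic two-dimensional data in place of the Seed, Iris and Well log datasets, and an accuracy defined as the best match over all cluster-to-class relabelings. All function names (`kmeans`, `clustering_accuracy`, `make_blobs`, `evaluate`) are hypothetical.

```python
import itertools
import random
import time


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: random initialisation
    labels = None
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def clustering_accuracy(true_labels, cluster_labels, k):
    """Best fraction of matches over all cluster-to-class relabelings."""
    best = 0
    for perm in itertools.permutations(range(k)):
        hits = sum(1 for t, c in zip(true_labels, cluster_labels) if perm[c] == t)
        best = max(best, hits)
    return best / len(true_labels)


def make_blobs(n_per_class, centers, spread=0.5, seed=0):
    """Synthetic stand-in data: Gaussian blobs around the given centers."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            labels.append(label)
    return points, labels


def evaluate(points, true_labels, k):
    """Return (accuracy, elapsed seconds) for one K-means run."""
    start = time.perf_counter()
    labels = kmeans(points, k)
    elapsed = time.perf_counter() - start
    return clustering_accuracy(true_labels, labels, k), elapsed


if __name__ == "__main__":
    # Two sample sizes, to mirror the idea of comparing datasets of different size.
    for n in (50, 500):
        pts, truth = make_blobs(n, centers=[(0, 0), (5, 5), (0, 5)])
        acc, secs = evaluate(pts, truth, k=3)
        print(f"n={3 * n}: accuracy={acc:.3f}, time={secs:.4f}s")
```

Because the result of a single run depends on the random initialisation and on machine load, a fairer comparison would repeat each run several times and report averaged accuracy and time.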

Author Biographies

Iliyas Karim Khan, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia

iliyas_22008363@utp.edu.my

Hanita Daud, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia

hanita_daud@utp.edu.my

Nooraini Zainuddin, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia

aini_zainuddin@utp.edu.my

Rajalingam Sokkalingam, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia

raja.sokkalingam@utp.edu.my

Abdussamad Abdussamad, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia

abdussamad_22009779@utp.edu.my

Abdus Samad Azad, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia

abdus_22009918@utp.edu.my

Mudassar Iqbal, Department of Mathematical Sciences, Faculty of Basic Sciences, Balochistan University of Information Technology, Engineering and Management Sciences (BUITEMS), Quetta 87300, Pakistan

mudassar.iqbal@buitms.edu.pk

Mudasar Zafar, School of Mathematics, Actuarial and Quantitative Studies (SOMAQS), Asia Pacific University of Technology & Innovation (APU), Bukit Jalil, 57000 Kuala Lumpur, Malaysia

mudasar.zafar@apu.edu.my

Atta Ullah, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia

atta_22000639@utp.edu.my

Musarat Elahi, Shaheed Benazir Bhutto Women University Peshawar, Khyber Pakhtunkhwa 00384, Pakistan

baigmusarat8@gmail.com

Ahmad Abubakar Suleiman, Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, 32610 Seri Iskandar, Perak, Malaysia

ahmad_22000579@utp.edu.my
