Big Data: Issues and Challenges in Clustering Data Visualization

Authors

  • Ummu Hani’ Hair Zaki, Faculty of Computing, Universiti Teknologi Malaysia, 81310 Johor Bahru, Johor, Malaysia
  • Izyan Izzati Kamsani, Faculty of Computing, Universiti Teknologi Malaysia, 81310 Johor Bahru, Johor, Malaysia
  • Ahmad Firdaus Ahmad Fadzil, College of Computing, Informatics and Media, Universiti Teknologi Mara Cawangan Melaka (Kampus Jasin), 77300 Merlimau, Melaka, Malaysia
  • Zainura Idrus, Faculty of Computer and Mathematical Science, Universiti Teknologi Mara, 40450 Shah Alam, Selangor, Malaysia
  • Eser Kandogan, Megagon Labs, 444 Castro St #900, Mountain View, CA 94041, United States

DOI:

https://doi.org/10.37934/araset.51.1.150159

Keywords:

Big data, Clustering visualization, Geometric projection, Star coordinate

Abstract

In the era of big data, the continuous generation of data across many fields has produced large and complex datasets. These datasets come in diverse formats and structures, including unstructured and semi-structured data. Despite the wide availability of big data, high dimensionality remains a significant obstacle to analysing and understanding it. Clustering analysis plays a crucial role in data analysis and visualization by uncovering hidden patterns and structures within datasets. However, several challenges hinder its effectiveness: data dimensionality, selecting an appropriate clustering algorithm, determining the optimal number of clusters, interpreting the results, and handling outliers. This paper explores these challenges and presents preferred visualization techniques that aid in visualizing and interpreting clustering results. By addressing these challenges and employing effective visualization techniques, researchers and practitioners can improve their understanding and use of clustering analysis.

Author Biographies

Ummu Hani’ Hair Zaki, Faculty of Computing, Universiti Teknologi Malaysia, 81310 Johor Bahru, Johor, Malaysia

hanizaki7@gmail.com

Izyan Izzati Kamsani, Faculty of Computing, Universiti Teknologi Malaysia, 81310 Johor Bahru, Johor, Malaysia

izyanizzati@utm.my

Ahmad Firdaus Ahmad Fadzil, College of Computing, Informatics and Media, Universiti Teknologi Mara Cawangan Melaka (Kampus Jasin), 77300 Merlimau, Melaka, Malaysia

firdausfadzil@uitm.edu.my

Zainura Idrus, Faculty of Computer and Mathematical Science, Universiti Teknologi Mara, 40450 Shah Alam, Selangor, Malaysia

zainura@tmsk.uitm.edu.my

Published

2024-09-04

Section

Articles