Comparing the Effectiveness and Efficiency of Machine Learning Models for Spam Detection on Twitter

Stephanie Chua; Amy Tan; Puteri Nor Ellyza Nohuddin; Mohd Hanafi Ahmad Hijazi

doi:10.37934/araset.60.2.153164

Authors

Stephanie Chua, Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia
Amy Tan, Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia
Puteri Nor Ellyza Nohuddin, Higher Colleges of Technology, Sharjah Women’s College, 79799 Abu Dhabi, United Arab Emirates
Mohd Hanafi Ahmad Hijazi, Faculty of Computing and Informatics, Universiti Malaysia Sabah, 88400 Kota Kinabalu, Sabah, Malaysia

DOI:

https://doi.org/10.37934/araset.60.2.153164

Keywords:

Twitter, spam, text mining, machine learning

Abstract

This research presents a comparative study of the effectiveness and efficiency of machine learning models for Twitter spam detection. Spam detection on social media platforms is vital for user experience, but it also poses computational challenges because Twitter data is vast and dynamic. The investigation covered five machine learning models: Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression (LR), k-Nearest Neighbours (KNN), and Decision Trees (DT). Their performance was examined along two dimensions: classification accuracy and computational efficiency, the latter measured by the time taken for model execution. The NB and LR models were the most computationally efficient, with execution times ranging from 1.016 to 1.949 seconds. These models offered an attractive balance between speed and accuracy, making them suitable for real-time or resource-constrained applications. SVM, LR, KNN and DT were effective in classification, with a performance of 98%. However, the SVM models demanded longer execution times, ranging from 7.670 to 37.657 seconds. KNN and DT struck a balance between accuracy and efficiency, with execution times ranging from 2.852 to 10.941 seconds and 1.080 to 2.442 seconds, respectively. The research underscores the importance of considering both model effectiveness and computational efficiency when selecting a Twitter spam detection model. By offering a comparative assessment of these models, the study equips researchers to make informed decisions in Twitter spam detection. It highlights the trade-offs between model performance and efficiency, paving the way for more effective and resource-conscious approaches to combating spam on social media platforms.
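
The two evaluation dimensions named in the abstract, classification accuracy and execution time, can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny Naive Bayes classifier, the function names (train_nb, predict_nb, evaluate) and the four-tweet toy dataset are all hypothetical, chosen only to show how one train-and-predict run might be timed and scored together.

```python
import math
import time
from collections import Counter, defaultdict


def train_nb(texts, labels):
    """Fit a multinomial Naive Bayes model (hypothetical helper)."""
    class_counts = Counter(labels)          # documents per class
    word_counts = defaultdict(Counter)      # token counts per class
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the most probable label, using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # tokens unseen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def evaluate(train_fn, predict_fn, train, test):
    """Return (accuracy, elapsed seconds) for one train-and-predict run."""
    start = time.perf_counter()
    model = train_fn(*train)
    predictions = [predict_fn(model, text) for text in test[0]]
    elapsed = time.perf_counter() - start
    correct = sum(p == y for p, y in zip(predictions, test[1]))
    return correct / len(test[1]), elapsed


# Toy data, invented for illustration only (not from the study).
train_texts = [
    "win free prize now",
    "free followers click here",
    "lunch with the team today",
    "great talk at the conference",
]
train_labels = ["spam", "spam", "ham", "ham"]
test_texts = ["click to win free followers", "team lunch at the conference"]
test_labels = ["spam", "ham"]

accuracy, elapsed = evaluate(
    train_nb, predict_nb,
    (train_texts, train_labels),
    (test_texts, test_labels),
)
print(f"accuracy={accuracy:.2f}, time={elapsed:.6f}s")
```

Passing a different train and predict pair to evaluate would give a like-for-like comparison across models, which is the kind of trade-off between speed and accuracy the abstract reports.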

Author Biographies

Stephanie Chua, Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia

chlstephanie@unimas.my

Amy Tan, Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia

65428@siswa.unimas.my

Puteri Nor Ellyza Nohuddin, Higher Colleges of Technology, Sharjah Women’s College, 79799 Abu Dhabi, United Arab Emirates

pnohuddin@hct.ac.ae

Mohd Hanafi Ahmad Hijazi, Faculty of Computing and Informatics, Universiti Malaysia Sabah, 88400 Kota Kinabalu, Sabah, Malaysia

hanafi@ums.edu.my

