Performance Evaluation Criteria for High Dimensional Classification Problems

Authors

  • Nor Aishah Ahad, Institute of Strategic Industrial Decision Modeling, School of Quantitative Sciences, College of Arts and Sciences, Universiti Utara Malaysia, 06010 Sintok, Kedah, Malaysia
  • Friday Zinzendoff Okwonu, Department of Mathematics, Faculty of Science, Delta State University, P.M.B.1, Abraka, Nigeria
  • Yik Siong Pang, School of Quantitative Sciences, College of Arts and Sciences, Universiti Utara Malaysia, 06010 Sintok, Kedah, Malaysia
  • Shuhairy Norhisham, Department of Civil Engineering, College of Engineering, Universiti Tenaga Nasional, 43000 Kajang, Selangor, Malaysia
  • Muhammad Fadhlullah Abu Bakar, Department of Civil Engineering, College of Engineering, Universiti Tenaga Nasional, 43000 Kajang, Selangor, Malaysia

DOI:

https://doi.org/10.37934/sijfam.4.1.6174

Keywords:

Fisher linear classification, Independent classification rule, Misclassification rate, Principal component analysis, Variable selection

Abstract

In high dimensional small sample (HDSS) classification problems, the issues of relevant versus irrelevant data, singularity, and the curse of dimensionality persist. The presence of irrelevant variables causes problems in classification, affecting computational time, misclassification rate, and performance evaluation criteria. Covariance-dependent classification methods such as the Fisher linear classification method (FLCM) are of limited use in this setting, so the independent classification rule (ICR) was introduced to address these problems. Yet the training and validation of the ICR learned model still depend on the relevant and irrelevant data in the variables. To overcome these problems, we applied principal component analysis (PCA) for dimension reduction to the FLCM (PCA-FLCM) and to the ICR method (PCA-ICR), together with an F-weighted PCA called W-PCA and the proposed benchmark extraction method (BEM). For each method, we investigated the number and percentage of relevant variables selected, the computational time, and the probability of correct classification (PCC). To evaluate performance, we applied the performance evaluation criteria (PEC), which analyse the PCC for HDSS classification problems based on the axioms of probability. The results revealed that the W-PCA procedure is the most sensitive in selecting the vital few variables (the minimum number of vital variables), followed by the BEM procedure. The W-PCA variants have the best computational time, while the BEM has the best overall PCC for the data set investigated. The findings demonstrate that the BEM approach outperforms the other methods in terms of probability of correct classification, while the W-PCA has better variable search and selection capabilities than the other methods.
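As a rough illustration of the PCA-FLCM idea named in the abstract, the sketch below reduces synthetic HDSS data with plain PCA, fits a two-class Fisher linear rule on the component scores, and reports the proportion of correctly classified test cases as an empirical PCC. It is not the authors' implementation, and it does not cover W-PCA, ICR, or BEM. The data, the sample sizes, the number of components k, and the function names (pca_fit, fisher_fit, pcc) are all assumptions made for this example.

```python
import numpy as np

def pca_fit(X, k):
    """Return the training mean and the top-k principal directions of X."""
    mean = X.mean(axis=0)
    # SVD of the centred data avoids forming the p x p covariance matrix,
    # which is singular whenever p exceeds n.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k].T

def fisher_fit(Z, y):
    """Two-class Fisher linear rule: weight vector and midpoint cut-off."""
    z0, z1 = Z[y == 0], Z[y == 1]
    m0, m1 = z0.mean(axis=0), z1.mean(axis=0)
    # Pooled within-class covariance of the component scores.
    pooled = ((z0 - m0).T @ (z0 - m0) + (z1 - m1).T @ (z1 - m1)) / (len(Z) - 2)
    w = np.linalg.solve(pooled, m1 - m0)
    cutoff = w @ (m0 + m1) / 2
    return w, cutoff

def pcc(y_true, y_pred):
    """Empirical probability of correct classification."""
    return float(np.mean(y_true == y_pred))

# Synthetic HDSS data (assumed): 40 cases, 200 variables, 10 of them relevant.
rng = np.random.default_rng(0)
n, p, k = 40, 200, 5
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[y == 1, :10] += 2.0

# Alternate cases between training and test sets (10 per class in each).
train = np.arange(n) % 2 == 0
test = ~train

mean, W = pca_fit(X[train], k)
Z_train = (X[train] - mean) @ W
Z_test = (X[test] - mean) @ W

w, cutoff = fisher_fit(Z_train, y[train])
y_pred = (Z_test @ w > cutoff).astype(int)
score = pcc(y[test], y_pred)
print(f"PCC on held-out cases: {score:.2f}")
```

Keeping k well below the number of training cases keeps the pooled covariance of the scores invertible, which is the reason for reducing dimension before applying a covariance-dependent rule here.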

Author Biographies

Nor Aishah Ahad, Institute of Strategic Industrial Decision Modeling, School of Quantitative Sciences, College of Arts and Sciences, Universiti Utara Malaysia, 06010 Sintok, Kedah, Malaysia

aishah@uum.edu.my

Friday Zinzendoff Okwonu, Department of Mathematics, Faculty of Science, Delta State University, P.M.B.1, Abraka, Nigeria

okwonufz@delsu.edu.ng

Yik Siong Pang, School of Quantitative Sciences, College of Arts and Sciences, Universiti Utara Malaysia, 06010 Sintok, Kedah, Malaysia

ys_pang@ahsgs.uum.edu.my

Shuhairy Norhisham, Department of Civil Engineering, College of Engineering, Universiti Tenaga Nasional, 43000 Kajang, Selangor, Malaysia

shuhairy@uniten.edu.my

Muhammad Fadhlullah Abu Bakar, Department of Civil Engineering, College of Engineering, Universiti Tenaga Nasional, 43000 Kajang, Selangor, Malaysia

feddy91@gmail.com

Published

2024-12-15

How to Cite

Ahad, N. A., Okwonu, F. Z., Pang, Y. S., Norhisham, S., & Abu Bakar, M. F. (2024). Performance Evaluation Criteria for High Dimensional Classification Problems. Semarak International Journal of Fundamental and Applied Mathematics, 4(1), 61–74. https://doi.org/10.37934/sijfam.4.1.6174

Section

Articles