Ransomware Early Detection using Machine Learning Approach and Pre-Encryption Boundary Identification
DOI:
https://doi.org/10.37934/araset.47.2.121137
Keywords:
Ransomware, Early detection, Pre-encryption, Pre-encryption boundary, Crypto-ransomware, Cryptographic ransomware
Abstract
The escalating ransomware threat has fostered a sophisticated network of cybercriminal enterprises. This paper surveys that threat and evaluates contemporary detection methods. A successful ransomware attack relies on several factors: strong encryption that cannot feasibly be broken, the anonymity of cryptocurrencies, and the wide availability of ransomware kits that let even inexperienced actors launch attacks. These dynamics have created a niche for cybercriminal specialists in the digital underworld. In response, we propose a detection framework based on machine learning, a domain where regression algorithms have gained popularity without yet yielding a definitive protective model. Using API call analysis as the foundation, we assess how effectively various machine learning classifiers identify ransomware. The evaluation shows that the Naive Bayes classifier's accuracy is too low for this application. Conversely, Logistic Regression, with an AUC of 0.951, minimal training time, and substantial gains, emerges as a strong contender. The Decision Tree and Random Forest classifiers perform comparably; the Decision Tree offers interpretability, while Random Forest offers computational speed. SVM and Gradient Boosted Trees achieve the highest AUC and gains, at the cost of longer training time. Our findings affirm the central role of API call analysis in ransomware detection and the ability of machine learning approaches to learn from large datasets and identify novel malware strains. Because malware evolves continually, detection methods must adapt accordingly.
This study's comparative analysis elucidates the trade-offs between accuracy, computational speed, and training time, guiding the selection of the optimal machine learning algorithm for robust ransomware detection.
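The two ingredients the abstract relies on, API-call features and AUC as the comparison metric, can be illustrated with a minimal sketch. This is not the authors' pipeline: the API vocabulary, the sample trace, and the labels and scores below are hypothetical values chosen for illustration, and the feature encoding (simple call counts) is an assumption, since the abstract does not specify one.

```python
from collections import Counter


def api_call_features(trace, vocabulary):
    """Count how often each API name in `vocabulary` occurs in one call trace."""
    counts = Counter(trace)
    return [counts[name] for name in vocabulary]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample outscores a
    random negative sample; ties count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical vocabulary and trace (illustrative only, not from the paper).
VOCAB = ["CreateFileW", "ReadFile", "WriteFile", "CryptEncrypt"]
trace = ["CreateFileW", "ReadFile", "CryptEncrypt", "WriteFile", "CryptEncrypt"]
features = api_call_features(trace, VOCAB)

# Hypothetical ground truth (1 = ransomware) and classifier scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.1]
auc = roc_auc(labels, scores)
print(features, round(auc, 3))
```

The pairwise AUC definition is used here because it is easy to verify by hand; for large datasets a rank-based computation or a library routine would be the more practical choice.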