The Best Machine Learning Model for Fraud Detection in Banking Sector: A Systematic Literature Review

Yanto Yanto; Lisah Lisah; Re’gina Tandra

doi:10.32877/eb.v7i2.1474

Keywords:

machine learning, fraud detection, banking, systematic literature review

Abstract

Banks face a growing number of sophisticated fraud attempts, driven by technological advances and a rising volume of digital transactions, a trend accelerated by the COVID-19 pandemic. This systematic literature review (SLR) aims to identify the most effective machine learning models for fraud detection in the banking sector, based on open access articles published between 2014 and 2024. Articles were retrieved from Dimensions.ai using keywords such as "machine learning," "fraud detection," and "banking." The models reviewed include Naive Bayes, Logistic Regression, XGBoost, Autoencoder, K-Nearest Neighbors (KNN), LightGBM, Random Forest, Support Vector Machine (SVM), Bidirectional LSTM, and Generative Adversarial Network (GAN). The review finds that XGBoost, LightGBM, and Random Forest consistently achieve high accuracy in detecting various types of fraud within the banking sector. Hybrid models perform best: Autoencoder-XGB-SMOTE-CGAN reaches the highest reported accuracy, up to 98%, and records top scores on metrics such as the Matthews correlation coefficient (MCC), true negative rate (TNR), and accuracy (ACC). The most common types of fraud detected are credit card fraud, identity theft, and account takeover. Data balancing and preprocessing techniques, such as SMOTE and BMR, significantly improve the accuracy of fraud detection models. The findings suggest that while traditional models such as Naive Bayes and Logistic Regression remain effective, advanced models such as XGBoost and hybrid approaches provide the highest accuracy. Future research should explore the development of hybrid models and the application of real-time data to further improve fraud detection systems in the banking sector, so that they can adapt to a rapidly evolving threat landscape.
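The abstract compares models on ACC, TNR, and MCC. As a minimal sketch of how these three metrics are computed from a confusion matrix, the Python below uses their standard definitions on a synthetic, heavily imbalanced sample. The label vectors, the 1% fraud rate, and the function names are illustrative assumptions and are not taken from any of the reviewed studies.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels: 1 = fraud, 0 = legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """ACC: share of all transactions classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def true_negative_rate(tp, tn, fp, fn):
    """TNR: share of legitimate transactions correctly left unflagged."""
    return tn / (tn + fp) if (tn + fp) else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Synthetic sample (assumption): 1,000 transactions, 10 of them fraudulent.
y_true = [1] * 10 + [0] * 990

# Hypothetical baseline that never flags anything.
always_legit = [0] * 1000

# Hypothetical detector: catches 8 of 10 frauds and raises 5 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 5 + [0] * 985

for name, y_pred in [("always_legit", always_legit), ("detector", detector)]:
    counts = confusion_counts(y_true, y_pred)
    print(
        f"{name}: ACC={accuracy(*counts):.3f} "
        f"TNR={true_negative_rate(*counts):.3f} "
        f"MCC={mcc(*counts):.3f}"
    )
```

In this sketch the baseline that never flags fraud still scores 99% accuracy and a perfect TNR, while its MCC is zero. This is one reason reviews of imbalanced fraud data tend to report MCC alongside ACC rather than accuracy alone.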