An Empirical Study on the Effectiveness and Efficiency of Machine Learning Classifiers for Liver Disease Prediction

Mohamed Amine NEMMICH; Asmaa BOUDALI; Noureddine BOUKHARI; Fatima DEBBAT

doi:10.58681/ajrt.25090106

Authors

Mohamed Amine NEMMICH, Department of Computer Science, Mathematics Laboratory, Djillali Liabes University of Sidi Bel Abbes, Sidi Bel Abbes, Algeria
Asmaa BOUDALI, Department of Electronics, Laboratory of coding and security of information, Sciences and Technology University of Oran Mohamed Boudiaf, Oran, Algeria
Noureddine BOUKHARI, Department of Mathematics, Djillali Liabes University of Sidi Bel Abbes, Sidi Bel Abbes, Algeria
Fatima DEBBAT, Department of Computer Science, Mustapha Stambouli University of Mascara, Mascara, Algeria

DOI:

https://doi.org/10.58681/ajrt.25090106

Keywords:

Liver Disease Prediction, Machine Learning Classification, Class Imbalance, Hyperparameter Tuning, Ensemble Methods

Abstract

Liver disease poses a significant global health burden, with high mortality rates exacerbated by the difficulty of early detection. Machine learning (ML) offers a promising route to automated diagnostic tools that address this need. Although various ML classifiers have been explored for liver disease prediction, selecting models for practical application requires a comprehensive, systematic comparison of a wide range of modern algorithms that incorporates robust pre-processing, handling of class imbalance, hyperparameter tuning with cross-validation, and analysis of computational efficiency. This study systematically evaluates thirteen diverse ML classification algorithms using the Liver Patient Dataset (LDPD). The methodology comprises data pre-processing (imputation, encoding, and standardization) within a pipeline to prevent data leakage, handling of class imbalance using SMOTE, splitting of the data into training and testing sets, and RandomizedSearchCV with Stratified K-Fold cross-validation for hyperparameter optimization. Performance was assessed on an independent test set using Accuracy, Precision, Recall, Specificity, F1-Score, and ROC AUC, alongside training time. The results show that ensemble and advanced tree-based methods achieve superior predictive performance. Hyperparameter tuning further improved performance: Tuned Random Forest achieved the highest ROC AUC (0.9995) and Specificity (0.9973), and Tuned LightGBM achieved the highest Recall (0.9996). The study highlights a crucial trade-off: while tuning yields peak performance, default configurations of efficient models such as LightGBM and XGBoost offer exceptionally high performance (ROC AUC ≥ 0.9993) with significantly faster training times (≤ 0.41 seconds), providing a favorable balance for practical application. This research identifies highly effective and efficient ML models for liver disease prediction, contributing empirical evidence to support the development of automated diagnostic aids.
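As an illustration of the evaluation metrics named in the abstract, the following is a minimal sketch in plain Python using their standard textbook definitions. It is not the authors' code: the function names (`confusion_counts`, `classification_metrics`, `roc_auc`) are hypothetical, the 0.5 decision threshold is an assumption, and the toy labels and scores are invented for demonstration and do not come from the study's dataset.

```python
from __future__ import annotations

from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def _safe_div(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, Precision, Recall, Specificity, and F1-Score from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)  # sensitivity: share of true positives that were found
    return {
        "accuracy": _safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": _safe_div(tn, tn + fp),  # share of true negatives that were found
        "f1": _safe_div(2 * precision * recall, precision + recall),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for a small illustration but slow for large test sets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.6, 0.2, 0.1, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Reporting Recall and Specificity side by side, as the abstract does, shows errors on the positive class and on the negative class separately, which Accuracy alone can hide when the classes are imbalanced.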
