
Abstract

Diabetes mellitus is a chronic metabolic disorder where the Pre-Diabetic stage is consistently missed by binary classification models despite being the only reversible point in disease progression. This study builds a three-class prediction framework using 1,000 real clinical laboratory records from Al-Kindy Teaching Hospital and Medical City Hospital, Iraq, covering twelve biochemical features including HbA1c, lipid panels, kidney markers, and BMI. Preprocessing resolved class label inconsistencies, removed duplicates, and dropped missing rows. Fifteen features were derived from the original twelve using clinical reasoning, including TG/HDL ratio, glycaemic risk score, estimated average glucose via the ADA formula (eAG = 28.7 × HbA1c − 46.7), kidney stress index, and WHO/ADA diagnostic threshold distance features. SMOTE was applied on the training partition only after a 70/30 stratified split to correct an 8:1 class imbalance. Six classifiers were trained: Logistic Regression, Decision Tree, Random Forest, XGBoost, Explainable Boosting Machine, and a Stacking Ensemble of XGBoost, Random Forest, and EBM under a Logistic Regression meta-learner. The Hybrid Ensemble achieved the highest accuracy of 50.00% with a weighted F1-score of 0.4540, while Random Forest performed best among individual classifiers at 46.00% accuracy with a cross-validation F1 mean of 0.7658. LIME global importance averaged across sixty test samples and EBM native importance scores independently identified gender-BMI interaction, Urea, HDL, and kidney stress index as the dominant predictors, with several engineered features consistently appearing in both top-ten rankings, confirming the discriminative value of clinical feature transformation.

Keywords

Diabetes Prediction, Machine Learning, Explainable AI, Ensemble Learning, XGBoost, EBM, LIME, SMOTE, Feature Engineering, Clinical Decision Support, Three-Class Classification.

Introduction

Diabetes mellitus is a chronic metabolic disorder characterized by persistently elevated blood glucose levels arising from defects in insulin secretion, insulin action, or both. According to the International Diabetes Federation, approximately 537 million adults currently live with diabetes worldwide, a figure projected to reach 643 million by 2030 and 783 million by 2045. The disease imposes a substantial global burden through complications including cardiovascular disease, chronic kidney failure, peripheral neuropathy, and diabetic retinopathy. This accelerating prevalence creates sustained pressure on healthcare systems, making early and accurate detection both a clinical and economic priority.

Conventional diagnostic approaches rely primarily on fasting blood glucose measurements and HbA1c thresholds. While effective for confirmed diabetes, these methods are limited in their capacity to identify Pre-Diabetic patients or stratify risk across heterogeneous hospital populations. Over the past decade, machine learning has demonstrated the ability to extract complex patterns from large clinical datasets and outperform traditional statistical models in disease prediction tasks.

A substantial portion of existing diabetes prediction research relies on the Pima Indians Diabetes Dataset, which contains only eight features drawn from a single demographic group. This narrow scope fails to reflect the biochemical complexity encountered in real hospital environments. Furthermore, high-performing models such as XGBoost and Random Forest are frequently criticized for their opacity, which is a critical limitation in clinical settings where physicians require justifiable, interpretable predictions before acting on model outputs.

This study addresses these gaps through a three-class prediction framework trained on 1,000 patient records from Al-Kindy Teaching Hospital and Medical City Hospital, Baghdad, Iraq. A stacking ensemble combines XGBoost, Random Forest, and the Explainable Boosting Machine under a Logistic Regression meta-learner. Fifteen features are derived from the original twelve variables using clinical reasoning covering lipid ratios, glycaemic risk, kidney function, and metabolic syndrome markers. SMOTE corrects class imbalance on the training partition only, and LIME explanations are cross-validated against EBM native importance scores to ensure transparency and clinical coherence.

LITERATURE SURVEY

Researchers have extensively experimented with machine learning techniques to recognize diabetes at its earliest stages using clinical as well as non-clinical datasets.

Rathi and Madeira (2023) tested Support Vector Machines alongside Decision Trees using the Pima Indians Diabetes dataset, achieving reasonable results though constrained by limited feature diversity and binary classification scope [5]. Kaur and Kumari (2021) demonstrated that appropriate feature selection and data preprocessing significantly influence model performance when applying Logistic Regression, Random Forest, and Gradient Boosting for diabetes prediction [7]. Hassan et al. (2021) combined Naïve Bayes, Decision Trees, and Random Forest into an ensemble approach using patient health indicators, which improved classification accuracy but lacked broader clinical relevance due to the absence of detailed laboratory markers [8].

Sinha et al. (2024) investigated K-Nearest Neighbors, Random Forest, and Logistic Regression for early diabetes detection, confirming that ensemble techniques generally outperform individual classifiers, while noting persistent challenges related to model interpretability and class imbalance [2]. Kumar et al. (2023) applied ensemble learning strategies and reported strong accuracy through Random Forest, yet acknowledged that limited model transparency remains a barrier to clinical adoption [4]. Kumar Sahu and Ghosh (2022) prioritized recall optimization in supervised learning models, tuning systems to minimize false negatives in early-stage diabetes identification [3].

Alzboon et al. (2023) evaluated Logistic Regression, SVM, and Neural Networks for early diabetes diagnosis and highlighted that most existing datasets fail to capture the diversity of real-world clinical features [13]. Akmeşe (2022) compared SVM, KNN, and Decision Tree classifiers on the Pima dataset, finding that tree-based models consistently outperformed others in classification accuracy [14]. Bandhu et al. (2023) integrated multiple algorithms for diabetes forecasting and emphasized that informative datasets and model transparency are equally essential in healthcare AI applications [12]. Oliullah et al. (2024) developed a stacked ensemble approach for diabetes prediction, demonstrating that meta-learning strategies improve prediction robustness through more effective generalization [25].

Khanam and Foo (2021) conducted a systematic comparison of multiple machine learning algorithms for diabetes prediction, consistently finding that ensemble-based approaches outperformed individual classifiers across all evaluation metrics [20]. Talari et al. (2024) proposed a hybrid feature selection and classification technique for predicting Type 2 diabetes severity, confirming that a well-engineered feature space substantially improves model discrimination [22]. Modak and Jha (2024) combined multiple learning strategies into a unified prediction system and demonstrated that hyperparameter-tuned ensemble models generalize more reliably on clinical patient data [24]. Mujumdar and Vaidehi (2019) evaluated several machine learning tools on benchmark datasets, establishing foundational baselines that subsequent ensemble-based diabetes prediction studies have built upon [17].

A recurring limitation across these studies is their heavy reliance on the Pima Indians Diabetes Dataset — a narrow, eight-feature dataset drawn from a single demographic group that does not reflect the biochemical complexity of real hospital laboratory records. Additionally, the majority of existing approaches perform binary classification, overlooking the clinically significant pre-diabetic category that represents a critical early intervention window. More importantly, high-performing but opaque models such as XGBoost and Random Forest are rarely paired with robust explainability mechanisms, reducing clinician trust in model-driven decisions where transparency is a prerequisite.

The present study addresses these gaps by utilizing real laboratory data collected from Al-Kindy Teaching Hospital and Medical City Hospital, Iraq, comprising 1,000 patient records across twelve biochemical and anthropometric features. Rather than binary classification, the framework predicts three distinct health categories — Non-Diabetic, Pre-Diabetic, and Diabetic — using a stacking ensemble of XGBoost, Random Forest, and the Explainable Boosting Machine (EBM). Fifteen clinically motivated features are manually engineered from the original variables to strengthen predictive power. Model transparency is ensured through LIME explanations cross-validated against EBM native feature importance for consistency. The proposed framework is differentiated from prior work by its use of genuine clinical patient data, three-class health stratification, medically grounded feature construction, and dual-layer explainability verified side by side.

METHODOLOGY

The proposed framework follows a structured pipeline covering dataset collection, preprocessing, feature engineering, class balancing, model development, explainability, and evaluation. Figure 1 shows the complete workflow.

Fig. 1. Proposed diabetes prediction pipeline from raw clinical data to explainable three-class output.

  1. Dataset Description

The dataset was sourced from the Mendeley Data Repository and comprises 1,000 real clinical laboratory records collected from Al-Kindy Teaching Hospital and Medical City Hospital, Baghdad, Iraq. Each record contains twelve biochemical and anthropometric measurements. The target variable CLASS encodes three categories: Non-Diabetic (N), Pre-Diabetic (P), and Diabetic (Y). The dataset used in this study contains 600 Non-Diabetic, 250 Pre-Diabetic, and 150 Diabetic records, a distribution more balanced than classical benchmark datasets but still requiring careful handling during training. Table 1 lists all twelve features alongside their data types and clinical roles.

Table 1. Feature description of the Mendeley Al-Kindy clinical dataset.

Unlike the Pima Indians Diabetes Dataset, which contains eight features drawn from a single demographic group, this dataset provides continuous biochemical laboratory measurements from real hospital patients across two clinical sites, producing a more representative and challenging prediction problem.

  2. Data Preprocessing

Raw data was subjected to a structured preprocessing pipeline. The CLASS and Sex columns contained trailing whitespace characters, causing values such as 'Y ' and 'N ' to appear as distinct categories. Stripping whitespace resolved this encoding inconsistency. Duplicate records were identified and removed. Rows containing missing values were dropped entirely rather than imputed, as absent biochemical readings in clinical data are rarely random and carry diagnostic significance that synthetic values would obscure.

Sex was label-encoded as Male=1 and Female=0. CLASS was ordinally encoded as Non-Diabetic=0, Pre-Diabetic=1, and Diabetic=2. StandardScaler was applied to all numerical columns, transforming each feature to zero mean and unit variance. Critically, the scaler was fitted exclusively on the raw training partition before SMOTE was applied, ensuring that the learned mean and standard deviation reflect real patient data distributions rather than synthetic oversampled points. The scaled training data was then passed to SMOTE for minority class synthesis.
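
The preprocessing steps above can be sketched in Python. The miniature DataFrame and its values are illustrative, not the actual hospital records, and standardization statistics are computed manually here to make the fit-on-training-only constraint explicit:

```python
import pandas as pd

# Illustrative records: 'Y ' and 'N ' carry the trailing whitespace
# described above, so they would otherwise appear as distinct categories.
df = pd.DataFrame({
    "Sex":   ["M", "F", "M ", "F"],
    "HbA1c": [5.1, 6.0, 8.2, 5.4],
    "CLASS": ["N", "P", "Y ", "N "],
})

# 1. Strip whitespace so 'Y ' and 'Y' collapse into one category.
for col in ["CLASS", "Sex"]:
    df[col] = df[col].str.strip()

# 2. Remove duplicates and drop rows with missing values (no imputation).
df = df.drop_duplicates().dropna()

# 3. Label-encode Sex; ordinally encode CLASS as N=0, P=1, Y=2.
df["Sex"] = df["Sex"].map({"M": 1, "F": 0})
df["CLASS"] = df["CLASS"].map({"N": 0, "P": 1, "Y": 2})

# 4. Standardize using statistics from the TRAINING partition only,
#    so synthetic SMOTE points never influence the scaler.
train, test = df.iloc[:3], df.iloc[3:]
mu, sigma = train["HbA1c"].mean(), train["HbA1c"].std(ddof=0)
train_scaled = (train["HbA1c"] - mu) / sigma
test_scaled = (test["HbA1c"] - mu) / sigma  # reuse training statistics
```

Fitting the scaler on the raw training partition and merely applying it to the test partition mirrors the leakage-avoidance constraint described above.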

  3. Feature Engineering

The twelve original features were extended to twenty-seven by deriving fifteen additional variables through domain-informed clinical transformations. Raw laboratory values alone do not capture the relational patterns that clinicians use in practice. A physician does not assess Triglycerides in isolation but in ratio with HDL, because the relationship between the two carries diagnostic meaning that neither value conveys independently. The fifteen features were organized into five groups based on the biological system they measure.

3.1 Lipid Ratio Features

Four lipid ratios were computed to capture dyslipidemia patterns associated with metabolic syndrome and insulin resistance:

Cholesterol-to-HDL ratio, LDL-to-HDL ratio, VLDL-to-HDL ratio, and Triglyceride-to-HDL ratio.

A TG/HDL ratio above 3.0 is a validated clinical surrogate for insulin resistance [McLaughlin et al., 2005]. This threshold is used as the basis for the metabolic syndrome score in Section 3.4.

3.2 Glycaemic Features

The Mendeley dataset contains no raw fasting blood sugar column. Estimated average glucose was derived using the ADA validated conversion formula:

eAG = 28.7 × HbA1c − 46.7

This formula was validated by the American Diabetes Association and converts HbA1c percentage directly to an estimated average glucose value [ADA, 2008]. Two additional features were derived from this: the HbA1c-to-blood-sugar ratio capturing discordance between chronic and short-term glucose regulation, and a glycaemic risk score computed as:

Glycaemic Risk Score = HbA1c × eAG

3.3 Kidney Function Features

A kidney stress index was derived by combining the two renal markers, Urea and Creatinine, into a single composite feature:

Kidney Stress Index = Urea × Creatinine

3.4 Metabolic Syndrome Features

Three features approximate the International Diabetes Federation metabolic syndrome criteria. A BMI-age interaction term was included because obesity-related diabetes risk compounds with age, meaning neither variable captures the combined effect independently. The Lipid Accumulation Product was computed as:

LAP = (BMI − 25) × Triglycerides

LAP is a validated clinical index of visceral fat accumulation [Kahn, 2005]. A composite metabolic syndrome score integrating TG/HDL threshold, BMI threshold, and age was also derived.

3.5 Clinical Threshold Distance Features

Three features measure the continuous distance of each patient from established diagnostic cutoffs. HbA1c distance from the WHO diabetes threshold of 6.5%, estimated blood sugar distance from the ADA fasting glucose threshold of 126 mg/dL, and a gender-BMI interaction term capturing sex-specific obesity risk.

HbA1c Distance = HbA1c − 6.5

BSL Distance = eAG − 126

These distance features give the model direct information about how close each patient sits to a clinical diagnosis boundary, rather than requiring the model to infer this from raw values alone.
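
The engineered-feature calculations above reduce to simple arithmetic. A minimal sketch for a single illustrative patient follows; all laboratory values are hypothetical, not drawn from the dataset:

```python
# One illustrative patient record (values are made up for demonstration).
patient = {"HbA1c": 7.0, "TG": 180.0, "HDL": 45.0,
           "Urea": 5.0, "Creatinine": 70.0, "BMI": 31.0}

# Lipid ratio: TG/HDL above 3.0 is the insulin-resistance surrogate.
tg_hdl_ratio = patient["TG"] / patient["HDL"]

# ADA conversion, then the derived glycaemic features.
eag = 28.7 * patient["HbA1c"] - 46.7
glycaemic_risk_score = patient["HbA1c"] * eag

# Clinical threshold distance features.
hba1c_distance = patient["HbA1c"] - 6.5   # WHO diabetes threshold (%)
bsl_distance = eag - 126                  # ADA fasting glucose threshold (mg/dL)

# Composite indices.
kidney_stress_index = patient["Urea"] * patient["Creatinine"]
lap = (patient["BMI"] - 25) * patient["TG"]  # Lipid Accumulation Product
```

For this hypothetical patient the TG/HDL ratio of 4.0 exceeds the 3.0 insulin-resistance cutoff and the HbA1c distance of +0.5 places the record above the WHO diabetes threshold.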

  4. Handling Class Imbalance

The dataset contains 840 Diabetic, 102 Non-Diabetic, and 53 Pre-Diabetic records, corresponding to approximately 84%, 10%, and 5% of the total respectively. Training directly on this distribution produces a model that defaults toward the majority class. Overall accuracy appears acceptable while recall for Non-Diabetic and Pre-Diabetic patients collapses, which is clinically the more dangerous failure mode.

Fig. 2. Class distribution before and after SMOTE balancing.

The Synthetic Minority Oversampling Technique was applied exclusively on the training partition after splitting. SMOTE generates synthetic minority samples by selecting a minority instance and interpolating between it and one of its k nearest neighbors in feature space:

x_new = x_i + λ × (x_neighbor − x_i)

Where x_i is the selected minority instance, x_neighbor is one of its k=5 nearest neighbors, and λ is a random value drawn from the uniform distribution between 0 and 1. Applying SMOTE strictly after splitting ensures no synthetic samples derived from real records appear in the test set, preserving the integrity of all reported evaluation metrics.
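
The interpolation mechanism can be sketched in plain NumPy. This is a minimal illustration of the SMOTE step, not a replacement for a library implementation such as imbalanced-learn's SMOTE:

```python
import numpy as np

rng = np.random.default_rng(0)

def smote_sample(X_minority, i, k=5):
    """Generate one synthetic sample from minority instance i by
    interpolating toward one of its k nearest minority neighbors."""
    diffs = X_minority - X_minority[i]
    dists = np.linalg.norm(diffs, axis=1)
    neighbors = np.argsort(dists)[1:k + 1]   # exclude the point itself
    j = rng.choice(neighbors)
    lam = rng.uniform(0.0, 1.0)              # lambda in [0, 1]
    return X_minority[i] + lam * (X_minority[j] - X_minority[i])

X_min = rng.normal(size=(20, 3))             # toy minority class, 3 features
x_new = smote_sample(X_min, i=0)
```

Because the synthetic point lies on the segment between two real minority instances, it always falls inside the minority class's existing feature-space envelope.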

  5. Model Development

The dataset was divided into training and testing partitions using a 70/30 stratified split, preserving class proportions in both sets. Six classifiers were trained and evaluated.

5.1 Logistic Regression

Logistic Regression estimates class probability using the softmax extension for multiclass problems. For the binary base case the probability is:

P(y = 1 | x) = 1 / (1 + e^−(wᵀx + b))

The model was trained with max_iter=2000 and balanced class weights. It additionally serves as the meta-learner in the stacking ensemble due to its interpretable coefficient structure and stable convergence.
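
A numerical sketch of the logistic link and its softmax extension follows; the weights and logits are illustrative values, not fitted coefficients:

```python
import numpy as np

def sigmoid(z):
    """Binary logistic link: maps a real score to a probability."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Multiclass extension: one probability per class, summing to 1."""
    z = z - z.max()            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

w, b = np.array([0.8, -0.5]), 0.1
x = np.array([1.2, 0.4])
p_binary = sigmoid(w @ x + b)           # P(y = 1 | x)

logits = np.array([2.0, 0.5, -1.0])     # one logit per class N / P / Y
p_classes = softmax(logits)
```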

5.2 Decision Tree

A Decision Tree partitions the feature space by selecting the split that maximizes information gain at each node:

IG(S, A) = Entropy(S) − Σ_v (|S_v| / |S|) × Entropy(S_v), where Entropy(S) = −Σ_i p_i log₂(p_i)

Where p_i is the proportion of samples belonging to class i and S_v is the subset where feature A takes value v. The model was trained with max_depth=10 and balanced class weights.
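
The information-gain criterion can be verified by hand on a toy label set (labels 0/1/2 follow the class encoding used in this study):

```python
import math

def entropy(labels):
    """Shannon entropy of a label list in bits."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def information_gain(parent, subsets):
    """Entropy reduction achieved by splitting parent into subsets."""
    n = len(parent)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(parent) - weighted

parent = [0, 0, 1, 1, 2, 2]
# A candidate split that isolates class 2 perfectly:
gain = information_gain(parent, [[0, 0, 1, 1], [2, 2]])
```

The tree greedily picks, at each node, the feature and threshold whose split yields the largest such gain.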

5.3 Random Forest

Random Forest aggregates predictions from k independently trained Decision Trees, each built on a random bootstrap sample with a random feature subset:

H(x) = majority vote { h_1(x), h_2(x), …, h_k(x) }

Where h_i(x) is the prediction of the ith tree. GridSearchCV with 5-fold stratified cross-validation was applied with weighted F1-score as the optimization metric. Search space: n_estimators in {100, 200}, max_depth in {None, 10}, min_samples_split in {2, 5}.

5.4 XGBoost

XGBoost builds trees sequentially, with each tree correcting the residual errors of the previous iteration. The objective function at step t is:

Obj(t) = Σ_i l(y_i, ŷ_i^(t−1) + f_t(x_i)) + Ω(f_t)

Where l is the loss function, ŷ_i^(t−1) is the prediction from the previous iteration, f_t is the new tree being added, and Ω(f_t) is the regularization term controlling model complexity. GridSearchCV with 5-fold CV was applied. Search space: n_estimators in {100, 200}, max_depth in {3, 5}, learning_rate in {0.05, 0.1}, subsample in {0.8, 1.0}.

5.5 Explainable Boosting Machine

EBM is a glass-box model that learns the contribution of each feature independently through cyclic gradient boosting, then adds selected pairwise interaction terms:

g(E[y]) = β_0 + Σ_i f_i(x_i) + Σ_(i,j) f_ij(x_i, x_j)

Where f_i(x_i) is the learned contribution of feature i and f_ij(x_i, x_j) captures the interaction between features i and j. Unlike XGBoost and Random Forest, EBM produces a visual graph of each feature's contribution across its value range, making every prediction directly inspectable without any external explanation tool. Default parameters were used.

5.6 Hybrid Stacking Ensemble

A stacking ensemble was constructed using XGBoost, Random Forest, and EBM as base models under a Logistic Regression meta-learner. During training, each base model generates out-of-fold predictions on the training data through 5-fold cross-validation. These predictions form a new feature matrix which is used to train the meta-learner:

Fig. 3. Hybrid stacking ensemble architecture.

During inference, the three base models each produce predictions on the test set. These are passed to the trained meta-learner which outputs the final class. This architecture allows the meta-learner to learn which base model to weight more heavily and in what regions of the feature space.
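
The same wiring can be reproduced with scikit-learn's StackingClassifier. Because XGBoost and EBM live in external packages (xgboost, interpret), Gradient Boosting and Random Forest stand in as base learners in this sketch; the out-of-fold mechanics and the Logistic Regression meta-learner are identical, and the data is synthetic:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic three-class data standing in for the clinical records.
X, y = make_classification(n_samples=300, n_classes=3, n_informative=6,
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=42)

stack = StackingClassifier(
    estimators=[("gb", GradientBoostingClassifier(random_state=42)),
                ("rf", RandomForestClassifier(random_state=42))],
    final_estimator=LogisticRegression(max_iter=2000),
    cv=5,   # base models feed out-of-fold predictions to the meta-learner
)
stack.fit(X_tr, y_tr)
test_accuracy = stack.score(X_te, y_te)
```

The cv=5 argument is what produces the out-of-fold prediction matrix described above, preventing the meta-learner from training on base-model predictions of samples those models have already seen.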

  6. Explainability

Clinical deployment of any ML model requires more than high accuracy. A physician presented with a Pre-Diabetic prediction needs to know which features drove it before acting on it. Two complementary methods were applied.

6.1 LIME

LIME was chosen over SHAP because our final model is a stacking ensemble combining three different model architectures. SHAP's efficient explainers are model-specific, meaning three different explainers would need to be reconciled to explain one ensemble prediction. LIME treats the entire ensemble as a black box and explains the final output directly with one unified method.

For each explanation, LIME samples perturbed versions of the input instance, obtains predictions from the model for each perturbation, and fits a locally weighted linear model to those input-output pairs:

explanation(x) = argmin_{g ∈ G} L(f, g, π_x) + Ω(g)

Where f is the original model, g is the interpretable local linear model, π_x is the proximity weighting around instance x, and Ω(g) is the complexity of the explanation. The linear coefficients of g are read as feature contributions for that prediction.

Local explanations were generated for three cases: a correctly classified Diabetic patient, a correctly classified Pre-Diabetic patient, and one misclassified case. Global importance was estimated by averaging absolute LIME weights across sixty randomly sampled test instances.
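
The perturb, predict, and fit loop can be sketched directly with NumPy against a toy black-box model. The kernel width, perturbation scale, and stand-in model are illustrative choices, not LIME's defaults:

```python
import numpy as np

rng = np.random.default_rng(7)

def black_box(X):
    """Stand-in for the stacking ensemble: increases in feature 0,
    decreases in feature 1."""
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.0 * X[:, 1])))

x = np.array([0.5, -0.2])                       # instance to explain
Z = x + rng.normal(scale=0.5, size=(500, 2))    # perturbed neighbors
f_Z = black_box(Z)                              # black-box predictions

# Proximity kernel pi_x: closer perturbations get larger weight.
weights = np.exp(-np.sum((Z - x) ** 2, axis=1) / 0.5)

# Weighted least squares fit of the local linear surrogate g.
A = np.hstack([Z, np.ones((len(Z), 1))])        # add intercept column
W = np.sqrt(weights)[:, None]
coef, *_ = np.linalg.lstsq(W * A, W[:, 0] * f_Z, rcond=None)
# coef[0] and coef[1] are read as the local feature contributions.
```

The recovered coefficients carry the signs of the black box's local behavior, which is exactly how LIME bar charts are interpreted.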

6.2 EBM Native Importance

EBM produces its own feature importance scores natively as part of its architecture, without requiring any external tool. These scores were extracted and ranked alongside the LIME global weights.

  7. Evaluation Metrics

All metrics were computed as weighted averages across three classes, where each class contributes proportionally to its support size in the test set. This prevents the 840-record Diabetic majority from inflating overall scores.

7.1 Accuracy

Measures the overall proportion of correct predictions. Weighted averaging is applied because raw accuracy on an imbalanced test set would be dominated by the majority class.

7.2 Precision

Measures the proportion of predicted positives that were correct. Low precision means the model raises false alarms, generating unnecessary clinical follow-up.

7.3 Recall

Measures the proportion of actual positives that were correctly identified. In medical diagnosis, low recall is the more dangerous failure: a missed diabetic patient receives no intervention.

7.4 F1-Score

The harmonic mean of precision and recall. Used as the primary optimization metric for GridSearchCV because the test set retains the original class imbalance after SMOTE was applied only to training data.
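
Support-weighted F1 can be computed by hand from per-class counts. The confusion summary below is a hypothetical example for illustration, not the paper's actual results:

```python
# Per-class counts: (true positives, predicted as class, class support).
classes = {
    "N": (150, 220, 180),
    "P": ( 20,  45,  75),
    "Y": ( 10,  35,  45),
}

total = sum(support for _, _, support in classes.values())
weighted_f1 = 0.0
for tp, pred, support in classes.values():
    precision = tp / pred                      # correct / predicted positive
    recall = tp / support                      # correct / actual positive
    f1 = 2 * precision * recall / (precision + recall)
    weighted_f1 += (support / total) * f1      # weight by class support
```

Each class's F1 contributes in proportion to its share of the test set, matching the weighted averaging used throughout the evaluation.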

7.5 ROC-AUC

ROC-AUC measures the probability that the model ranks a randomly chosen positive instance above a randomly chosen negative instance. A One-vs-Rest strategy was used to extend this to three classes, producing one ROC curve per class. An AUC of 1.0 indicates perfect class separation.

Five-fold cross-validation F1-score with mean and standard deviation was additionally reported for Logistic Regression, Decision Tree, Random Forest, and XGBoost to assess generalization stability across different training subsets. Learning curves tracking training and validation F1-score across eight training set sizes were generated for XGBoost and Random Forest.

RESULTS AND DISCUSSION

All models were evaluated on a held-out test set of 300 samples comprising 180 Non-Diabetic, 75 Pre-Diabetic, and 45 Diabetic patients. Weighted averages were applied across all metrics to account for the unequal class distribution retained in the test set.

  1. Model Performance Results

Table 2 presents the overall performance of all six classifiers.

Table 2. Performance comparison of all six classifiers on the 300-sample held-out test set.

Fig. 4. Comparative bar chart of Accuracy, F1-Score, and ROC-AUC across all six models.

The Hybrid Stacking Ensemble achieved the highest accuracy of 50.00% with a weighted F1-score of 0.4540 and ROC-AUC of 0.4918. Random Forest delivered the best individual model performance at 46.00% accuracy, a marginally higher weighted F1-score of 0.4627, and the highest cross-validation F1 mean of 0.7658 (±0.0176), indicating strong fold-to-fold consistency on the training distribution. XGBoost and EBM produced comparable mid-range results at 42.00% and 41.33% accuracy respectively. Logistic Regression and Decision Tree recorded the weakest performance at 36.67% and 32.67% accuracy, reflecting the limitations of linear boundaries and shallow tree splits in capturing multi-class clinical patterns.

  2. Per-Class Classification Results

Table 3 presents precision, recall, and F1-score broken down by class. Overall accuracy alone does not reveal how each model handles the minority classes, which are the clinically critical groups.

Table 3. Per-class precision, recall, and F1-score for all six models on the held-out test set.

Fig. 5. Confusion matrices for all six models on the 300-sample test set.

The Hybrid Ensemble achieved the highest Non-Diabetic F1 of 0.67 with a strong recall of 0.77, demonstrating its ability to correctly identify the majority class. However, Pre-Diabetic recall dropped to 0.12 and Diabetic recall to 0.07, indicating the ensemble's tendency to favour the dominant class under the current dataset distribution. Random Forest achieved the most balanced per-class performance among individual models, with Non-Diabetic F1 of 0.61 and Pre-Diabetic F1 of 0.32. EBM recorded the highest Diabetic recall at 0.33, suggesting stronger sensitivity to the minority Diabetic class than other models.

The consistently low Diabetic class F1 scores across all models reflect an underlying challenge in the dataset: the feature distributions across the three classes show minimal separation, making class boundaries difficult for any classifier to learn reliably. This is reflected in the ROC-AUC values hovering near 0.49–0.50 for most models, approaching random-chance discrimination.

  3. ROC Curve Analysis

Fig. 6. One-vs-Rest ROC curves for all six models across Non-Diabetic, Pre-Diabetic, and Diabetic classes.

One-vs-Rest ROC curves were computed for each model across all three classes. The Decision Tree achieved the highest ROC-AUC of 0.5038, with Logistic Regression at 0.4738 and the Hybrid Ensemble at 0.4918. ROC-AUC values close to 0.50 indicate limited probabilistic discrimination between classes, consistent with the per-class F1 findings. The separation between models is most pronounced on the Non-Diabetic One-vs-Rest curve, where Random Forest and the Hybrid Ensemble exhibit slightly wider margins from the diagonal baseline.

  4. Explainability Results

Two complementary explainability analyses were conducted on the stacking ensemble: LIME global importance and EBM native feature importance cross-validation.

4.1 LIME Global Feature Importance

Fig. 7. Global feature importance derived by averaging absolute LIME weights across 60 test samples. Red bars indicate engineered features. Blue bars indicate original dataset features.

LIME global importance averaged across sixty test samples identified gender_bmi_interaction (0.0403), Urea (0.0297), HDL (0.0284), BMI (0.0260), Weight (0.0259), LDL (0.0248), Creatinine (0.0241), kidney_stress_index (0.0215), HbA1c (0.0200), and metabolic_syndrome_score (0.0167) as the ten strongest predictors. Notably, gender_bmi_interaction, kidney_stress_index, and metabolic_syndrome_score are all engineered features, confirming that clinical transformations of raw values contribute meaningfully to the model’s decision-making. The kidney stress index ranking eighth highlights the importance of combined renal markers in the classification task.

4.2 EBM Native Feature Importance

Fig. 8. EBM native feature importance. Red bars indicate engineered features. Green bars indicate original dataset features.

EBM native importance scores ranked Creatinine (0.1689), Urea (0.1562), metabolic_syndrome_score (0.1360), LDL (0.1360), bmi_age_interaction (0.1353), ldl_hdl_ratio (0.1345), BMI (0.1259), HDL (0.1155), lipid_accumulation_product (0.1115), and chol_hdl_ratio (0.1092) in the top ten. Five of these ten are engineered features, validating that the clinical feature construction step adds discriminative signal beyond the raw measurements.

4.3 LIME vs EBM Cross-Validation

Both LIME and EBM consistently placed Urea, HDL, BMI, Creatinine, and kidney-related engineered features in their respective top rankings. LIME derives importance by perturbing inputs and fitting local linear models. EBM derives importance from its internal additive architecture during training. These are entirely different mechanisms. Their agreement on overlapping top features — particularly Urea, HDL, and the kidney stress index — strengthens confidence that these rankings reflect genuine predictive signal in the data rather than artifacts of either method.

Fig. 9. LIME global importance alongside EBM native feature importance. Overlapping top features confirm rankings are not artifacts of either explanation approach.

  5. Comparison with Prior Work

Table 4 compares the proposed framework against six studies from existing literature.

Table 4. Comparison of proposed framework against prior diabetes prediction studies.

Three characteristics differentiate the proposed framework from all studies in Table 4. First, every prior study performs binary classification. This framework predicts three distinct classes including Pre-Diabetic, the only reversible stage of diabetes progression. Second, no prior study pairs two independent explainability methods. LIME and EBM were applied separately and their rankings compared, providing dual-method confirmation of feature relevance. Third, this framework operates on a clinically collected dataset with twelve features including Weight, rather than simplified eight-feature benchmark datasets used by the majority of compared studies.

References

  1. Olusogo, A., Olusola, A.G., & Ibrahim, F.O. (2021). Early Diabetic Risk Prediction using Machine Learning Classification Techniques.
  2. Sinha, R., Vennela, B.S., & Babu, S. (2024). Early Diabetes Prediction using Machine Learning Algorithms. 2024 3rd International Conference on Applied Artificial Intelligence and Computing (ICAAIC), 705-708.
  3. Kumar Sahu, B., & Ghosh, N. (2022). Early Stage Prediction of Diabetes Using Machine Learning Techniques. Lecture Notes in Networks and Systems.
  4. Kumar, R., Gupta, M., Tamak, N., & Thakare, S. (2023). Predictive Modeling for Early Detection of Diabetes Using Machine Learning Approach. 2023 International Conference on Advances in Computation, Communication and Information Technology (ICAICCIT), 100-105.
  5. Rathi, B., & Madeira, F. (2023). Early Prediction of Diabetes Using Machine Learning Techniques. 2023 Global Conference on Wireless and Optical Technologies (GCWOT), 1-7.
  6. Sri Santhi, S.V., Sundar, L., S. V. Ayyappa, R., & K. Lakshmi, V. (2021). Prediction of Diabetes Using Machine Learning. i-manager's Journal on Information Technology.
  7. Kaur, H., & Kumari, V. (2021). Predictive modelling and analytics for diabetes using a machine learning approach.
  8. Hassan, M.M., Billah, M.A., Rahman, M.M., Zaman, S., Shakil, M.M., & Angon, J.H. (2021). Early Predictive Analytics in Healthcare for Diabetes Prediction Using Machine Learning Approach. 2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), 01-05.
  9. Bhat, S. S., Selvam, V., & Ansari, G. A. (2023). Predicting Lifestyle of Early Diabetes Mellitus Using Machine Learning Technique. International Journal of Computing, 22(3), 345-351.
  10. Kalla, D., Smith, N., Samaah, F., & Polimetla, K. (2022). Enhancing Early Diagnosis: Machine Learning Applications in Diabetes Prediction. Journal of Artificial Intelligence & Cloud Computing
  11. Bandhu, K. C., Litoriya, R., Rathore, A., Safdari, A., Watt, A., Vaidya, S., & Khan, M. (2023). Integrating Machine Learning for Accurate Prediction of Early Diabetes: A Novel Approach. International Journal of Cyber Behavior, Psychology and Learning
  12. Alzboon, M., Al-Batah, M., Alqaraleh, M., Abuashour, A., & Bader, A. F. (2023). Early Diagnosis of Diabetes: A Comparison of Machine Learning Methods. International Journal of Online and Biomedical Engineering (iJOE)
  13. Akmeşe, Ö. F. (2022). Diagnosing Diabetes with Machine Learning Techniques. Hittite Journal of Science and Engineering, 9(1), 09-18.
  14. Flores, L., Hernandez, R., Macatangay, L. H., Garcia, S. M. G., & Melo, J. R. (2023). Comparative Analysis in the Prediction of Early-Stage Diabetes Using Multiple Machine Learning Techniques. Indonesian Journal of Electrical Engineering and Computer Science
  15. Soni, M., & Varma, S. (2020). Diabetes Prediction using Machine Learning Techniques. International Journal of Engineering Research & Technology (IJERT), 9(09).
  16. Ahmed, N., Ahammed, R., Islam, M. M., Uddin, M. A., Akhter, A., Talukder, M. A., & Paul, B. K. (2021). Machine learning based diabetes prediction and development of smart web application. International Journal of Cognitive Computing in Engineering, 2, 229-241.
  17. Mujumdar, A., & Vaidehi, V. (2019). Diabetes Prediction Using Machine Learning Algorithms. Procedia Computer Science, 165, 292–299
  18. Sangani, M., Katakam, V., Merugu, J., Madugula, B., & Jitty, R. (2023). Diabetes Prediction Using Machine Learning. International Journal of Novel Research and Development, 8(6), e501-e509.
  19. El Massari, H., Gherabi, N., Qanouni, F., & Mhammedi, S. (2024). Diabetes Prediction Using Machine Learning with Feature Engineering and Hyperparameter Tuning. International Journal of Advanced Computer Science and Applications, 15(8), 171-178.
  20. Khanam, J. J., & Foo, S. Y. (2021). A comparison of machine learning algorithms for diabetes prediction. ICT Express, 7(4), 432-439. https://doi.org/10.1016/j.icte.2021.02.004
  21. Alzboon, M. S., Al-Batah, M. S., Alqaraleh, M., Abuashour, A., & Bader, A. F. H. (2023). Early Diagnosis of Diabetes: A Comparison of Machine Learning Methods. International Journal of Online and Biomedical Engineering (iJOE), 19(15). https://doi.org/10.3991/ijoe.v19i15.42417
  22. Talari, P., et al. (2024). Hybrid feature selection and classification technique for early prediction and severity of diabetes type 2. PLOS ONE, 19(1), e0292100. https://doi.org/10.1371/journal.pone.0292100
  23. V, P., & R, R. D. (2023). A Hybrid Model for Prediction of Diabetes Using Machine Learning Classification Algorithms and Random Projection. Preprint. https://doi.org/10.21203/rs.3.rs-3081331/v1
  24. Modak, S. K. S., & Jha, V. K. (2024). Diabetes prediction model using machine learning techniques. Multimedia Tools and Applications, 83(13), 38523-38549. https://doi.org/10.1007/s11042-023-16745-4
  25. Oliullah, K., Rasel, M. H., Islam, Md. M., Islam, Md. R., Wadud, Md. A. H., & Whaiduzzaman, Md. (2024). A stacked ensemble machine learning approach for the prediction of diabetes. Journal of Diabetes & Metabolic Disorders, 23(1), 603-617. https://doi.org/10.1007/s40200-023-01321-2

Ananya Thakur
Corresponding author

Dept of CSE, KPR Institute of Engineering and Technology

Theertha
Co-author

Dept of AI&DS, KPR Institute of Engineering and Technology

M. Saravanan
Co-author

Dept of AI&DS, KPR Institute of Engineering and Technology

M. Saravanan, Theertha, Ananya Thakur*, Predicting Diabetes Through Explainable Hybrid ML Models For Healthcare Decision Support, Int. J. Sci. R. Tech., 2026, 3 (4), 1020-1031. https://doi.org/10.5281/zenodo.19808891
