Objectives: Reliable, accessible, noninvasive self-assessment screening for prediabetes/diabetes is lacking, leading to missed opportunities for early intervention. We aimed to develop and externally validate a machine learning (ML)-derived self-assessment system to predict the likelihood of prevalent prediabetes or diabetes using easily accessible health parameters.
Study Design and Setting: We analyzed 30 years (1988-2018) of National Health and Nutrition Examination Survey (NHANES) data (N = 17,458). ML models predicting prediabetes/diabetes risk (composite outcome: fasting plasma glucose >= 100 mg/dL or HbA1c >= 5.7% [>= 39 mmol/mol]) were developed using multimodal data. The Boruta algorithm identified key predictors. Multiple ML models were compared; the best performer (neural network) formed the Machineborne Early Diabetic Warning And Control System (MEDWACS). A final set of seven easily accessible parameters suitable for self-assessment was selected. Performance was assessed via the area under the receiver operating characteristic curve (ROCAUC) and calibration. External validation used NHANES 2021-2023 (N = 3043) and Korea NHANES 2023 (N = 5492).
Results: The final seven-parameter MEDWACS model included age, waist circumference, systolic blood pressure, gender, upper leg length, arm circumference, and body mass index. Internally, MEDWACS achieved ROCAUC 0.804 (95% confidence interval, 0.792-0.816) with robust subpopulation performance. External validation confirmed strong performance and generalizability (ROCAUCs: US 0.773 [0.756-0.790], Korea 0.780 [0.768-0.792]) and good calibration. Interpretability analysis identified key drivers. Decision curve analysis showed that MEDWACS had superior clinical utility compared with the established screening guidelines. An online tool was developed to facilitate home-based self-assessment and clinical use.
Conclusion: MEDWACS provides a validated, noninvasive ML risk stratification tool using seven accessible parameters to identify individuals likely to have prevalent prediabetes or diabetes. It can aid in prompting timely clinical evaluations, potentially reducing the public health burden. (c) 2026 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://
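The composite outcome stated in the abstract (fasting plasma glucose >= 100 mg/dL or HbA1c >= 5.7%) can be illustrated with a minimal sketch. The function name `composite_outcome` and the handling of missing measurements are assumptions made for illustration only; the abstract does not describe how the authors coded the outcome or treated missing laboratory values.

```python
from typing import Optional

# Thresholds taken from the abstract's composite outcome definition.
FPG_THRESHOLD_MG_DL = 100.0      # fasting plasma glucose, mg/dL
HBA1C_THRESHOLD_PERCENT = 5.7    # HbA1c, percent


def composite_outcome(
    fpg_mg_dl: Optional[float],
    hba1c_percent: Optional[float],
) -> Optional[bool]:
    """Label prevalent prediabetes/diabetes from two laboratory values.

    Returns True when either available measurement meets its threshold.
    Assumption (not from the source): if both values are missing the label
    is None, and a single missing value is simply ignored.
    """
    if fpg_mg_dl is None and hba1c_percent is None:
        return None
    fpg_positive = fpg_mg_dl is not None and fpg_mg_dl >= FPG_THRESHOLD_MG_DL
    hba1c_positive = (
        hba1c_percent is not None and hba1c_percent >= HBA1C_THRESHOLD_PERCENT
    )
    return fpg_positive or hba1c_positive
```

Because the outcome is an "or" of the two criteria, a participant with normal fasting glucose but HbA1c at or above 5.7% is still labeled positive.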
Enhancing prediabetes and diabetes detection through a machine learning-enabled self-assessment approach / Yoo, D., Maggiore, U., Jolliet, O. - In: JOURNAL OF CLINICAL EPIDEMIOLOGY. - ISSN 0895-4356. - 195:(2026). [10.1016/j.jclinepi.2026.112266]
Enhancing prediabetes and diabetes detection through a machine learning-enabled self-assessment approach
Maggiore U.;
2026-01-01


