Enhancing prediabetes and diabetes detection through a machine learning-enabled self-assessment approach / Yoo, Daniel; Maggiore, Umberto; Jolliet, Olivier. - In: JOURNAL OF CLINICAL EPIDEMIOLOGY. - ISSN 0895-4356. - (2026).

Enhancing prediabetes and diabetes detection through a machine learning-enabled self-assessment approach

Umberto Maggiore
2026-01-01

Abstract

Objectives: Reliable, accessible, noninvasive self-assessment screening for prediabetes/diabetes is lacking, leading to missed opportunities for early intervention. We aimed to develop and externally validate a machine learning (ML)-derived self-assessment system to predict the likelihood of prevalent prediabetes or diabetes using easily accessible health parameters.

Study design and setting: We analyzed 30 years (1988-2018) of National Health and Nutrition Examination Survey (NHANES) data (N = 17,458). ML models predicting prediabetes/diabetes risk (composite outcome: fasting plasma glucose ≥100 mg/dL or HbA1c ≥5.7% [≥39 mmol/mol]) were developed using multimodal data. The Boruta algorithm identified key predictors. Multiple ML models were compared; the best performer (a neural network) formed the Machineborne Early Diabetic Warning And Control System (MEDWACS). A final set of seven easily accessible parameters suitable for self-assessment was selected. Performance was assessed via the area under the receiver operating characteristic curve (ROCAUC) and calibration. External validation used NHANES 2021-2023 (N = 3,043) and Korea NHANES 2023 (N = 5,492).

Results: The final seven-parameter MEDWACS model included age, waist circumference, systolic blood pressure, gender, upper leg length, arm circumference, and body mass index. Internally, MEDWACS achieved a ROCAUC of 0.804 (95% confidence interval, 0.792-0.816) with robust subpopulation performance. External validation confirmed strong performance and generalizability (ROCAUCs: US 0.773 [0.756-0.790], Korea 0.780 [0.768-0.792]) and good calibration. Interpretability analysis identified key drivers. Decision curve analysis showed that MEDWACS had superior clinical utility compared to the established screening guidelines. An online tool was developed to facilitate home-based self-assessment and clinical use.

Conclusion: MEDWACS provides a validated, noninvasive ML risk stratification tool that uses seven accessible parameters to identify individuals likely to have prevalent prediabetes or diabetes. It can aid in prompting timely clinical evaluations, potentially reducing the public health burden.

Plain language summary: Many people have prediabetes or type 2 diabetes and don't know it because early detection often requires clinic visits and blood tests. These can be significant barriers for some individuals. To address this, we developed a free, easy-to-use online tool called MEDWACS (Machineborne Early Diabetic Warning And Control System). Using 30 years of health data from the US National Health and Nutrition Examination Survey (NHANES), we applied machine learning methods to build a system that predicts a person's risk of having prediabetes or diabetes. The tool needs only seven simple pieces of information that can be measured at home, such as age, body mass index, waist circumference, and blood pressure, with no laboratory tests required. We confirmed that the tool is accurate using data from people in both the United States and South Korea. Our analysis showed that MEDWACS is more effective at identifying high-risk individuals than the current screening guidelines recommended by major US health organizations. MEDWACS can serve as a simple, no-cost first step for people to check their own risk. It is not a replacement for a doctor's diagnosis, but a high-risk result can empower individuals to seek timely medical evaluation. By making early risk detection more accessible, we hope that MEDWACS can help more people get diagnosed and treated sooner, potentially reducing the serious health complications of diabetes.
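
Illustrative sketch (not the authors' code): the abstract defines the composite outcome by two laboratory thresholds and evaluates clinical utility with decision curve analysis. The Python below shows one way these two pieces could be expressed. The function names, the handling of missing laboratory values, and the toy inputs are assumptions made for illustration; only the two thresholds come from the abstract. The percent-to-mmol/mol conversion uses the standard NGSP/IFCC master equation, and the net benefit formula is the standard one used in decision curve analysis.

```python
from typing import Optional, Sequence

# Thresholds stated in the abstract for the composite outcome.
FPG_THRESHOLD_MG_DL = 100.0
HBA1C_THRESHOLD_PCT = 5.7


def hba1c_percent_to_mmol_mol(hba1c_pct: float) -> float:
    """Convert HbA1c from NGSP percent to IFCC mmol/mol (master equation)."""
    return 10.93 * hba1c_pct - 23.50


def has_composite_outcome(
    fpg_mg_dl: Optional[float], hba1c_pct: Optional[float]
) -> bool:
    """Return True if either laboratory criterion is met.

    Assumption: a missing value (None) simply does not trigger its
    criterion; the study's actual missing-data handling is not described
    in the abstract.
    """
    fpg_hit = fpg_mg_dl is not None and fpg_mg_dl >= FPG_THRESHOLD_MG_DL
    hba1c_hit = hba1c_pct is not None and hba1c_pct >= HBA1C_THRESHOLD_PCT
    return fpg_hit or hba1c_hit


def net_benefit(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> float:
    """Net benefit at one threshold probability, as used in decision curves.

    NB = TP/n - (FP/n) * pt / (1 - pt), where a person is flagged as
    high risk when the predicted probability is >= pt.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    if n == 0 or n != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and equal length")
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # 5.7% corresponds to roughly 39 mmol/mol, matching the abstract.
    print(round(hba1c_percent_to_mmol_mol(HBA1C_THRESHOLD_PCT)))
    # Toy inputs, invented purely to exercise the functions.
    print(has_composite_outcome(fpg_mg_dl=99.0, hba1c_pct=5.7))
    print(net_benefit([1, 0, 1, 0], [0.9, 0.8, 0.2, 0.1], threshold=0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of threshold probabilities and comparing the model against the "screen everyone" and "screen no one" strategies, or against a guideline-based rule as the abstract describes.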

Use this identifier to cite or link to this document: https://hdl.handle.net/11381/3056314