Overview
This project demonstrates an applied machine learning screening model that estimates the risk of root caries using public CDC NHANES data. The model is optimized for interpretability and early risk stratification, not diagnosis.
Model Summary
- Model: Logistic Regression
- ROC-AUC: ~0.73
- Screening Threshold: 0.10 (recall-focused)
- Dataset: NHANES 2017–2020
The model is built on public CDC NHANES data and evaluated with standard screening metrics such as ROC-AUC.
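To make the ROC-AUC figure above easier to read, here is a minimal sketch of what the metric measures: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `roc_auc` and the toy labels and scores are invented for illustration; they are not NHANES data and not this project's evaluation code.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive case
    has the higher score; ties count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (1 = root caries present).
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.40, 0.05, 0.12, 0.20, 0.08, 0.30, 0.15, 0.03]

# 13 of the 15 positive/negative pairs are ranked correctly here.
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"Toy ROC-AUC: {toy_auc:.2f}")
```

Read this way, an ROC-AUC of about 0.73 means the model ranks an affected person above an unaffected one in roughly 73% of such pairs, compared with 50% for random guessing.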
Interactive Risk Screening
Enter sample values to obtain a screening risk estimate.
The first request may take a few seconds while the server initializes.
What factors influence this estimate?
- Age: Risk increases gradually with age
- Smoking: Strongly associated with root caries risk
- Missing teeth: Indicator of long-term oral health history
- Filled teeth: Reflects past dental disease and treatment
- Income-to-poverty ratio: Proxy for access to preventive care
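The sketch below shows how a logistic regression could combine the factors above into a single risk estimate. Every number in it is a placeholder: the intercept, the coefficients, their signs, and the feature names (`age_years`, `current_smoker`, and so on) are assumptions chosen for illustration, not the fitted values or variable names of this project's model.

```python
import math

# Hypothetical values for illustration only; NOT this project's fitted model.
INTERCEPT = -4.0
COEFS = {
    "age_years": 0.03,
    "current_smoker": 0.9,          # 1 = smoker, 0 = non-smoker
    "missing_teeth": 0.05,
    "filled_teeth": 0.04,
    "income_poverty_ratio": -0.15,  # assumed sign: higher income, lower risk
}


def contributions(person):
    """Each feature's additive contribution to the log-odds."""
    return {name: COEFS[name] * person[name] for name in COEFS}


def risk_estimate(person):
    """Logistic function applied to the intercept plus all contributions."""
    log_odds = INTERCEPT + sum(contributions(person).values())
    return 1.0 / (1.0 + math.exp(-log_odds))


# A made-up individual, not an NHANES record.
example = {
    "age_years": 60,
    "current_smoker": 1,
    "missing_teeth": 4,
    "filled_teeth": 6,
    "income_poverty_ratio": 2.0,
}

for name, value in contributions(example).items():
    print(f"{name:>22}: {value:+.2f} log-odds")
print(f"Estimated risk: {risk_estimate(example):.3f}")
```

Because each factor adds to the log-odds independently, the contribution of any single input can be read off directly, which is what makes this kind of model straightforward to interpret.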
Why this is a screening tool
This model is designed for early risk screening, not diagnosis. The screening threshold is intentionally set low to reduce the chance of missing individuals who may be at elevated risk.
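This tradeoff favors recall over precision: more people are flagged, including some who are not at elevated risk, so that fewer at-risk individuals are missed. That balance is appropriate for population-level health screening.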
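The effect of a low threshold can be seen in a small example. The sketch below compares recall and precision at the 0.10 screening threshold and at a higher threshold of 0.25, which is chosen arbitrarily for contrast. The helper `recall_precision` and the toy labels and scores are invented for illustration and are not drawn from NHANES.

```python
def recall_precision(labels, scores, threshold):
    """Recall and precision when every score >= threshold is flagged."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for y, f in zip(labels, flagged) if y == 1 and f)
    fp = sum(1 for y, f in zip(labels, flagged) if y == 0 and f)
    fn = sum(1 for y, f in zip(labels, flagged) if y == 1 and not f)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


# Toy data, invented for illustration only (1 = root caries present).
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.40, 0.05, 0.12, 0.20, 0.08, 0.30, 0.15, 0.03]

low = recall_precision(toy_labels, toy_scores, 0.10)   # screening threshold
high = recall_precision(toy_labels, toy_scores, 0.25)  # arbitrary higher cutoff

print(f"threshold 0.10: recall={low[0]:.2f}, precision={low[1]:.2f}")
print(f"threshold 0.25: recall={high[0]:.2f}, precision={high[1]:.2f}")
```

On this toy data the 0.10 threshold catches all three positive cases at the cost of two false positives, while the 0.25 threshold produces no false positives but misses one positive case.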
This tool is for educational and screening demonstration purposes only. It does not provide medical advice, diagnosis, or treatment recommendations. Predictions are based on population-level patterns in NHANES data.