Development and Validation of an AI-Based Model for Cardiovascular Disease Prediction in Iran Using Longitudinal Data

Overview

Cardiovascular disease (CVD) remains a leading cause of mortality in Iran, underscoring the need for accurate, data-driven risk prediction tools. This study introduces an artificial intelligence (AI)-based model designed to predict CVD events from longitudinal health data. By comparing deep learning techniques with an established statistical approach, mixed-effects logistic regression, the research aims to identify significant predictive factors and provide a framework for reliable, population-specific risk assessment.

Data and Methods

The study utilizes longitudinal health records from a large Iranian cohort, capturing repeated measurements over time for a diverse set of individuals. Key features include demographic information, blood pressure readings, lipid profiles, glucose levels, lifestyle factors, and comorbidities. The longitudinal nature of the data allows the model to account for temporal changes and individual trajectories, which are critical for predicting near-term CVD risk.

Two modeling paradigms are examined:

  • Deep learning models designed to extract nonlinear patterns and temporal dependencies. Recurrent neural networks and temporal convolutional networks are evaluated for their ability to learn from sequences of measurements, with attention mechanisms introduced to highlight influential time points.
  • Mixed-effects logistic regression models that incorporate random effects to capture between-subject heterogeneity. This approach provides interpretable coefficients for risk factors and accommodates the unbalanced, partially missing follow-up common in longitudinal studies (typically under a missing-at-random assumption).
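The core idea behind each paradigm can be illustrated with a minimal sketch. It is not the study's implementation: the function names, coefficients, and scores below are hypothetical, and the mixed-effects model is simplified to a single random intercept per participant. The first function shows how a participant-specific offset shifts the predicted risk; the second shows how softmax attention turns per-visit relevance scores into weights over time points.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def random_intercept_risk(x, beta, intercept, b_i):
    """Risk at one visit under a random-intercept logistic model.

    Computes sigmoid(intercept + b_i + x . beta), where b_i is the
    participant-specific random intercept (assumed already estimated).
    """
    linear = intercept + b_i + sum(xk * bk for xk, bk in zip(x, beta))
    return sigmoid(linear)


def attention_weights(scores):
    """Softmax over per-visit relevance scores; the weights sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(values, scores):
    """Attention-weighted average of one per-visit feature."""
    weights = attention_weights(scores)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical coefficients and centered covariates, for illustration only.
beta = [0.03, 0.02]
x = [10.0, 5.0]
risk_average = random_intercept_risk(x, beta, intercept=-2.0, b_i=0.0)
risk_elevated = random_intercept_risk(x, beta, intercept=-2.0, b_i=1.0)

# Hypothetical relevance scores for three visits; the second visit dominates.
weights = attention_weights([0.1, 2.0, -1.0])
```

With identical covariates, a participant with a positive random intercept receives a higher predicted risk than the average participant, which is how the model expresses individual-level heterogeneity. In the attention sketch, the visit with the largest score receives the largest weight, so it contributes most to the pooled representation.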

Model validation follows a rigorous protocol, including cross-validation across time windows and external validation on a subset of participants not used in model training. Performance metrics include area under the receiver operating characteristic curve (AUC-ROC), calibration plots, Brier scores, and decision-analytic measures such as net benefit. Model explainability techniques, including feature importance analyses and SHAP values for the deep learning models, are used to show which factors drive predicted risk.
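For readers unfamiliar with these metrics, the following sketch computes three of them from scratch using their standard definitions. The labels and predicted probabilities are made-up toy values, not study data, and the function names are illustrative. Net benefit is computed at a single risk threshold as TP/n - (FP/n) * pt / (1 - pt), where pt is the threshold probability.

```python
def auc_roc(y_true, y_prob):
    """AUC as the probability that a random event case is ranked above a
    random non-event case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(
        (1.0 if p > q else (0.5 if p == q else 0.0))
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Toy example: four participants, two of whom had an event.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
auc = auc_roc(y, p)
brier = brier_score(y, p)
nb = net_benefit(y, p, threshold=0.3)
```

AUC measures discrimination only, while the Brier score also reflects calibration. Net benefit weighs true positives against false positives at a clinically chosen threshold, which is why it is usually reported across a range of thresholds rather than at a single value.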

Key Predictive Factors

Preliminary findings indicate that traditional cardiovascular risk factors (age, blood pressure, lipid levels, and glucose metrics) remain strong predictors in the Iranian cohort. However, longitudinal patterns, such as trajectories of blood pressure over time and cumulative exposure to hypertension, offer additional predictive value. Lifestyle indicators (smoking status, physical activity), renal function markers, and comorbidity indices also contribute meaningfully to risk stratification. The deep learning models show the ability to detect complex interactions (e.g., how rising blood pressure combined with specific lipid changes amplifies risk), while the mixed-effects approach provides transparent estimates of each factor’s contribution.
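The two longitudinal patterns named above can be turned into numeric features in several ways. The sketch below shows one possibility, not the study's feature definitions: a least-squares slope for the blood pressure trajectory, and a trapezoidal "excess area" above a threshold for cumulative exposure. The 140 mmHg systolic cutoff, the function names, and the visit data are all assumptions made for illustration.

```python
def trajectory_slope(times, values):
    """Ordinary least-squares slope of values over time (units per year)."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two visits to estimate a slope")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all visits share the same time point")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def cumulative_excess(times, values, threshold=140.0):
    """Approximate cumulative exposure above a threshold (mmHg-years).

    Applies the trapezoidal rule to max(value - threshold, 0) between
    consecutive visits. The threshold crossing point inside an interval is
    not located exactly, so this is an approximation.
    """
    excess = [max(v - threshold, 0.0) for v in values]
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (excess[i - 1] + excess[i]) * width
    return area


# Hypothetical participant: systolic pressure at four annual visits.
visit_years = [0.0, 1.0, 2.0, 3.0]
systolic = [130.0, 138.0, 146.0, 154.0]
slope = trajectory_slope(visit_years, systolic)       # mmHg per year
exposure = cumulative_excess(visit_years, systolic)   # mmHg-years above 140
```

Two participants with the same most recent reading can differ sharply on both features, which is the kind of information a single-visit risk score cannot capture.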

Clinical Implications and Utility

The AI-based model aims to support clinicians and public health officials in identifying high-risk individuals for targeted interventions. By integrating longitudinal data, the model can generate dynamic risk scores that evolve with a patient’s health trajectory, enabling timely preventive decisions such as medication optimization, lifestyle counseling, and closer monitoring. The Iranian context, including prevalent risk factor profiles and healthcare access patterns, highlights the value of locally trained predictive models for improving population health outcomes.

Limitations and Future Directions

As with any data-driven approach, limitations include potential biases in data collection, missing values, and the need for ongoing validation as population risk factors shift. Future work will explore transferability to neighboring regions with similar epidemiological profiles, incorporate additional data streams (e.g., imaging, genomics), and refine interpretability to foster clinician trust and adoption. An ongoing comparative analysis will assess whether hybrid modeling, which combines deep learning with interpretable statistical components, offers the best balance of accuracy and usability.

Conclusion

By leveraging longitudinal data and comparing modern AI techniques with robust statistical modeling, this study advances the field of CVD risk prediction in Iran. The resulting models have the potential to improve early detection of cardiovascular events, tailor preventive strategies, and ultimately reduce mortality associated with cardiovascular disease in high-risk populations.