Introduction
In forensic anthropology, establishing a biological profile from skeletal remains is a cornerstone of identifying unknown individuals. While the pelvis and skull remain the primary sources for sex estimation, the sternum offers a complementary set of features that can be informative when remains are fragmented or degraded. Recent advances in machine learning (ML) and deep neural networks provide new ways to quantify sternal morphology and to exploit subtle, non-linear patterns that may differ across populations. This article reviews how classical ML methods and deep learning approaches can improve sex estimation from sternal measurements and imaging data in Turkish populations.
Turkish Population and Sternum Sex Estimation
Sexual dimorphism in the sternum has been documented in various populations, but the degree and pattern of these differences can be population-specific. Turkish samples, with their unique genetic and environmental history, may display sternum traits that respond differently to aging and wear. Population-specific models that incorporate multiple sternum metrics—lengths, widths, angular indices, and 3D coordinates from imaging—are therefore important to improve accuracy and reduce bias in forensic assessments.
Data and Features
A cross-sectional dataset of adult sterna from individuals of Turkish descent provides the foundation for modeling. Measurements typically include linear dimensions such as sternal length (manubrium to xiphoid), width across the manubrium, and sternal body height, as well as curvature descriptors and angular indices. When imaging data are available, CT or radiographic slices can yield coordinate-based features (landmarks) that capture 3D sternal shape. Combining traditional morphometrics with imaging-derived features creates a rich feature space for ML models to exploit.
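To make the idea of a combined feature space concrete, the following is a minimal sketch of how one individual's linear measurements and optional 3D landmarks might be merged into a single feature vector. The class name, field names, and the centroid-centering step are illustrative assumptions, not a protocol taken from any particular study.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Landmark = Tuple[float, float, float]  # (x, y, z) in mm; assumed convention


@dataclass
class SternumRecord:
    """One individual's measurements. All field names are hypothetical."""
    sternal_length_mm: float      # manubrium to xiphoid
    manubrium_width_mm: float
    body_height_mm: float
    landmarks: Sequence[Landmark] = ()  # optional imaging-derived 3D points


def centre_landmarks(landmarks: Sequence[Landmark]) -> List[float]:
    """Shift landmarks so their centroid is the origin, then flatten them.

    This removes position in the scanner frame only; it does not correct
    for rotation or overall size.
    """
    if not landmarks:
        return []
    n = len(landmarks)
    centroid = [sum(p[i] for p in landmarks) / n for i in range(3)]
    return [p[i] - centroid[i] for p in landmarks for i in range(3)]


def feature_vector(rec: SternumRecord) -> List[float]:
    """Concatenate linear measurements with centred landmark coordinates."""
    linear = [rec.sternal_length_mm, rec.manubrium_width_mm, rec.body_height_mm]
    return linear + centre_landmarks(rec.landmarks)
```

With two landmarks, `feature_vector` returns three linear values followed by six coordinates. In practice every record would need the same number of landmarks, in the same order, for the vectors to be comparable.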
Modeling Approaches
Researchers apply a spectrum of methods, from traditional ML classifiers to deep neural networks, to predict sex from sternum data:
Classical machine learning
Baseline models such as logistic regression establish a straightforward benchmark for binary sex classification. More flexible algorithms—random forests, gradient boosting (e.g., XGBoost), and support vector machines—handle non-linear relationships and interactions among sternum measurements. Feature engineering, including ratios and indices that normalize size effects, often improves performance and generalizability.
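As an illustration of the baseline and of size-normalizing ratios, the sketch below appends two ratio features and fits a logistic regression by plain gradient descent using NumPy. The data are synthetic, with invented means and spread, and the class codes 0 and 1 are arbitrary. Nothing here reflects measured Turkish values or the method of any specific study.

```python
import numpy as np


def make_synthetic(n_per_class: int = 100, seed: int = 0):
    """Synthetic stand-in data; the means and spread are invented, not measured."""
    rng = np.random.default_rng(seed)
    # Columns: sternal length, manubrium width, sternal body height (mm).
    group0 = rng.normal([140.0, 50.0, 90.0], 5.0, size=(n_per_class, 3))
    group1 = rng.normal([155.0, 55.0, 100.0], 5.0, size=(n_per_class, 3))
    X = np.vstack([group0, group1])
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return X, y


def add_ratio_features(X: np.ndarray) -> np.ndarray:
    """Append width/length and body/length ratios to reduce overall size effects."""
    length, width, body = X[:, 0], X[:, 1], X[:, 2]
    ratios = np.column_stack([width / length, body / length])
    return np.hstack([X, ratios])


def fit_logistic(X: np.ndarray, y: np.ndarray, lr: float = 0.1, n_iter: int = 2000):
    """Batch gradient descent on the logistic loss using standardised features."""
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    Z = (X - mu) / sigma
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        grad = p - y  # derivative of the loss with respect to the logit
        w -= lr * (Z.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b, mu, sigma


def predict_proba(X: np.ndarray, w, b, mu, sigma) -> np.ndarray:
    """Probability of class 1 for each row of X."""
    return 1.0 / (1.0 + np.exp(-(((X - mu) / sigma) @ w + b)))
```

A typical call sequence would be `X, y = make_synthetic()`, then `Xr = add_ratio_features(X)`, then `fit_logistic(Xr, y)`. Accuracy measured on the same rows used for fitting is optimistic, so a real analysis would score held-out data instead.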
Deep learning
Deep neural networks can leverage high-dimensional data without extensive manual feature engineering. Multilayer perceptrons (MLPs) operate on flattened, hand-crafted feature sets, while convolutional neural networks (CNNs) learn hierarchical patterns directly from imaging data. In sternum-based sex estimation, 2D CNNs may process radiographs or coronal CT slices, and 3D CNNs can analyze volumes to capture complex shape information. Data augmentation and careful regularization are crucial to prevent overfitting when sample sizes are modest.
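The role of data augmentation can be sketched with a small NumPy function that perturbs a single 2D slice before it is passed to a network. The specific transforms (left-right mirroring, a small shift, mild noise) and their default magnitudes are assumptions chosen for illustration. Whether mirroring is anatomically appropriate for a given dataset is a judgment the analyst would need to make.

```python
import numpy as np


def augment_slice(img: np.ndarray, rng: np.random.Generator,
                  max_shift: int = 4, noise_sd: float = 0.01) -> np.ndarray:
    """Return a randomly perturbed copy of a 2D image with values in [0, 1]."""
    out = img.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]  # left-right mirror
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # np.roll wraps pixels around the border, so this assumes the image
    # margins contain only background.
    out = np.roll(out, shift=(int(dy), int(dx)), axis=(0, 1))
    out = out + rng.normal(0.0, noise_sd, size=out.shape)
    return np.clip(out, 0.0, 1.0)
```

Calling `augment_slice(img, np.random.default_rng(0))` repeatedly with the same generator yields a different variant each time, which is how a modest set of scans can be stretched during training. Augmented copies of one individual should stay in the same train or test split as the original to avoid leakage.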
Evaluation and Validation
Model performance is typically assessed using cross-validation and held-out test sets, reporting metrics such as accuracy, area under the ROC curve (AUC), precision, recall, and F1 score. Comparative studies often show that ML models outperform univariate or manual morphoscopic methods, with deep learning approaches offering additional gains when imaging data are available. However, these gains depend on data quality, feature representation, and how well the training set represents the target population.
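For reference, the metrics named above can be computed from predictions with a few lines of standard-library Python. This is a minimal sketch assuming binary labels coded 0 and 1, with 1 treated as the positive class. The function names are illustrative, and AUC is computed through its pairwise-ranking interpretation rather than by tracing the ROC curve.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. The loop is quadratic in sample size, which is
    acceptable for the modest samples typical of skeletal collections.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Because sex estimation errors in either direction matter, reporting the metrics separately for each class (by swapping which label is treated as positive) is often more informative than a single accuracy figure.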
Implications for Forensic Practice
Population-specific sternum-based sex estimation models can augment traditional methods in Turkish forensic cases, especially when pelvis or skull remains are unavailable or damaged. The integration of multi-modal data (morphometrics plus imaging) within a robust ML framework supports more objective, reproducible assessments while highlighting the need for transparent validation and error estimation.
Limitations and Future Directions
Limitations include small sample sizes, potential measurement error, and the challenge of generalizing models beyond the Turkish population. Future work should explore larger, multi-center datasets, standardized imaging protocols, transfer learning to extend models to neighboring populations, and interpretability techniques to elucidate which sternal features drive model decisions.
Conclusion
Using machine learning and deep neural networks to estimate sex from the sternum in Turkish populations offers a promising path to improve accuracy and consistency in forensic anthropology. By combining traditional measurements with imaging-derived features, researchers can capture nuanced patterns of sexual dimorphism that enhance biological profiling while acknowledging population-specific considerations.