Categories: Forensic Anthropology

Sex estimation from the sternum in Turkish population using machine learning and deep neural networks

Introduction

In forensic anthropology, building a biological profile is essential for identifying unknown human remains. Sex estimation is a core component of that profile and informs subsequent analyses of age, stature, and ancestry, since many methods for age and stature estimation are sex-specific. The sternum, the flat bone at the center of the anterior chest wall, has historically received less attention for sex estimation than the skull or pelvis. Recent advances in machine learning and deep neural networks offer new opportunities to extract subtle, population-specific patterns from sternal morphology. This article discusses how sex estimation from the sternum in the Turkish population can benefit from both traditional and modern computational approaches.

Why focus on the sternum and the Turkish population?

The sternum shows sexual dimorphism that varies across populations, influenced by genetic heritage and environmental factors. In Turkey, which sits at the crossroads of Europe and Asia, population-specific patterns may differ from those reported for other groups. Using sternal measurements and imaging data from individuals of known sex, researchers can train models to recognize subtle differences between male and female sterna while accounting for regional variation. Understanding these patterns improves the accuracy and reliability of the forensic biological profile for individuals of Turkish origin.

Data sources and feature types

Studies typically derive data from sternal radiographs, CT scans, or dry bone measurements collected from Turkish individuals of known sex. Features may include linear measurements (manubrium length, sternal body length, sternal body width), area and perimeter descriptors, and shape metrics derived from 3D reconstructions. In some designs, pixel-level or voxel-level image data are used directly, so that deep learning models can learn feature representations beyond handcrafted measurements. Careful data curation, including consistent landmark placement and an analysis of intraobserver and interobserver measurement error, is essential to minimize measurement bias.
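One common way to quantify interobserver error for a linear measurement is the technical error of measurement (TEM). The following is a minimal sketch, assuming two observers have each measured the same specimens once; the function names and the measurement values are hypothetical and are not taken from any study discussed here.

```python
import math


def technical_error_of_measurement(obs_a, obs_b):
    """Absolute TEM for two observers measuring the same specimens.

    TEM = sqrt(sum(d_i^2) / (2 * n)), where d_i is the difference between
    the two observers' values for specimen i.
    """
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("need two equally long, non-empty measurement lists")
    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(obs_a, obs_b))
    return math.sqrt(sum_sq_diff / (2 * len(obs_a)))


def relative_tem(obs_a, obs_b):
    """TEM expressed as a percentage of the grand mean of all measurements."""
    grand_mean = (sum(obs_a) + sum(obs_b)) / (2 * len(obs_a))
    return 100.0 * technical_error_of_measurement(obs_a, obs_b) / grand_mean


# Hypothetical manubrium lengths in millimetres (illustrative values only).
observer_1 = [50.0, 48.0, 52.0, 47.0]
observer_2 = [51.0, 48.0, 51.0, 49.0]

tem = technical_error_of_measurement(observer_1, observer_2)
rel = relative_tem(observer_1, observer_2)
print(f"TEM = {tem:.3f} mm, relative TEM = {rel:.2f}%")
```

Measurements with a high relative TEM are candidates for a revised landmark definition or for exclusion from the feature set before any model is trained.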

Machine learning approaches

Traditional machine learning

Classic algorithms such as logistic regression, support vector machines, random forests, and gradient boosting are well-suited to smaller datasets. They work effectively with carefully selected sternal features, and the simpler ones, such as logistic regression, yield directly interpretable models. Feature engineering, normalization, and cross-validation are critical for robust performance. Normalization parameters should be estimated on the training folds only, so that information from held-out specimens does not leak into the model and inflate the estimate of how well it generalizes to Turkish subpopulations beyond the training set.
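The normalization and cross-validation steps can be illustrated with a small, dependency-free sketch. It assumes two hypothetical features (manubrium length and sternal body length) and uses synthetic values drawn from made-up distributions, not real Turkish data. The logistic regression is a plain gradient-descent implementation written for illustration, and the folds are shuffled but not stratified.

```python
import math
import random


def zscore_fit(rows):
    """Column means and standard deviations, estimated on training rows only."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def zscore_apply(rows, means, stds):
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def sigmoid(z):
    # Split by sign to avoid overflow in exp().
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Batch gradient descent on the mean log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict(X, w, b):
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]


def cross_validate(X, y, k=5, seed=0):
    """Shuffled k-fold accuracy; scaling is refit inside every fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    accuracies = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        means, stds = zscore_fit([X[i] for i in train])
        X_train = zscore_apply([X[i] for i in train], means, stds)
        X_test = zscore_apply([X[i] for i in fold], means, stds)
        w, b = fit_logistic(X_train, [y[i] for i in train])
        preds = predict(X_test, w, b)
        accuracies.append(sum(p == y[i] for p, i in zip(preds, fold)) / len(fold))
    return accuracies


# Synthetic, illustrative data only: [manubrium length, body length] in mm.
rng = random.Random(42)
X = [[rng.gauss(52, 3), rng.gauss(105, 6)] for _ in range(60)]   # label 1 = male
X += [[rng.gauss(47, 3), rng.gauss(90, 6)] for _ in range(60)]   # label 0 = female
y = [1] * 60 + [0] * 60

fold_accuracies = cross_validate(X, y, k=5)
print("mean CV accuracy:", sum(fold_accuracies) / len(fold_accuracies))
```

Refitting the scaler inside each fold is the design choice that matters here: computing the means and standard deviations once on the full dataset would let the held-out specimens influence training.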

Ensemble methods

Ensemble approaches combine several base models to improve accuracy and stability. For sternum-based sex estimation, an ensemble may combine linear models with nonlinear learners so that the errors of individual models partly cancel, which tends to reduce variance and improve performance on unseen data. Gradient boosting implementations such as XGBoost and LightGBM, which are themselves ensembles of decision trees, are popular choices in this space.
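As one simple illustration of combining base models, the sketch below uses soft voting, that is, a weighted average of the class probabilities produced by each model. This is a different technique from gradient boosting and is shown only because it is easy to follow; the model names and probability values are hypothetical.

```python
def soft_vote(prob_lists, weights=None):
    """Weighted average of per-specimen P(male) across base models."""
    if weights is None:
        weights = [1.0] * len(prob_lists)
    n = len(prob_lists[0])
    if any(len(p) != n for p in prob_lists) or len(weights) != len(prob_lists):
        raise ValueError("every model must score the same specimens")
    total = sum(weights)
    return [sum(w * p[i] for w, p in zip(weights, prob_lists)) / total
            for i in range(n)]


def to_labels(probs, threshold=0.5):
    return ["male" if p >= threshold else "female" for p in probs]


# Hypothetical P(male) from three base models for the same three specimens.
linear_model = [0.9, 0.4, 0.2]
tree_model = [0.7, 0.7, 0.1]
boosted_model = [0.8, 0.6, 0.3]

combined = soft_vote([linear_model, tree_model, boosted_model])
labels = to_labels(combined)
print(labels)
```

In this example the second specimen is classified as female by the linear model alone but as male by the combined vote, which shows how disagreement between base models is resolved.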

Deep neural networks

Deep learning offers two complementary paths. First, convolutional neural networks (CNNs) can analyze sternal images or 2D projections to capture intricate morphological cues. Second, transfer learning from networks pretrained on large image datasets can boost performance when sternum-specific data are limited. When 3D data are available, 3D CNNs or graph-based networks can model spatial relationships more naturally. Deep models generally require larger datasets than traditional classifiers, along with careful augmentation, to prevent overfitting.
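A minimal sketch of image augmentation is given below, assuming a grayscale image stored as a list of pixel rows. It applies a random left-right mirror and a small translation with zero padding. Whether mirroring is appropriate depends on the imaging protocol and on how much anatomical asymmetry matters for the task, so the choice of transforms here is an assumption for illustration, as are the function names.

```python
import random


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def shift(image, dx, dy, fill=0):
    """Translate by dx columns and dy rows; vacated pixels take `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            nr, nc = r + dy, c + dx
            if 0 <= nr < h and 0 <= nc < w:
                out[nr][nc] = image[r][c]
    return out


def augment(image, rng, max_shift=1):
    """Return one randomly mirrored and shifted copy of the image."""
    out = flip_horizontal(image) if rng.random() < 0.5 else [row[:] for row in image]
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return shift(out, dx, dy)


# Tiny hypothetical "image" used only to show the transforms.
image = [[1, 2, 3],
         [4, 5, 6]]
rng = random.Random(0)
augmented = [augment(image, rng) for _ in range(4)]
```

Augmented copies should be generated from training images only; applying them to validation or test images would distort the performance estimate.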

Model evaluation and interpretation

Performance is typically assessed using accuracy, area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, and specificity, with cross-validation to estimate generalizability to Turkish specimens not included in the training set. Because sensitivity and specificity depend on which sex is treated as the positive class, that choice should be stated explicitly. Beyond accuracy, researchers aim to interpret model decisions through feature importance analyses, saliency maps, or SHAP (SHapley Additive exPlanations) values. Interpretable models help forensic practitioners understand which sternal traits drive sex estimation and how these drivers might vary across Turkish populations.
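The metrics named above can be computed directly from true labels and model scores. The sketch below assumes label 1 (male) is the positive class and uses hypothetical scores; the AUC is computed as the proportion of positive-negative pairs in which the positive specimen receives the higher score, with ties counted as one half.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, fp, tn) with `positive` as the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fn, fp, tn


def summary_metrics(y_true, y_pred, positive=1):
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def roc_auc(y_true, scores, positive=1):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical known sexes (1 = male, 0 = female) and model scores P(male).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = summary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Unlike accuracy, sensitivity, and specificity, the AUC does not depend on the 0.5 decision threshold, which is why the two kinds of measure are usually reported together.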

Implications, limitations, and ethical considerations

While machine learning and deep learning hold promise, limitations include small sample sizes, potential sampling bias, and the lack of standardized measurement protocols. Population-specific models may not generalize to other groups, which underscores the importance of transparent reporting and validation on diverse, independent datasets. Ethical considerations include respectful handling of human remains, data privacy for donor information, and clear communication of uncertainty in forensic reports.

Future directions

Future work could expand multi-center Turkish datasets, explore multi-trait integration (combining sternal features with pelvic and cranial features), and evaluate cross-population transferability. Hybrid models that fuse handcrafted sternal measurements with deep learning features may yield higher accuracy while retaining some interpretability. As data resources grow, researchers can refine population-specific benchmarks and contribute to more accurate forensic identifications.