Categories: Geotechnical engineering

A Data-Driven Framework for Predicting Rock Shear Strength Parameters with Interpretability

Introduction

Predicting rock shear strength parameters is crucial for safe and cost-effective rock engineering projects. Traditional models often struggle with the nonlinearities and complex interactions in geological data. This article presents a data-driven framework that combines machine learning with interpretability analysis to predict rock shear strength parameters more accurately and transparently. The framework addresses two common challenges: the nonlinear relationships inherent in rock mechanics data, and the random selection of hyperparameters, which can make model performance inconsistent from run to run.

Why a Data-Driven Approach?

Rock shear strength depends on many factors, including mineralogy, grain size, porosity, stress state, temperature, and loading history. A data-driven approach can learn these relationships from field and laboratory data and, given enough representative samples, can outperform conventional empirical or closed-form models. Importantly, interpretable models reveal which features drive predictions, supporting engineering judgment and risk management.

Framework Overview

The proposed framework integrates data preprocessing, model training with robust hyperparameter tuning, and interpretability analysis. It consists of four layers:

  • Data Preparation: Curating a diverse dataset of rock samples with measured shear strength parameters, standardizing features, and handling missing values.
  • Model Ensemble: Employing multiple algorithms (for example, Gradient Boosting, Random Forest, and Gaussian Process Regression) to capture nonlinear patterns while reducing overfitting.
  • Hyperparameter Tuning: Systematically exploring hyperparameters to identify high-performance configurations while mitigating randomness in selection through cross-validation and stability checks.
  • Interpretability: Applying SHAP values and feature importance analyses to explain predictions and identify dominant physical factors influencing shear strength.

Handling Hyperparameter Randomness

Random sampling in hyperparameter tuning can lead to inconsistent results across runs. The framework uses structured search strategies, such as Bayesian optimization for efficiency and nested cross-validation for reliability, to reduce this variability. Because hyperparameters are selected on inner folds and performance is scored on outer folds that played no part in the selection, the reported performance is more likely to reflect the model’s true capability than lucky hyperparameter choices.
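The nested cross-validation idea can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the data are synthetic, the model is a toy k-nearest-neighbour regressor on a single feature, and a plain grid over k stands in for Bayesian optimization. The function names (k_fold_indices, nested_cv, and so on) are hypothetical and not part of the framework.

```python
import math
import random
from statistics import mean

def k_fold_indices(n, k, seed):
    """Shuffle indices 0..n-1 with a fixed seed and split them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def split(data, fold):
    """Return (train, test) where test holds the rows whose index is in fold."""
    held = set(fold)
    train = [p for i, p in enumerate(data) if i not in held]
    test = [data[i] for i in fold]
    return train, test

def knn_predict(train, x, k):
    """Toy model: mean target of the k nearest training points (1-D feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)

def rmse(train, test, k):
    return math.sqrt(mean((knn_predict(train, x, k) - y) ** 2 for x, y in test))

def inner_cv_score(data, k, n_folds, seed):
    """Average validation RMSE of one hyperparameter value over the inner folds."""
    folds = k_fold_indices(len(data), n_folds, seed)
    return mean(rmse(*split(data, f), k) for f in folds)

def nested_cv(data, grid, outer_folds=5, inner_folds=3, seed=0):
    """Select k on inner folds only, then score it on the untouched outer fold."""
    results = []
    for fold in k_fold_indices(len(data), outer_folds, seed):
        train, test = split(data, fold)
        best_k = min(grid, key=lambda k: inner_cv_score(train, k, inner_folds, seed + 1))
        results.append((best_k, rmse(train, test, best_k)))
    return results

# Synthetic, purely illustrative data: one feature, smooth nonlinear target plus noise.
rng = random.Random(42)
data = []
for _ in range(60):
    x = rng.random()
    data.append((x, math.sin(6 * x) + rng.gauss(0, 0.1)))

grid = [1, 3, 5, 9]  # hypothetical candidate values for k
results = nested_cv(data, grid)
for best_k, score in results:
    print(f"selected k={best_k}, outer RMSE={score:.3f}")
```

A simple stability check in the same spirit is to repeat nested_cv with several seed values and compare the spread of the outer scores and of the selected hyperparameters; a large spread suggests that the reported performance depends on chance.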

Modeling Techniques and Feature Engineering

To capture the nonlinear interactions in rock mechanics data, the framework relies on tree-based ensemble methods and kernel-based regressors. Key techniques include:

  • Gradient Boosting Machines (GBM) to model complex relationships with boosting of weak learners.
  • Random Forests to reduce variance and improve robustness against outliers.
  • Gaussian Process Regression for probabilistic predictions and uncertainty quantification.
  • Feature engineering such as rock type indicators, texture indices, stress state ratios, and interaction terms between porosity and cementation.

These methods are complemented by normalization and scaling, along with careful handling of missing data through imputation strategies aligned with geological context.
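As a rough sketch of how these preprocessing steps might fit together, the example below imputes missing porosity with the median of the same rock type (one possible reading of imputation "aligned with geological context"), adds rock type indicators, a stress ratio, and a porosity-cementation interaction term, and then standardizes the numeric columns. The column names (sigma_n, ucs, and so on), the choice of stress ratio, and all values are hypothetical placeholders, not measurements.

```python
from collections import defaultdict
from statistics import mean, median, pstdev

# Placeholder rows for illustration only; None marks a missing value.
samples = [
    {"rock_type": "granite", "porosity": 0.8, "cementation": 0.9, "sigma_n": 5.0, "ucs": 150.0},
    {"rock_type": "granite", "porosity": None, "cementation": 0.8, "sigma_n": 10.0, "ucs": 160.0},
    {"rock_type": "granite", "porosity": 1.2, "cementation": 0.7, "sigma_n": 8.0, "ucs": 140.0},
    {"rock_type": "sandstone", "porosity": 12.0, "cementation": 0.5, "sigma_n": 4.0, "ucs": 60.0},
    {"rock_type": "sandstone", "porosity": None, "cementation": 0.4, "sigma_n": 6.0, "ucs": 50.0},
    {"rock_type": "sandstone", "porosity": 18.0, "cementation": 0.3, "sigma_n": 3.0, "ucs": 40.0},
]

def impute_by_group(rows, key, group_key):
    """Fill missing values of `key` with the median observed within the same group.
    Assumes every group has at least one observed value."""
    observed = defaultdict(list)
    for r in rows:
        if r[key] is not None:
            observed[r[group_key]].append(r[key])
    fill = {g: median(v) for g, v in observed.items()}
    return [dict(r, **{key: fill[r[group_key]] if r[key] is None else r[key]}) for r in rows]

def engineer(row, rock_types):
    """Add rock type indicators, a stress ratio, and an interaction term."""
    out = dict(row)
    for t in rock_types:
        out[f"is_{t}"] = 1.0 if row["rock_type"] == t else 0.0
    out["stress_ratio"] = row["sigma_n"] / row["ucs"]  # assumed definition
    out["porosity_x_cementation"] = row["porosity"] * row["cementation"]
    return out

def standardize(rows, keys):
    """Z-score the given columns. In a real pipeline the mean and standard
    deviation should come from the training folds only, to avoid leakage."""
    stats = {k: (mean([r[k] for r in rows]), pstdev([r[k] for r in rows])) for k in keys}
    return [dict(r, **{k: (r[k] - stats[k][0]) / stats[k][1] for k in keys}) for r in rows]

filled = impute_by_group(samples, "porosity", "rock_type")
rock_types = sorted({r["rock_type"] for r in filled})
features = [engineer(r, rock_types) for r in filled]
scaled = standardize(features, ["porosity", "stress_ratio", "porosity_x_cementation"])
print(scaled[1])
```

Imputing within a rock type avoids filling a granite sample with a porosity typical of sandstone, which a global median would do in this toy table.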

Interpretability and Validation

Interpretability is essential in rock engineering to convert model outputs into actionable insights. The framework emphasizes:

  • SHAP (SHapley Additive exPlanations) values to quantify each feature’s contribution to a given prediction, helping engineers understand why a rock’s shear strength is estimated as it is.
  • Global feature importance to identify which features consistently drive predictions across the dataset.
  • Partial dependence plots to visualize how changes in key features affect predicted shear strength in a physically meaningful way.
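To make the idea behind SHAP values concrete, the sketch below computes exact Shapley values by enumerating every feature coalition, replacing the features outside a coalition with a baseline value. This brute-force version is only practical for a handful of features; it is not how SHAP tooling works internally, which relies on faster model-specific or sampling-based algorithms. The toy model, its three inputs, and the zero baseline are assumptions chosen so the result can be checked by hand.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) to each feature."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take their actual value, the rest the baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(f):
    """Hypothetical predictor: one additive term plus one interaction term."""
    return 2.0 * f[0] + f[1] * f[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

For this toy model the additive feature receives its full contribution of 2, while the interaction term of 6 is shared equally between the two interacting features, 3 each. The attributions sum to the difference between the prediction and the baseline prediction, which is the additivity property that makes per-prediction explanations easy to read.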

Validation is performed with hold-out test sets and forward-looking cross-validation, in which the model is scored on data it has not seen during training or tuning, to check that the predictions generalize to new rock types and geological settings. Uncertainty estimates accompany predictions to guide risk assessment and decision-making.
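One way to test generalization to new rock types, assumed here rather than taken from the framework, is to hold out every sample of one rock type at a time, so the model is always scored on a type it never saw in training. The sketch below shows only the splitting logic; the helper name leave_one_group_out and the labels are hypothetical.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group."""
    for held_out in sorted(set(groups)):
        test = [i for i, g in enumerate(groups) if g == held_out]
        train = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train, test

# Hypothetical rock type label for each sample in a dataset.
rock_types = ["granite", "granite", "sandstone", "shale", "sandstone", "shale", "granite"]

splits = list(leave_one_group_out(rock_types))
for held_out, train, test in splits:
    print(f"hold out {held_out}: train on {train}, test on {test}")
```

Compared with a random split, this scheme usually gives a more pessimistic but more honest estimate when the goal is to predict for rock types or sites absent from the training data.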

Applications and Impact

Accurate and interpretable predictions of rock shear strength parameters enable better design of excavation supports, slope stability assessments, tunnel linings, and rockfall mitigation. The data-driven approach provides a transparent rationale for design choices, aligns with regulatory expectations, and supports ongoing knowledge discovery in geology and rock mechanics.

Conclusion

The integration of data-driven learning with interpretability analysis offers a powerful path forward for predicting rock shear strength parameters. By balancing nonlinear modeling capability with transparent explanations, the framework aims to improve accuracy, reduce the risk of over-optimistic results caused by chance hyperparameter choices, and enhance trust among engineers and stakeholders.