AI protein language model uncovers how convergent evolution arises in nature

New AI tool sheds light on convergent evolution

A team of Chinese scientists has taken a significant step in understanding how distant life forms independently develop similar abilities when faced with comparable environmental challenges. By applying a sophisticated artificial intelligence (AI) protein language model, the researchers uncovered a key mechanism that explains convergent evolution—the repeated emergence of similar traits in different evolutionary lineages. Their findings illuminate why bats and toothed whales, despite their distant relationship, evolved the remarkable sense of echolocation to navigate and hunt in their respective worlds.

What is convergent evolution and why does it matter?

Convergent evolution, sometimes described as functional convergence, occurs when distantly related species independently evolve similar features or capabilities because they occupy similar ecological niches. This pattern suggests that certain environmental pressures strongly favor specific functional solutions. In the case of echolocation, both bats and toothed whales rely on sound to detect objects, distance, and texture in darkness or murky water. Yet their lineages diverged tens of millions of years ago. The new study demonstrates that the answer to this puzzle lies not just in simple amino acid changes, but in higher-order protein features that govern how proteins fold, interact, and perform complex tasks.

The ACEP framework and the power of protein language models

The Institute of Zoology at the Chinese Academy of Sciences led the research, with a team headed by Zou Zhengting. They introduced a computational analysis framework named ACEP. The core innovation of ACEP rests on leveraging a pre-trained protein language model, a form of AI that reads and interprets the language of amino acids, the building blocks of proteins. Trained on vast databases of protein sequences, the model learns to recognize patterns that correspond to structure, function, and interaction tendencies far beyond what traditional analyses could capture.

“A protein language model can understand the deeper structural and functional characteristics and patterns behind amino acid sequences,” Zou explained. This capability enables ACEP to identify high-order features that influence how proteins contribute to biological functions, including those required for echolocation or other environmentally adapted traits.

Key findings and their implications

The team found that convergent evolution is strongly guided by high-order protein features rather than just straightforward sequence similarity. These features describe how proteins fold, assemble into complexes, and interact with cellular partners to produce robust, adaptive outcomes. The implication is profound: AI-driven models can reveal latent biological rules that explain why different organisms converge on similar solutions when facing analogous ecological demands.
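The contrast between sequence similarity and higher-order features can be illustrated with a small toy sketch. This is not the ACEP method and does not use a protein language model: as a stand-in assumption, it represents each sequence by its Kyte-Doolittle hydropathy profile, and the two short peptides and the function names are hypothetical. It only shows how two sequences can share no identical residues yet look nearly the same in a property-based representation.

```python
import math

# Kyte-Doolittle hydropathy scale (a standard published scale), used here
# only as a toy stand-in for the learned features of a protein language model.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def percent_identity(a: str, b: str) -> float:
    """Fraction of aligned positions with identical residues (equal lengths only)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def hydropathy_profile(seq: str) -> list[float]:
    """Map each residue to its hydropathy value."""
    return [KD[residue] for residue in seq]


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


# Hypothetical peptides: no position matches, but both run hydrophobic -> charged.
seq_a = "LIVKDE"
seq_b = "IVLRED"

identity = percent_identity(seq_a, seq_b)
similarity = cosine_similarity(hydropathy_profile(seq_a), hydropathy_profile(seq_b))
print(f"sequence identity: {identity:.2f}")
print(f"profile similarity: {similarity:.3f}")
```

In a real analysis the hand-made hydropathy profile would be replaced by embeddings from a pre-trained protein language model, which capture far richer structural and functional patterns than any single physicochemical scale.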

The research was published in the Proceedings of the National Academy of Sciences, signaling international recognition of this approach. The study not only advances fundamental evolutionary biology but also demonstrates AI’s potential to solve complex biological questions that have historically required lengthy comparative analyses and experimental validation.

Future directions: AI in evolutionary biology

Beyond explaining echolocation in bats and toothed whales, the ACEP framework holds promise for exploring a wide range of convergent traits in nature—from sensory systems to metabolic pathways. The researchers emphasize that AI is not a replacement for experiments but a powerful complement, guiding hypotheses and narrowing the search for functional mechanisms that would be difficult to detect through conventional methods alone.

“We hope to achieve broader and more effective application of AI technology in evolutionary biology in the future,” Zou stated. The study thus marks a milestone in marrying computational innovation with classic evolutionary questions, offering a path toward a deeper, data-driven understanding of life’s shared strategies in the face of similar environmental pressures.