LassoESM: A New AI Tool Accelerating Therapeutic Lasso Peptide Discovery

AI Meets Molecular Biology: The Rise of LassoESM

In the race to uncover new therapeutics for cancer and infectious diseases, lasso peptides stand out for their knot-like, highly stable structures and diverse biological activities. A collaboration led by researchers at the Carl R. Woese Institute for Genomic Biology has given the field a powerful new tool: LassoESM, a large language model tailored to predict the properties of lasso peptides. Published in Nature Communications, this work merges computational chemistry, machine learning, and experimental validation to streamline peptide design and screening.

What Are Lasso Peptides and Why Do They Matter?

Lasso peptides are natural products produced by bacteria. The ribosome first synthesizes them as linear amino acid chains, which dedicated biosynthetic enzymes then fold into a slipknot-like structure. This structural knot confers remarkable stability and enables a range of biological activities, including antibacterial, antiviral, and anticancer effects. The vast diversity of possible lasso peptides makes them an appealing reservoir for drug discovery, with candidates that could target receptors or serve as stable oral therapeutics.

The Challenge of Prediction and the Need for Lasso-Specific AI

Traditional protein AI tools, such as AlphaFold, struggle with lasso peptides because these molecules are not represented well in standard training data and their enzyme–peptide interactions are complex. As Diwakar Shukla (University of Illinois Urbana-Champaign) notes, the unique structure of lasso peptides makes accurate prediction difficult using generic models. The new LassoESM model was designed to capture the specialized language of lasso peptides—their amino acid sequences, three‑dimensional structures, and interactions with the surrounding environment.

How LassoESM Works: From Language to Properties

The team used a two-pronged approach. First, they mined bioinformatics data to assemble thousands of lasso peptide sequences produced by diverse microorganisms, followed by manual validation to ensure data quality. Then they employed masked language modeling—a strategy where parts of a peptide sequence are hidden and the model learns to predict them. This process helps LassoESM “learn the language” of how lasso structures are formed in nature. Once trained, LassoESM can predict properties of lasso peptides and guide experimental work, significantly reducing trial-and-error experiments.
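The masking step described above can be illustrated with a minimal sketch. It is not the team's code: the function name, the "<mask>" token, the 15% masking fraction, and the example sequence are all assumptions made for illustration, and a real model would feed the masked tokens to a neural network trained to recover the hidden residues.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
MASK = "<mask>"                       # hypothetical mask token


def mask_sequence(sequence, mask_fraction=0.15, rng=None):
    """Hide a random subset of residues in a peptide sequence.

    Returns (tokens, targets): tokens is the sequence as a list with some
    residues replaced by MASK, and targets maps each masked position to
    the residue a model would be trained to predict there.
    The 15% default is an assumed value, not one taken from the article.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    if any(residue not in AMINO_ACIDS for residue in sequence):
        raise ValueError("sequence contains a non-standard residue")

    rng = rng or random.Random()
    tokens = list(sequence)
    # Always mask at least one residue so every sequence yields a target.
    n_mask = max(1, round(len(tokens) * mask_fraction))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))

    targets = {i: tokens[i] for i in positions}
    for i in positions:
        tokens[i] = MASK
    return tokens, targets


# Arbitrary illustrative string, not a claim about any real lasso peptide.
example = "GSQLVYREWVGHSNVIKPGP"
masked, answers = mask_sequence(example, rng=random.Random(0))
print(" ".join(masked))
print(answers)
```

During training, the model's guesses at the masked positions would be compared with the stored targets, so that over many sequences it picks up which residues tend to appear in which contexts.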

Applications: From Substrate Scope to Therapeutic Design

One key application is predicting which lasso cyclase enzymes can form a given lasso peptide. Lasso cyclases are the enzymes that tie the peptide chain into its knot, so matching the right cyclase with a target sequence expands the set of feasible lasso peptides. By forecasting substrate scope using only sequence data, LassoESM opens the door to engineering lasso cyclases and converting a wider array of peptides into functional lasso molecules. This predictive capability is especially valuable for accelerating the rational design of therapeutics and industrial biocatalysts.

Impact and Future Directions

According to the researchers, LassoESM made accurate predictions for a range of lasso peptide properties even with limited training data. The model offers an AI-driven platform to speed up the discovery and optimization of lasso peptides for biomedical and industrial applications. Looking ahead, the team plans to expand LassoESM to accommodate additional prediction tasks and to develop tailor-made language models for other peptide natural products. They also envision engineering lasso peptides to target specific disease-relevant proteins, expanding the therapeutic landscape.

Collaborative Efforts and Acknowledgments

Dr. Diwakar Shukla and colleagues from the University of Illinois Urbana-Champaign and Vanderbilt Institute for Chemical Biology led the computational and experimental work, with additional support from the MMG theme at the Carl R. Woese Institute for Genomic Biology. As one co-leader emphasized, the convergence of computational power, interdisciplinary collaboration, and experimental validation is accelerating the pace at which novel therapeutics can move from concept to clinic.

With LassoESM, the discovery of stable, effective lasso peptides may become more efficient than ever, bringing new options for targeting cancer and infectious diseases while expanding the toolkit of nature-inspired therapeutics.