Revealing a new AI tool for lasso peptides
In the race to develop novel therapeutics for cancer and infectious diseases, lasso peptides have emerged as a promising class of molecules. Their knot-like structures confer high stability and diverse biological activities, making them attractive candidates for future drugs. A collaboration led by the Carl R. Woese Institute for Genomic Biology has produced LassoESM, a dedicated large language model (LLM) designed to predict the properties of lasso peptides. The study, published in Nature Communications, demonstrates how tailor-made AI can help unlock the clinical potential of these natural products.
What makes lasso peptides special?
Lasso peptides are natural products produced by bacteria. They are synthesized by ribosomes as linear chains of amino acids and are then folded by biosynthetic enzymes into a slip knot-like structure. This unique topology grants enhanced stability and a range of biological activities, including antibacterial, antiviral, and anticancer effects. The diversity of lasso peptides arises from thousands of possible sequences generated during biosynthesis, offering a rich landscape for drug discovery.
The need for a lasso-specific AI model
Standard protein language models and other general-purpose AI tools struggle with lasso peptides, whose structure is distinctive and for which labeled data are limited. As Diwakar Shukla notes, predicting lasso peptide properties is challenging because experimental data are scarce and enzyme–peptide interactions are complex. LassoESM was developed to fill this gap by focusing on the peculiar language of lasso peptides: their sequences, their structures, and their interactions with the cyclases that form the characteristic knot.
How LassoESM works
Researchers first mined bioinformatics data to uncover thousands of lasso peptide sequences, then manually validated new discoveries to ensure data quality. Using a masked language modeling approach, the team trained LassoESM to learn the “language” of lasso peptides by hiding parts of each peptide sequence and having the model predict the missing sections. This pretraining creates a foundation that captures substrate-specific features often missed by generic models, enabling accurate downstream predictions with limited data.
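To make the masked language modeling idea concrete, here is a minimal sketch of the data-preparation step: some residues in a peptide sequence are hidden, and the hidden residues become the targets a model would be trained to recover. Everything in it is an assumption made for illustration and is not taken from the study: the function name mask_sequence, the "<mask>" token, the 15% masking rate (a common convention for this kind of training), and the example sequence, which is a made-up string rather than a real lasso peptide.

```python
import random

MASK = "<mask>"  # hypothetical mask token, chosen for illustration


def mask_sequence(seq, mask_rate=0.15, rng=None):
    """Hide a fraction of residues in a peptide sequence.

    Returns (tokens, labels):
      tokens - the sequence as a list, with masked residues replaced by MASK
      labels - the original residue at each masked position, None elsewhere
    The 15% default is an assumed, conventional rate, not the study's setting.
    """
    rng = rng or random.Random()
    # Always mask at least one residue so every sequence yields a target.
    n_mask = max(1, round(len(seq) * mask_rate))
    positions = set(rng.sample(range(len(seq)), n_mask))
    tokens = [MASK if i in positions else aa for i, aa in enumerate(seq)]
    labels = [aa if i in positions else None for i, aa in enumerate(seq)]
    return tokens, labels


# Made-up 20-residue sequence, used only to exercise the function.
example = "GSQLVYREWVGHSNVIKPGP"
tokens, labels = mask_sequence(example, rng=random.Random(0))
print("".join(t if t != MASK else "_" for t in tokens))
print([(i, aa) for i, aa in enumerate(labels) if aa is not None])
```

During pretraining, a model would receive the masked tokens and be scored only on how well it predicts the residues stored in labels. Repeating this over thousands of sequences is how a model of this kind picks up the sequence patterns described above.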
Practical predictions and early successes
With LassoESM, the team could predict which lasso cyclases are compatible with a given peptide, a capability that is crucial for expanding the repertoire of tunable lasso peptides. If a peptide can be paired with a suitable cyclase, it becomes possible to craft new knots and tailor their properties for therapeutic use. This approach accelerates the rational design of lasso peptides, reducing trial-and-error experiments and shortening development timelines.
Impact across research and medicine
Mi and Shukla emphasize that LassoESM makes robust predictions even when training data are scarce. The work pairs computational predictions with experimental validation, showcasing a powerful AI-driven workflow for discovering functional lasso peptides. The implications extend beyond basic science: stable, orally deliverable peptides could transform treatments for cancer, infectious diseases, and other conditions where durability and specificity matter.
Future directions
Looking ahead, the researchers aim to extend LassoESM with new prediction capabilities and to build similar language models for other peptide natural products. They also hope to engineer lasso cyclases further, so that more peptides can be converted into lasso peptides with desired properties. Harnessing campus computing resources and interdisciplinary collaboration, the team envisions a broader AI-assisted platform for peptide therapeutics.
Acknowledgments
The study reflects the joint efforts of researchers at the University of Illinois Urbana-Champaign and collaborators at Vanderbilt University. The integration of computational modeling with experimental validation underscores a growing trend in biomedicine: using AI to guide laboratory discovery and accelerate the journey from sequence to therapeutic candidate.