AI Meets Biochemistry: LassoESM Joins the Hunt for Therapeutic Peptides
Peptides with a knot-like lasso structure are drawing attention in the fight against cancer and infectious diseases. Their slip-knot topology gives these natural products remarkable stability and a broad range of biological activities, making them promising drug candidates. A collaborative team from the Carl R. Woese Institute for Genomic Biology has developed LassoESM, a specialized large language model designed to predict the properties of lasso peptides and accelerate the pace of discovery.
Why Lasso Peptides Are Special
Produced by bacteria, lasso peptides are generated when ribosomes assemble amino acids into a peptide chain that is then folded by biosynthetic enzymes into a threaded, knot-like structure. This biosynthetic choreography allows thousands of possible variants, many of which exhibit antibacterial, antiviral, and anticancer activities. Their stability and diverse bioactivities make them attractive targets for drug discovery, but their unusual architecture has also posed challenges for conventional AI tools trained on more typical protein structures.
From Data Scarcity to Tailored Intelligence
“Predicting lasso peptide properties has been challenging due to the scarcity of experimentally labeled data and the complexity of enzyme–peptide interactions,” notes Xuenan Mi, a key contributor to the study. To address this gap, the team created LassoESM, a protein language model tailored specifically for lasso peptides. By focusing on peptide-specific features often missed by generic models, LassoESM aims to make predictions that translate into actionable design decisions.
Project co-leader Diwakar Shukla explains, “We trained LassoESM using masked language modeling to teach the model the language of how lasso structures form in nature. This enables efficient property prediction once the model has learned the underlying rules of lasso knot formation.”
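To make the idea of masked language modeling concrete, here is a minimal sketch of the data-preparation step: a fraction of residues is hidden, and the model is trained to recover them from the surrounding sequence. This is an illustration only, not the team's training code. The function name `mask_sequence`, the 15% masking rate (a common convention for this kind of model), and the example sequence are all assumptions.

```python
import random

MASK = "<mask>"  # placeholder token; the actual token name is an assumption


def mask_sequence(seq, mask_rate=0.15, seed=0):
    """Hide a fraction of residues and record what the model must recover.

    Returns (masked_tokens, targets), where targets maps each masked
    position to the original residue at that position.
    """
    rng = random.Random(seed)
    tokens = list(seq)
    # Always mask at least one residue so short peptides still yield a task.
    n_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    targets = {i: tokens[i] for i in positions}
    for i in positions:
        tokens[i] = MASK
    return tokens, targets


# Illustrative sequence only; not taken from the study's dataset.
peptide = "GGAGHVPEYFVGIGTPISFYG"
masked, targets = mask_sequence(peptide)
print(" ".join(masked))
print(targets)
```

During training, the model sees only the masked tokens and is scored on how well it predicts the hidden residues, which pushes it to learn which residues tend to occur together in lasso peptides.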
Marrying Computation with Experimentation
The study blends bioinformatics with hands-on experimentation. Douglas Mitchell’s group first identified thousands of potential lasso peptide sequences across microbial species and then validated promising candidates. The computational insights from LassoESM were used to predict which lasso cyclases, the enzymes that form the ring locking the threaded knot in place, could work with a given peptide sequence. This pairing of substrate and enzyme is crucial; a mismatch can prevent knot formation and negate therapeutic potential.
“If we can understand the substrate scope or engineer lasso cyclases, then we can potentially make any peptide into a lasso,” Shukla notes. The ability to forecast enzyme–substrate compatibility with only sequence data marks a significant advance, reducing the trial-and-error traditionally required in peptide engineering.
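As a rough illustration of what predicting compatibility from sequence data alone involves, the sketch below turns a peptide–cyclase pair into a single numeric feature vector that a standard classifier could be trained on. It is not the study's method: simple amino acid composition stands in for learned language-model embeddings, and the function names and both sequences are hypothetical placeholders.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each standard amino acid in seq (a stand-in for an embedding)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return [counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS]


def pair_features(peptide, cyclase):
    """Concatenate peptide and cyclase features into one vector for the pair."""
    return composition(peptide) + composition(cyclase)


# Hypothetical placeholder sequences, not real study data.
features = pair_features("GGAGHVPEYFVGIGTPISFYG", "MKVLDERTASPQ")
print(len(features))
```

With labeled examples of pairs that do and do not yield a lasso, a vector like this becomes the input to a classifier. A learned embedding would take the place of `composition` in a real pipeline.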
What LassoESM Delivers Today
Mi highlights that LassoESM enables accurate prediction of multiple lasso peptide properties, even with limited training data. This AI-driven tool accelerates the rational design process for functional lasso peptides, supporting both biomedical and industrial applications. In practical terms, researchers can prioritize the most promising peptide–cyclase pairs, optimize sequences for stability and bioactivity, and streamline experimental validation, which could cut months from the development timeline.
Looking Ahead: Expanding the Toolkit
Beyond its immediate capabilities, the team envisions expanding LassoESM to accommodate new prediction tasks and to tailor language models for other peptide natural products. The researchers also aim to engineer lasso peptides to target specific proteins, potentially opening new avenues for oral therapeutics and targeted therapies. The project underscores the power of integrating dedicated AI tools with wet-lab experimentation to accelerate drug discovery.
Collaborative Spirit and Resources
Shukla emphasizes the role of cross-disciplinary collaboration and computational resources at the Carl R. Woese Institute for Genomic Biology. The study, recently published in Nature Communications, reflects a joint effort between computational scientists and experimentalists, merging advanced AI with rigorous biochemistry to unlock new drug discovery pathways.
Why This Matters for Patients and Industry
As researchers refine LassoESM and extend its capabilities, the potential for robust, stable therapeutics grows. Lasso peptides’ resistance to degradation, coupled with targeted design, could lead to new treatments with favorable pharmacokinetics and administration options. The convergence of machine learning and peptide chemistry thus holds promise for faster development cycles and more versatile drug candidates in oncology, virology, and beyond.