Pioneering AI and Bayesian Analytics at UT Arlington
At The University of Texas at Arlington, a team of data scientists is pushing the boundaries of disease research by marrying artificial intelligence with principled Bayesian statistics. Their work aims to unlock rapid, interpretable insights from massive biological datasets—especially CyTOF and single-cell sequencing—so scientists can better understand how diseases start, how the immune system responds, and which treatments show the most promise.
Meet the Scientist Leading the Charge
Xinlei (Sherry) Wang, Jenkins Garrett professor of statistics and data science in UT Arlington’s Department of Mathematics, leads this ambitious program. Wang is also the founding director for research in the Division of Data Science. She recently received a four-year, $1.28 million federal grant to advance the project, titled “Statistical and Deep Generative Modeling for Enhanced CyTOF Data Interpretation and Discovery.”
CyTOF and the Single-Cell Challenge
CyTOF (cytometry by time of flight, also known as mass cytometry) is a technology that profiles thousands of individual cells in a single run, measuring dozens of proteins per cell. While this yields a treasure trove of biological information, translating it into actionable science is daunting. The UT Arlington team is building a toolkit that helps researchers move from raw, high-dimensional data to clear, interpretable insights. Their goal is a “one-stop shop” for CyTOF analysis: an integrated platform that blends statistical rigor with scalable AI techniques.
A Bayesian Framework for Transparent AI
Central to the team’s approach is a Bayesian framework that produces interpretable results. Rather than relying on a black-box model, this approach makes explicit how the data are assumed to be generated and how conclusions are drawn. For example, a model parameter might quantify increased protein expression in a disease group relative to healthy controls. This interpretability is crucial for validating discoveries and guiding experimental follow-up.
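To make the idea of an interpretable parameter concrete, here is a minimal sketch in Python. It is not the team's model. It assumes simulated expression values for a single protein, a Normal(0, prior_sd^2) prior on the shift between groups, and a normal approximation to the observed difference in means. The function name posterior_shift and all numbers are hypothetical.

```python
import math
import random
import statistics


def posterior_shift(disease, control, prior_sd=1.0):
    """Posterior for delta = mean(disease) - mean(control).

    Illustrative assumptions (not from the article): a Normal(0, prior_sd^2)
    prior on delta and a normal approximation to the sampling distribution
    of the observed difference in means.
    """
    diff = statistics.fmean(disease) - statistics.fmean(control)
    # Squared standard error of the observed difference in means.
    se2 = (statistics.variance(disease) / len(disease)
           + statistics.variance(control) / len(control))
    # Conjugate normal update: precisions add, and the posterior mean is the
    # precision-weighted average of the prior mean (0) and the observed diff.
    post_prec = 1.0 / prior_sd ** 2 + 1.0 / se2
    post_mean = (diff / se2) / post_prec
    post_sd = math.sqrt(1.0 / post_prec)
    ci95 = (post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd)
    # Posterior probability that expression is higher in the disease group.
    prob_positive = 0.5 * (1.0 + math.erf(post_mean / (post_sd * math.sqrt(2.0))))
    return {"mean": post_mean, "sd": post_sd, "ci95": ci95,
            "prob_positive": prob_positive}


# Simulated (not real) expression values for one protein in two groups.
rng = random.Random(0)
control = [rng.gauss(2.0, 0.5) for _ in range(200)]
disease = [rng.gauss(2.4, 0.5) for _ in range(200)]

result = posterior_shift(disease, control)
print(result)
```

The returned values can be read directly: the posterior mean estimates how much higher expression is in the disease group, the interval expresses the uncertainty around that estimate, and prob_positive is the posterior probability that the shift is above zero.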
Why AI Accelerates Discovery
AI accelerates the research cycle in two key ways. First, it uncovers subtle, previously hidden relationships in complex single-cell data—patterns that might be missed by traditional methods. Second, it dramatically reduces analysis time. Where non-AI workflows could take days to yield results for millions of cells, AI-enhanced Bayesian methods can produce reliable outcomes within seconds, while preserving uncertainty quantification and model transparency.
Integrating Diverse Data Types
The UT Arlington team combines CyTOF with single-cell transcriptomics (next-generation sequencing) to paint a fuller picture of cellular states. By integrating protein-level data with gene expression, researchers can identify distinct cell types, compare healthy versus diseased cells, and trace how regulatory networks respond in illness. This holistic view is essential for identifying new therapeutic targets and optimizing treatment strategies, including cancer therapies where cellular heterogeneity drives resistance and relapse.
Team, Impact, and Recognition
Wang’s team includes colleagues from UT Arlington’s Division of Data Science, as well as collaborators from mathematics and environmental sciences. Industry and academic partnerships strengthen the project’s reach, with recent milestones reflecting its growing impact. For instance, a UT Arlington doctoral graduate, now a tenure-track professor at another institution, earned the Best PhD Poster Award for presenting early results from this line of work. A related study published in Nature Communications introduced a tool named BIT (Bayesian Identification of Transcriptional Regulators from Epigenomics-Based Query Region Sets) that advances the accuracy of gene regulation research using Bayesian methods.
Open Science for Real-World Use
One of Wang’s core missions is to democratize these powerful tools. “AI is powerful, but it’s often a black box,” she notes. The team is committed to developing user-friendly, open-source software that end users can run on laptops. The vision blends statistical rigor with uncertainty quantification and scalability, delivering trustworthy results that clinicians and biologists can act on with confidence.
Looking Ahead
As the UT Arlington initiative continues, the integration of AI and Bayesian modeling with CyTOF and single-cell data promises to accelerate discoveries in cancer biology, immunology, and beyond. By turning massive, intricate datasets into transparent discoveries, the team aims to shorten the path from data to diagnosis, enabling faster, more precise treatments for patients.