New AI Frontier: Spatial-Temporal Reasoning Trained on Gaming Clips
General Intuition, a new frontier AI research lab spun out from Medal’s video game clip platform, aims to teach agents spatial-temporal reasoning by leveraging a vast trove of gaming videos. The startup has raised $133.7 million in seed funding from top investors, signaling a growing belief that observing how objects move through space and time in video games can yield robust, generalizable AI capabilities. At stake is a foundation-model approach that prioritizes spatial awareness as a core competency for future intelligent systems.
Why Video Game Footage Could Be a Goldmine for AI
Medal operates a platform where users upload and share video game clips. Every year, Medal collects around 2 billion videos from roughly 10 million monthly active users across tens of thousands of games. General Intuition’s leadership argues that this data moat is uniquely suited to teach agents how environments evolve, how objects interact, and how human players parse space over time. “When you play video games, you essentially transfer your perception, usually through a first-person view of the camera, to different environments,” says Pim de Witte, CEO of Medal and General Intuition. He adds that gamers tend to post both extreme successes and failures, yielding valuable edge cases for training AI models that must generalize well to real-world tasks.
The approach centers on spatial-temporal understanding rather than static image recognition. The idea is to train agents that can infer physics, navigate cluttered spaces, and anticipate the consequences of actions as viewed through a human-like camera perspective. This data-rich, behaviorally diverse corpus lets researchers train agents that predict what happens next in a scene, a key capability for robotic control, drone navigation, and complex simulations.
From Gaming Worlds to Real-World Robots and Drones
General Intuition plans to apply its trained agents to both entertainment and practical scenarios. In games, the bots and non-player characters (NPCs) could deliver more varied, adaptive experiences beyond deterministic behavior. In real-world contexts, the team envisions agents that can operate in robotics, autonomous vehicles, and search-and-rescue drones, domains where manual data collection is dangerous, expensive, or impractical. The company emphasizes that its models learn through visual input alone, with agents acting through controller-like inputs that mirror how a human would navigate an environment. This alignment with human perception may ease transfer to physical systems that are typically controlled via hand-held devices or remote controllers.
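General Intuition has not published its training pipeline, but the interface the article describes, video frames in, controller-style actions out, can be illustrated with the simplest possible behavior-cloning baseline: a toy policy that returns the action a human took on the most similar recorded frame. Everything here (the `Frame` and `Action` types, the `act` function, the demo data) is hypothetical and for illustration only.

```python
# Toy sketch of a frames-in, controller-actions-out policy (assumption:
# nearest-neighbour behaviour cloning; not General Intuition's actual method).
from typing import List, Tuple

Frame = List[float]   # flattened grayscale pixels in [0, 1]
Action = str          # controller-style action label

def act(frame: Frame, demos: List[Tuple[Frame, Action]]) -> Action:
    """Return the action a human took on the most similar recorded frame."""
    def dist(a: Frame, b: Frame) -> float:
        # Squared Euclidean distance over raw pixels.
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(demos, key=lambda demo: dist(frame, demo[0]))[1]

# Hypothetical demonstrations: two 4-pixel "clips" paired with the human input.
demos = [([0.0, 0.0, 1.0, 1.0], "move_forward"),
         ([1.0, 1.0, 0.0, 0.0], "turn_left")]
print(act([0.1, 0.0, 0.9, 1.0], demos))  # → move_forward
```

The point of the sketch is the interface, not the method: a real system would replace the pixel-distance lookup with a learned model, but the agent would still consume camera-like frames and emit the same controller-shaped actions a human would.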
Two Milestones on the Horizon
General Intuition has outlined two near-term goals. First, it aims to generate new simulated worlds to train and stress-test agents, ensuring the models encounter a broad spectrum of spatial scenarios. Second, the company intends to advance autonomous navigation in entirely unfamiliar physical environments, pushing beyond the boundaries of training data. These steps are designed to demonstrate robust generalization, a critical hurdle in the race toward artificial general intelligence (AGI).
A Distinctive Path in the World of AI Models
Unlike peers that focus primarily on building and selling comprehensive world models, General Intuition is pursuing applications that sidestep copyright concerns while still delivering powerful agent behavior. Although the data comes from players and developers, the team argues that “the goal is not to produce models that compete with game developers.” Instead, the focus is on enabling smarter bots and NPCs that adapt to a range of difficulties and play styles, thereby improving engagement and retention in gaming contexts and expanding utility in real-world robotics and navigation tasks.
Moritz Baier-Lentz, a founding member and partner at Lightspeed Venture Partners, stresses the practical payoff: a scalable, adaptable bot capable of maintaining a fair challenge in diverse situations. “It’s not compelling to create a god bot that beats everyone, but if you can scale gradually and fill in liquidity for any player situation so that their win rate is always around 50%, that will maximize engagement and retention.”
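The 50% win-rate target Baier-Lentz describes is, at its core, a feedback-control problem: observe the player's win rate and nudge the bot's skill until the two balance. The sketch below is a minimal illustration under assumed names and a made-up player model; it is not how General Intuition's bots actually work.

```python
# Toy difficulty controller (assumption: simple proportional feedback;
# General Intuition's matchmaking logic is not public).
TARGET = 0.5  # the ~50% win rate cited as maximizing engagement

def update_bot_skill(skill: float, win_rate: float, gain: float = 0.5) -> float:
    """Raise bot skill when the player wins too often, lower it otherwise."""
    error = win_rate - TARGET
    return min(1.0, max(0.0, skill + gain * error))

def observed_win_rate(skill: float) -> float:
    # Hypothetical player model: a strong player wins 70% against the weakest
    # bot, and each unit of bot skill removes up to 40 points of win rate.
    return 0.7 - 0.4 * skill

skill = 0.0
for _ in range(100):  # repeated matches: measure win rate, adjust the bot
    skill = update_bot_skill(skill, observed_win_rate(skill))

print(round(skill, 2))                      # → 0.5
print(round(observed_win_rate(skill), 2))   # → 0.5, the target win rate
```

Under this toy model the loop converges to the skill level where the player's win rate settles at the 50% target; a production system would estimate win rate from noisy match outcomes rather than a closed-form model.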
Founding Vision and AGI Implications
CEO Pim de Witte brings humanitarian experience to the venture, shaping a mission that includes search-and-rescue drones capable of navigating unfamiliar spaces where GPS may fail. The company positions spatial-temporal reasoning as a critical ingredient in the broader pursuit of AGI, arguing that language models alone cannot capture the intuitive sense of space and motion that humans rely on daily. “As humans, we create text to describe what’s going on in our world, but in doing so, you lose a lot of information,” de Witte notes. The hope is that grounding AI agents in visual, action-based data will complement language-driven systems and accelerate a more generalizable form of intelligence.
Funding and Forward Momentum
The seed round, totaling $133.7 million, was led by Khosla Ventures and General Catalyst with participation from Raine. The sizable investment underscores investor confidence in the potential of spatial-temporal training data drawn from gaming videos and the team’s ability to translate that into practical agents for both virtual and real environments.
As General Intuition begins to scale its research and engineering teams, industry watchers will be looking to see whether spatial-temporal grounding in gaming footage can deliver durable advantages in both simulated worlds and the physical one, marking a meaningful step toward AGI that can reason about space, time, and action in a human-like way.