At 06:29 PM IST on July 2, 2025, the artificial intelligence (AI) community is buzzing about Dr. Fei-Fei Li’s recent talk at Y Combinator’s inaugural AI Startup School, held on June 16-17, 2025, in San Francisco. Known as the “godmother of AI” for her pioneering work with ImageNet, Dr. Li shared her insights on AI’s past, present, and future during a fireside chat with Y Combinator’s General Partner Diana Hu. This article explores her key points, focusing on the origins of computer vision and the critical role of spatial intelligence in achieving Artificial General Intelligence (AGI).
Dr. Li began by reflecting on her foundational contribution to AI: ImageNet, a massive dataset she created in 2009 that sparked the deep learning revolution. She described the early days of computer vision, when data scarcity limited progress, and how her belief in data-driven methods led to ImageNet’s success. This shift enabled machines to “see” and recognize objects, a dream she had pursued since her PhD days. Her talk highlighted how this breakthrough laid the groundwork for today’s AI models, fulfilling a lifelong goal faster than she anticipated.
Looking ahead, Dr. Li argued that AGI, meaning AI capable of human-like performance across tasks, remains incomplete without spatial intelligence. She emphasized that while language models like ChatGPT excel in one-dimensional communication, the real world is three-dimensional, governed by physics and requiring continuous interaction. As the founder of World Labs, a startup focused on this challenge, she stressed the scarcity of spatial data online and the complexity of inferring 3D environments from 2D images. This, she believes, is the next frontier for AI innovation.
Her vision extends beyond technical hurdles to practical applications. Dr. Li envisions spatial intelligence unlocking creativity and productivity: designing homes more efficiently, aiding medical imaging of the human body, and enabling robots to navigate 3D spaces. She sees this as a step toward embodied AI agents that learn through interaction, drawing parallels to how babies develop understanding. This human-centered approach aligns with her work at Stanford’s Institute for Human-Centered AI, where she advocates for AI that augments, rather than replaces, human capabilities.
The talk, attended by 2,000 AI students and researchers, has sparked enthusiasm on social platforms, with many praising her focus on spatial intelligence as a game-changer. However, some question whether the field can overcome data and computational challenges. Dr. Li’s call to action is clear: solving spatial intelligence is essential to realizing AGI’s full potential, marking a pivotal moment for AI’s evolution.
Key Points from Dr. Fei-Fei Li’s Talk
1. ImageNet’s Legacy: Created in 2009, ImageNet drove the deep learning revolution, enabling machines to “see” and transforming computer vision.
2. Spatial Intelligence Need: AGI requires mastery of 3D spatial understanding, a harder challenge than language due to data scarcity and physical complexity.
3. World Labs Focus: Her startup aims to develop AI that navigates and interacts in 3D environments, from home design to medical imaging.
4. Human-Centered AI: Dr. Li advocates for AI that augments human work, drawing inspiration from embodied learning in babies.
5. Future Vision: Solving spatial intelligence could unlock new productivity and creativity, shaping the next phase of AI development.