PrometheusRoot
David Silver
builder
Researcher
deepmind, alphago, alphazero, reinforcement-learning


Led AlphaGo, AlphaZero — AI that masters games

David Silver

Principal Research Scientist — Google DeepMind

Profile

David Silver is the reinforcement learning researcher who made the world pay attention. As lead researcher on AlphaGo at Google DeepMind, he built the system that beat Lee Sedol 4-1 in 2016, a moment widely described as AI’s “Sputnik moment” for the public. Go had been the grand unsolved challenge of game AI for decades, and many experts expected superhuman play to be at least ten years away. Silver’s team did it with deep neural networks trained first on human expert games, Monte Carlo tree search, and a brutal amount of self-play.

Then he did it again, harder. AlphaGo Zero threw out human game data entirely and learned from scratch; after three days of self-play training it surpassed the version that had beaten Lee Sedol. AlphaZero generalized the method to chess and shogi, and MuZero removed the last piece of hand-coded knowledge, the rules themselves, by learning a model of the environment from observations and rewards alone (raw pixels, in the Atari case). If you want to understand the modern lineage of self-play, planning, and learned world models, this is the trajectory to study.
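To make "learning from scratch through self-play" concrete, here is a toy sketch in Python. It is not the AlphaGo Zero algorithm, which pairs a deep network with Monte Carlo tree search. It swaps in plain tabular Q-learning on single-pile Nim (take 1 to 3 stones; whoever takes the last stone wins). The game, the function names, and every hyperparameter are assumptions chosen for illustration only.

```python
import random


def train_self_play(max_pile=10, episodes=5000, alpha=0.5, epsilon=0.2, seed=0):
    """Learn single-pile Nim purely from games the agent plays against itself."""
    rng = random.Random(seed)
    # q[(pile, take)] estimates the outcome for the player to move:
    # +1 means a forced win, -1 a forced loss. No human games are used.
    q = {(p, t): 0.0 for p in range(1, max_pile + 1) for t in (1, 2, 3) if t <= p}

    def moves(pile):
        return [t for t in (1, 2, 3) if t <= pile]

    def best_value(pile):
        return max(q[(pile, t)] for t in moves(pile))

    for _ in range(episodes):
        pile = rng.randint(1, max_pile)  # random start so every position is visited
        while pile > 0:
            legal = moves(pile)
            if rng.random() < epsilon:
                take = rng.choice(legal)  # occasional exploratory move
            else:
                take = max(legal, key=lambda t: q[(pile, t)])  # current best guess
            remaining = pile - take
            # Zero-sum backup: taking the last stone wins outright; otherwise
            # my value is the negative of the opponent's best value.
            target = 1.0 if remaining == 0 else -best_value(remaining)
            q[(pile, take)] += alpha * (target - q[(pile, take)])
            pile = remaining  # the same table now plays the other side
    return q


def greedy_move(q, pile):
    """Pick the move the learned table rates highest for this pile size."""
    return max((t for t in (1, 2, 3) if t <= pile), key=lambda t: q[(pile, t)])


if __name__ == "__main__":
    q = train_self_play()
    for pile in (5, 6, 7):
        print(pile, "->", greedy_move(q, pile))
```

With these settings the greedy policy should settle on the known optimal strategy for this game, which is to leave the opponent a multiple of four stones. The same table plays both sides, so the only training signal is the win or loss at the end of each game, which is the core idea the paragraph above describes at a vastly larger scale.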

Silver did his PhD at the University of Alberta under Rich Sutton, the godfather of reinforcement learning, and he carries that torch. His 2021 paper “Reward Is Enough” (with Sutton and others) hypothesizes that reward maximization alone is sufficient to drive the emergence of intelligence, a philosophical bet on RL as the path to AGI that lands very differently in a world obsessed with next-token prediction. He’s a Principal Research Scientist at DeepMind and a professor at UCL, where his 2015 reinforcement learning lecture series remains one of the most-watched RL courses on the internet.

For developers building with AI today, Silver matters because RL is back. RLHF is part of the training recipe for essentially every frontier chatbot, and the new wave of reasoning models (OpenAI o1, Google’s Gemini thinking models, Anthropic’s extended thinking) leans heavily on reinforcement learning and echoes the self-play and search ideas that Silver’s game-playing work made famous. His work is the intellectual foundation for an increasingly large chunk of the frontier.

Key Articles & Papers

Mastering the game of Go with deep neural networks and tree search (2016): The original AlphaGo paper. It describes the system that beat European champion Fan Hui 5-0 and went on to beat Lee Sedol in March 2016, changing public perception of AI overnight.
Mastering the game of Go without human knowledge (2017): AlphaGo Zero. Discard the human games, start from random play, and beat the Lee Sedol version 100-0 after 72 hours of training.
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play (2018): AlphaZero. Same algorithm, three games, superhuman everywhere. Proof that the method generalizes.
Mastering Atari, Go, chess and shogi by planning with a learned model (2020): MuZero. The rules of the game are no longer given; the agent learns a world model and plans inside it.
Grandmaster level in StarCraft II using multi-agent reinforcement learning (2019): AlphaStar. Extending self-play to real-time strategy with imperfect information and long horizons.
Reward is enough (2021): With Rich Sutton and others. The strongest case for reward maximization as a sufficient path to general intelligence.
Deterministic Policy Gradient Algorithms (2014): DPG, the foundation for DDPG and much of modern continuous-control RL.

Spotify Podcasts

Is Human Data Enough? With David Silver
#86 – David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning

Related People

pioneer Demis Hassabis
© 2026 PrometheusRoot