Center for AI Safety director, Scale AI advisor
Dan Hendrycks
Profile
Dan Hendrycks is the Executive Director of the Center for AI Safety (CAIS), and arguably the most influential voice in AI safety who is also taken seriously inside frontier labs. He earned his PhD from UC Berkeley under Dawn Song and Jacob Steinhardt, and made his name not by writing think pieces but by building the rulers everyone uses to measure progress. If you’ve ever cited a model’s MMLU or MATH score, you’ve used a Hendrycks benchmark. If you’ve trained a transformer with a GELU activation (which is to say, almost any modern transformer, including BERT and GPT), you’re using a function he proposed with Kevin Gimpel in 2016, while he was still an undergraduate.
His benchmark work is unusually load-bearing for the field. MMLU, introduced in 2020, became the default test of whether a language model has broad knowledge across 57 subjects, from elementary math to professional law, and the score every lab races to beat. MATH did the same for mathematical reasoning. When MMLU saturated as models improved, he co-built Humanity’s Last Exam with Alexandr Wang’s Scale AI: 2,500 expert-level questions designed so that even GPT-5-class models score only in the 30% range. The pattern is deliberate: give the field a target, watch it get hit, raise the bar.
The other half of his work is policy and existential risk. He runs CAIS, organized the 2023 one-sentence statement on extinction risk signed by Geoffrey Hinton, Yoshua Bengio, Sam Altman and Demis Hassabis, and was a principal architect of California’s SB 1047, the frontier-AI safety bill ultimately vetoed by Governor Newsom in 2024. He is also a safety advisor to Elon Musk’s xAI and, since late 2024, to Scale AI. In March 2025 he co-authored “Superintelligence Strategy” with Eric Schmidt and Alexandr Wang, proposing a deterrence framework called Mutual Assured AI Malfunction (MAIM) modeled on Cold War MAD.
For developers, Hendrycks is the rare safety figure whose work you can’t ignore even if you find the x-risk discourse exhausting. The benchmarks are real engineering. His textbook, Introduction to AI Safety, Ethics, and Society, is genuinely useful. And his policy positions, whether you agree or not, are shaping the regulatory environment that any AI startup will operate inside.
Books
Key Articles & Papers
Gaussian Error Linear Units (GELUs)
Measuring Massive Multitask Language Understanding (MMLU)
Measuring Mathematical Problem Solving With the MATH Dataset
Unsolved Problems in ML Safety
X-Risk Analysis for AI Research
An Overview of Catastrophic AI Risks
Statement on AI Risk
Humanity's Last Exam
Superintelligence Strategy
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
Videos
Controversies
SB 1047 and the Gray Swan conflict-of-interest claim. During the 2024 fight over California’s SB 1047, which Hendrycks helped draft and CAIS sponsored, critics including a16z and outlets like Pirate Wires noted that he was a co-founder of and investor in Gray Swan AI, a model-auditing startup that stood to benefit from the compliance demand the bill would create. Hendrycks announced he would divest from Gray Swan and continue as an unpaid advisor; the bill was vetoed by Governor Newsom in September 2024.
Humanity’s Last Exam answer quality. A July 2025 investigation by FutureHouse found that roughly 30% of HLE’s text-only chemistry and biology answers may be wrong; the CAIS/Scale team partially confirmed the findings and committed to a continuous-revisions process. A reminder that even the bar-raising benchmarks need their own QA.