Dan Hendrycks

Director, Center for AI Safety; Advisor, Scale AI; Safety Advisor, xAI

Profile

Dan Hendrycks is the Executive Director of the Center for AI Safety (CAIS), and arguably the most influential voice in AI safety who is also taken seriously inside frontier labs. He earned his PhD from UC Berkeley under Dawn Song and Jacob Steinhardt, and made his name not by writing think pieces but by building the rulers everyone uses to measure progress. If you’ve ever cited a model’s MMLU or MATH score, you’ve used a Hendrycks benchmark. If you’ve trained a transformer with a GELU activation, which covers most modern transformers including BERT and GPT, you’re using a function he proposed with Kevin Gimpel in 2016, while Hendrycks was still an undergraduate.
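For readers who have only ever called GELU through a framework, here is a minimal sketch of what the activation computes, assuming the standard definition from the GELU paper: the input multiplied by the standard normal CDF of the input, plus the tanh approximation the paper also gives. The function names `gelu` and `gelu_tanh` are illustrative, not taken from any particular library.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU, cheaper where erf is slow or unavailable."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


if __name__ == "__main__":
    # Compare the exact form and the approximation at a few sample inputs.
    for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"x={x:+.1f}  exact={gelu(x):+.4f}  tanh_approx={gelu_tanh(x):+.4f}")
```

Unlike ReLU, which cuts off hard at zero, GELU weights each input by how likely a standard normal variable is to fall below it, so small negative inputs are damped rather than zeroed out.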

His benchmark work is unusually load-bearing for the field. MMLU, introduced in 2020, became the default test of whether a language model has broad knowledge across 57 subjects, from elementary math to professional law, and the score every lab races to beat. MATH did the same for mathematical reasoning. When MMLU saturated as models got better, he co-built Humanity’s Last Exam with Alexandr Wang’s Scale AI: 2,500 expert-level questions hard enough that even GPT-5-class models score only in the 30s in percent accuracy. The pattern is deliberate: give the field a target, watch it get hit, raise the bar.

The other half of his work is policy and existential risk. He runs CAIS, organized the 2023 one-sentence statement on extinction risk signed by Geoffrey Hinton, Yoshua Bengio, Sam Altman and Demis Hassabis, and was a principal architect of California’s SB 1047 — the frontier-AI safety bill ultimately vetoed by Governor Newsom in 2024. He is also a paid safety advisor to Elon Musk’s xAI and, since late 2024, to Scale AI. In March 2025 he co-authored “Superintelligence Strategy” with Eric Schmidt and Wang, proposing a deterrence framework called Mutual Assured AI Malfunction (MAIM) modeled on Cold War MAD.

For developers, Hendrycks is the rare safety figure whose work you can’t ignore even if you find the x-risk discourse exhausting. The benchmarks are real engineering. His textbook, Introduction to AI Safety, Ethics, and Society, is genuinely useful. And his policy positions, whether you agree with them or not, are shaping the regulatory environment that any AI startup will operate inside.

Books

Introduction to AI Safety, Ethics, and Society

The first comprehensive textbook on AI safety, covering deep learning fundamentals, risk taxonomies, governance, and ethics — free online and used as the basis for CAIS's safety course.

Key Articles & Papers

Gaussian Error Linear Units (GELUs), 2016. The activation function that quietly powers most modern transformers, including BERT, GPT, and Vision Transformers.

Measuring Massive Multitask Language Understanding (MMLU), 2020. Introduced the 57-subject benchmark that became the default knowledge test for every frontier LLM.

Measuring Mathematical Problem Solving With the MATH Dataset, 2021. The competition-math benchmark that drove much of the work on chain-of-thought and reasoning models.

Unsolved Problems in ML Safety, 2021. A research agenda that organized AI safety into robustness, monitoring, alignment, and systemic safety; widely used as a starting reading list.

X-Risk Analysis for AI Research, 2022. A framework for evaluating which capabilities research contributes to existential risk and which mitigates it.

An Overview of Catastrophic AI Risks, 2023. Hendrycks's most-cited risk taxonomy: malicious use, AI race, organizational risks, and rogue AIs.

Statement on AI Risk, 2023. The one-sentence CAIS statement comparing AI extinction risk to pandemics and nuclear war, signed by hundreds of leading researchers.

Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?, 2024. A pointed critique showing that many 'safety' benchmarks correlate with general capability; a self-aware reckoning from the field's leading benchmark builder.

Humanity's Last Exam, 2025. A 2,500-question expert-level benchmark designed to outlast frontier models, built with Scale AI.

Superintelligence Strategy, 2025. Co-authored with Eric Schmidt and Alexandr Wang; proposes a Mutual Assured AI Malfunction (MAIM) deterrence regime for state-level AI competition.

Controversies

SB 1047 and the Gray Swan conflict-of-interest claim. During the 2024 fight over California’s SB 1047, which Hendrycks helped draft and CAIS sponsored, critics including a16z and outlets like Pirate Wires noted that he was a co-founder of and investor in Gray Swan AI, a model-auditing startup that stood to benefit from the compliance demand the bill would create. Hendrycks announced he would divest from Gray Swan and continue as an unpaid advisor; the bill was vetoed by Governor Newsom in September 2024.

Humanity’s Last Exam answer quality. A July 2025 investigation by FutureHouse found that roughly 30% of HLE’s text-only chemistry and biology answers may be wrong; the CAIS/Scale team partially confirmed the findings and committed to a continuous-revisions process. It is a reminder that even bar-raising benchmarks need their own QA.