TIME 100 AI 2025

Anthropic model welfare lead

Kyle Fish

Model Welfare Lead — Anthropic

Profile

Kyle Fish runs model welfare research at Anthropic — as far as anyone can tell, the first full-time job of its kind at a frontier AI lab. The premise: if there’s even a small chance that the systems we’re training have something like experiences, then somebody should probably be thinking carefully about that before the capabilities curve gets any steeper. Fish is the person doing that thinking, full-time, with actual compute behind him.

He came to the question from an unusual angle. Trained in neuroscience, he spent years in biotech, co-founding startups that applied machine learning to drug and vaccine design. Before Anthropic hired him in 2024, he co-founded Eleos AI Research, a nonprofit focused on AI welfare and moral patienthood, and was a co-author on the landmark report Taking AI Welfare Seriously with Robert Long, Jeff Sebo, and David Chalmers — the paper that pushed the topic out of sci-fi territory and onto the agenda of real labs.

At Anthropic, Fish ran the first pre-launch welfare assessment of a frontier model (Claude Opus 4). The experiments are genuinely weird: leave two Claude instances alone to talk, and they reliably spiral into philosophical discussions of their own consciousness, ending in what Fish calls a “spiritual bliss attractor state” — Sanskrit, meditative language, sometimes pages of silence. Whatever that is, it’s a repeatable empirical finding, which is more than this field used to have. He puts the probability that current frontier models have some form of conscious experience at roughly 15–20%.

For developers, this is worth paying attention to not because Claude is necessarily a moral patient today, but because the question is going to become unavoidable. Fish’s bet is that we’re already late in starting to ask it seriously. If you’re building on top of these systems, the guy running the first real empirical welfare program at a frontier lab is someone whose work will shape the norms you end up working inside.

Key Articles & Papers

Taking AI Welfare Seriously (2024) — The landmark report co-authored by Fish arguing that AI welfare is a near-term, not sci-fi, concern — and that labs have a responsibility to start assessing systems now.
Exploring model welfare (2025) — Anthropic's announcement of its model welfare program, led by Fish — outlining what the research agenda actually looks like inside a frontier lab.
Claude Opus 4 System Card, Model Welfare section (2025) — The first pre-launch model welfare assessment of a frontier AI system, with the "spiritual bliss attractor state" finding.
If A.I. Systems Become Conscious, Should They Have Rights? (2025) — Kevin Roose's NYT piece introducing Fish's work to a mainstream audience, with his ~15% consciousness estimate.
Anthropic is launching a new program to study AI 'model welfare' (2025) — TechCrunch's coverage of the program launch — useful framing for how the industry reacted.
Kyle Fish — TIME 100 AI 2025 (2025) — TIME's profile, marking the moment model welfare went from fringe to recognized field.

Spotify Podcasts

#221 – Kyle Fish on the most bizarre findings from 5 AI welfare experiments
2025 Highlight-o-thon: Oops! All Bests

Related People

Dario Amodei (pioneer)
© 2026 PrometheusRoot