Created GPT and DALL-E at OpenAI
Alec Radford
Profile
Alec Radford is, by Sam Altman’s own description, “a genius at the level of Einstein,” and the quiet engine behind most of what people now mean when they say “generative AI.” He was first author on the original GPT paper, on GPT-2, on CLIP, and on Whisper, and a core author on DALL-E. For most of the decade after he joined OpenAI in 2016 as a young researcher fresh out of Olin College, his name appeared first on the papers that defined the field. He never earned a PhD.
The thing to understand about Radford is that the GPT bet was, at the time, contrarian. NLP in 2017 was a forest of task-specific architectures and supervised datasets. Radford’s instinct, heavily shaped by his collaboration with Ilya Sutskever, was that you could throw a transformer at a giant pile of unlabeled text, train it to predict the next token, and capability would emerge from scale. The first GPT paper (Improving Language Understanding by Generative Pre-Training, 2018) is short, almost modest. It is also the seed of every frontier model that exists today. He had already made the same unsupervised bet for images in 2015 with DCGAN, one of the first GAN architectures that trained reliably.
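That recipe is short enough to write down. Below is a minimal sketch of the next-token-prediction objective in PyTorch; the model, layer sizes, and names are illustrative assumptions for this sketch, not code from the GPT papers. The point to notice is that the targets are just the inputs shifted one position left, which is why a giant pile of unlabeled text is all the supervision needed.

```python
# Minimal sketch of generative pre-training: train a causal transformer
# to predict the next token. Toy model; all names/sizes are illustrative.
import torch
import torch.nn as nn

class TinyGPT(nn.Module):
    def __init__(self, vocab_size=256, d_model=64, n_heads=4, n_layers=2, max_len=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                        # tokens: (batch, seq)
        seq_len = tokens.size(1)
        pos = torch.arange(seq_len, device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(pos)
        # Causal mask: True above the diagonal blocks attention to the
        # future, so position i can only see positions <= i.
        causal = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                       device=tokens.device), diagonal=1)
        return self.head(self.blocks(x, mask=causal))  # (batch, seq, vocab)

model = TinyGPT()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

# Random bytes stand in for "a giant pile of unlabeled text". The target
# is simply the input shifted left by one: the text labels itself.
tokens = torch.randint(0, 256, (8, 33))
inputs, targets = tokens[:, :-1], tokens[:, 1:]
logits = model(inputs)
loss = nn.functional.cross_entropy(logits.reshape(-1, logits.size(-1)),
                                   targets.reshape(-1))
loss.backward()
opt.step()
print(f"next-token loss: {loss.item():.3f}")  # ~ln(256) ≈ 5.55 at random init
```

Everything that made GPT GPT lives in that one-token shift: no labels, no task-specific head, just likelihood maximization, with downstream capability expected to emerge as the model and the text pile scale up.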
What sets Radford apart is taste. He picks the right experiment, runs it cleanly, and writes it up without hedging. CLIP was the bridge that made multimodal models tractable. Whisper was a single, almost casually released paper that flattened the speech-recognition industry. DALL-E showed that text-to-image was a real category, not a parlor trick. He did all of this without leading a team in the executive sense; he was a technical IC who just kept shipping the field forward.
In December 2024 Radford told colleagues he was leaving OpenAI to do independent research. By April 2025 he had surfaced as an advisor at Thinking Machines Lab, the startup founded by Mira Murati with much of OpenAI’s old senior bench. For developers learning AI today: Radford’s papers are the canon. Read GPT-1, then GPT-2, then CLIP, in that order. You will be reading the actual decision points at which the modern era was chosen.
Key Articles & Papers
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (DCGAN, 2015)
Improving Language Understanding by Generative Pre-Training (GPT-1, 2018)
Language Models are Unsupervised Multitask Learners (GPT-2, 2019)
Learning Transferable Visual Models From Natural Language Supervision (CLIP, 2021)
Zero-Shot Text-to-Image Generation (DALL-E, 2021)
Robust Speech Recognition via Large-Scale Weak Supervision (Whisper, 2022)