Founder and LLM research engineer at RAIR Lab

Sebastian Raschka

Founder and Principal AI & LLM Research Engineer at RAIR Lab

LLM Research Engineer at Lightning AI

Assistant Professor (Statistics) at University of Wisconsin-Madison

Profile

Sebastian Raschka is the rare ML educator who actually makes you build the thing. His book Build a Large Language Model (From Scratch) walks you through coding a GPT-style transformer in PyTorch — tokenizer, attention heads, pretraining loop, fine-tuning — without pulling in Hugging Face or any other LLM library. The accompanying GitHub repo is one of the clearest teaching artifacts on the internet for anyone who wants to stop treating LLMs as magic boxes.

He’s a PhD statistician turned Staff Research Engineer at Lightning AI, and previously a professor at the University of Wisconsin–Madison. That dual background shows in his work: he explains the math without hiding behind it, and he writes code that runs rather than hand-waving past the hard parts. Before the LLM book he co-authored the bestselling Machine Learning with PyTorch and Scikit-Learn, which for years has been the default “teach yourself ML properly” recommendation.

His newsletter Ahead of AI, read by 150k+ subscribers, is where he breaks down LLM research — LoRA, DPO, positional embeddings, KV caching, reasoning models — with annotated diagrams and minimum-viable code. No hype cycles, no AGI takes. Just “here’s the paper, here’s what’s actually new, here’s what it looks like in 40 lines of PyTorch.”

For a father-son duo learning AI together, Raschka is the bridge. If university gives you the theory and decades of engineering give you the instincts, Raschka’s from-scratch approach is where those meet: code-first, math-honest, no shortcuts.

Books

Build a Large Language Model (From Scratch)

Code a GPT-style LLM in PyTorch from tokenizer to instruction fine-tuning, with no LLM libraries — the definitive hands-on book on how transformers actually work.

Machine Learning with PyTorch and Scikit-Learn

Develop Machine Learning and Deep Learning Models with Python

A comprehensive tour of classical ML and deep learning in Python, co-authored with Yuxi Liu and Vahid Mirjalili — still one of the best self-study ML textbooks.

Machine Learning Q and AI

Thirty essential questions and answers covering the modern ML and AI topics that interviews and real work actually hit — attention, LoRA, evaluation, and more.

Key Articles & Papers

LLMs from Scratch (GitHub) (2024)

The companion repo to the book: step-by-step PyTorch notebooks for building a ChatGPT-like model, widely used as a standalone curriculum.

Understanding and Coding the Self-Attention Mechanism (2023)

One of the clearest written walkthroughs of self-attention anywhere, from scaled dot-product to multi-head, with code you can run.

Practical Tips for Finetuning LLMs Using LoRA (2023)

Hard-won empirical results on LoRA rank, alpha, and target modules: the kind of tuning advice you only get from running hundreds of experiments.

Understanding Reasoning LLMs (2025)

A methodical breakdown of how reasoning models like DeepSeek-R1 and o1 are actually trained, separating the four main approaches to building them.

Understanding Large Language Models (2023)

A curated reading list of the papers that define modern LLMs, annotated with what each one contributed. A great map if you're catching up on the field.

New LLM Pre-training and Post-training Paradigms (2024)

Side-by-side comparison of how Qwen, Llama, Gemma, and others are actually trained, with recipes, data mixes, and RLHF variants in one place.

Ahead of AI Newsletter (2023)

Regular research summaries and technical essays on LLMs: the most consistent signal-to-noise newsletter for practitioners keeping up with the field.

Spotify Podcasts

LLM Architecture in 2026: What You Need to Know with Sebastian Raschka

Vanishing Gradients

2026

AI Trends 2026: OpenClaw Agents, Reasoning LLMs, and More with Sebastian Raschka - #762

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

2026

[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead) | Nathan Lambert & Sebastian Raschka

Latent Space: The AI Engineer Podcast

2026

State of LLMs 2026: RLVR, GRPO, Inference Scaling — Sebastian Raschka

The MAD Podcast with Matt Turck

2026

Machine Learning Q and AI by Sebastian Raschka

오후 7시 (7 PM)

2025

Build a Large Language Model (From Scratch) - Sebastian Raschka

Your Ears Deserve A Treat, Audiobook Can't Be Beat With Audiobooks

2024

Build LLMs From Scratch with Sebastian Raschka #52

AI Stories

2024

Interviewing Sebastian Raschka on the state of open LLMs, Llama 3.1, and AI education

Interconnects

2024

#197 Sebastian Raschka | Transformers - Deep learning Research - Open Source

The Ryan Dsouza Podcast

2023

Episode 7: 30 min with data scientist Sebastian Raschka

Data Science at Home

2016