PrometheusRoot
Sebastian Raschka
builder
Educator, Researcher, Author
education, pytorch, llm-from-scratch, books


Founder and LLM research engineer at RAIR Lab

Sebastian Raschka

Founder, Principal AI & LLM Research Engineer at RAIR Lab
LLM Research Engineer at Lightning AI
Assistant Professor (Statistics) at University of Wisconsin-Madison

Profile

Sebastian Raschka is the rare ML educator who actually makes you build the thing. His book Build a Large Language Model (From Scratch) walks you through coding a GPT-style transformer in PyTorch — tokenizer, attention heads, pretraining loop, fine-tuning — without pulling in Hugging Face or any other LLM library. The accompanying GitHub repo is one of the clearest teaching artifacts on the internet for anyone who wants to stop treating LLMs as magic boxes.

He holds a PhD, has worked as a Staff Research Engineer at Lightning AI, and was previously an assistant professor of statistics at the University of Wisconsin–Madison. That dual background shows in his work: he explains the math without hiding behind it, and he writes code that runs instead of hand-waving past the hard parts. Before the LLM book he co-authored the bestselling Machine Learning with PyTorch and Scikit-Learn, which for years has been the default “teach yourself ML properly” recommendation.

His newsletter Ahead of AI, read by 150k+ subscribers, is where he breaks down LLM research — LoRA, DPO, positional embeddings, KV caching, reasoning models — with annotated diagrams and minimum-viable code. No hype cycles, no AGI takes. Just “here’s the paper, here’s what’s actually new, here’s what it looks like in 40 lines of PyTorch.”

For a father-son duo learning AI together, Raschka is the bridge. If university gives you the theory and decades of engineering give you the instincts, Raschka’s from-scratch approach is where those meet: code-first, math-honest, no shortcuts.

Books

Build a Large Language Model (From Scratch)
Sebastian Raschka, 2024
Code a GPT-style LLM in PyTorch from tokenizer to instruction fine-tuning, with no LLM libraries. The definitive hands-on book on how transformers actually work.
A practical guide to constructing a GPT-style language model from scratch using Python and PyTorch, covering transformer architectures, attention mechanisms, dataset preparation, pretraining, fine-tuning, and instruction following.
Publisher: Manning Publications
Pages: 368
ISBN: 9781633437166
Published: 2024

Machine Learning with PyTorch and Scikit-Learn: Develop Machine Learning and Deep Learning Models with Python
Sebastian Raschka, Yuxi (Hayden) Liu, Vahid Mirjalili (foreword by Dmytro Dzhulgakov), 2022
A comprehensive tour of classical ML and deep learning in Python, co-authored with Yuxi Liu and Vahid Mirjalili. Still one of the best self-study ML textbooks.
Publisher: Packt Publishing, Limited
Pages: 771
ISBN: 9781801819312
Published: 2022

Machine Learning Q and AI
Sebastian Raschka, 2024
Thirty essential questions and answers covering the modern ML and AI topics that interviews and real work actually hit: attention, LoRA, evaluation, and more.
Publisher: No Starch Press, Incorporated
ISBN: 9781718503762
Published: 2024

Key Articles & Papers

LLMs from Scratch (GitHub), 2024. The companion repo to the book: step-by-step PyTorch notebooks for building a ChatGPT-like model, widely used as a standalone curriculum.
Understanding and Coding the Self-Attention Mechanism, 2023. One of the clearest written walkthroughs of self-attention anywhere, from scaled dot-product to multi-head, with code you can run.
Practical Tips for Finetuning LLMs Using LoRA, 2023. Hard-won empirical results on LoRA rank, alpha, and target modules, the kind of tuning advice you only get from running hundreds of experiments.
Understanding Reasoning LLMs, 2025. A methodical breakdown of how reasoning models like DeepSeek-R1 and o1 are actually trained, separating the four main approaches to building them.
Understanding Large Language Models, 2023. A curated reading list of the papers that define modern LLMs, annotated with what each one contributed. A great map if you're catching up on the field.
New LLM Pre-training and Post-training Paradigms, 2024. Side-by-side comparison of how Qwen, Llama, Gemma, and others are actually trained, with recipes, data mixes, and RLHF variants in one place.
Ahead of AI Newsletter, 2023. Regular research summaries and technical essays on LLMs. The most consistent signal-to-noise newsletter for practitioners keeping up with the field.
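
For a flavor of what the self-attention walkthrough listed above covers, here is a minimal pure-Python sketch of single-head scaled dot-product self-attention. It is not taken from Raschka's article or book: the helper names (softmax, matmul, transpose, self_attention) and the toy inputs are assumptions made for illustration, and the sketch leaves out the causal mask and multi-head split that a GPT-style model would add.

```python
import math


def softmax(row):
    # Subtract the max before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    # Plain nested-list matrix product: (n x d) times (d x m) gives (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (no mask).

    x is a list of token embeddings; w_q, w_k, w_v are hypothetical
    projection matrices standing in for learned weights.
    """
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    # scores[i][j] is the dot product of query i with key j.
    scores = matmul(q, transpose(k))
    # Scale by sqrt(d_k), then normalize each row into attention weights.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    # Each output row is a weighted average of the value vectors.
    return matmul(weights, v), weights


if __name__ == "__main__":
    # Toy input: three tokens with two-dimensional embeddings.
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    context, attn = self_attention(tokens, identity, identity, identity)
    for row in attn:
        print([round(w, 3) for w in row])
```

Each row of the printed matrix sums to 1 and shows how strongly one token attends to every token in the sequence. In a trained model the three projection matrices are learned; identity matrices are used here only to keep the arithmetic easy to follow.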


Spotify Podcasts

LLM Architecture in 2026: What You Need to Know with Sebastian Raschka
Vanishing Gradients, 2026

AI Trends 2026: OpenClaw Agents, Reasoning LLMs, and More with Sebastian Raschka - #762
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence), 2026

[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead) | Nathan Lambert & Sebastian Raschka
Latent Space: The AI Engineer Podcast, 2026

State of LLMs 2026: RLVR, GRPO, Inference Scaling — Sebastian Raschka
The MAD Podcast with Matt Turck, 2026

Machine Learning Q and AI by Sebastian Raschka
오후 7시, 2025

Build a Large Language Model (From Scratch) - Sebastian Raschka
Your Ears Deserve A Treat, Audiobook Can't Be Beat With Audiobooks, 2024

Build LLMs From Scratch with Sebastian Raschka #52
AI Stories, 2024

Interviewing Sebastian Raschka on the state of open LLMs, Llama 3.1, and AI education
Interconnects, 2024

#197 Sebastian Raschka | Transformers - Deep learning Research - Open Source
The Ryan Dsouza Podcast, 2023

Episode 7: 30 min with data scientist Sebastian Raschka
Data Science at Home, 2016

Related People

Andrej Karpathy (pioneer)
Chip Huyen (builder)
© 2026 PrometheusRoot