Liang Wenfeng
DeepSeek CEO, Chinese open-source AI competitor
Profile
Liang Wenfeng is the founder and CEO of DeepSeek, the Chinese AI lab that turned the global AI industry upside down in January 2025. A quiet, unusually technical founder by Chinese tech standards, he is the rare CEO who still writes code, reads papers, and hires researchers with the same instincts as Sam Altman or Dario Amodei — except he does it from Hangzhou, on sanctioned hardware, and gives the weights away.
Born in 1985 in a small town in Guangdong, Liang studied electronic information engineering at Zhejiang University. In 2015 he co-founded High-Flyer, a quantitative hedge fund that used machine learning to trade Chinese equities. High-Flyer made him wealthy and, more importantly, gave him a reason to stockpile thousands of NVIDIA A100 GPUs before US export controls locked China out of the best silicon. In 2023 he spun that GPU hoard into DeepSeek, an AI research lab with a culture modeled openly on early OpenAI — young researchers, no KPIs, no product pressure, just training runs.
The payoff came fast. DeepSeek-V2, released in mid-2024, introduced Multi-head Latent Attention (MLA) and a sparse Mixture-of-Experts architecture that slashed inference costs. DeepSeek-V3 landed in December 2024, trained for a reported $5.6M in GPU time: about 2.788M H800 GPU-hours at roughly $2 per hour, per the technical report. Then came DeepSeek-R1 in January 2025, an open-weight reasoning model that matched OpenAI's o1 on math and coding benchmarks, released under an MIT license with a full technical paper. Its release wiped roughly $600B off NVIDIA's market cap in a single day and forced every Western lab to explain, publicly, why their models cost so much more.
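Much of the inference saving traces back to how MLA shrinks the KV cache: instead of storing full per-head keys and values for every past token, the model caches one small latent vector per token and re-expands it at attention time. Below is a minimal, illustrative PyTorch sketch of that compression idea; the dimensions are invented, and the decoupled rotary embeddings and causal masking of the real architecture are omitted, so treat it as a conceptual sketch rather than DeepSeek's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentKVAttention(nn.Module):
    """Toy MLA-style attention: cache a compressed latent, not full K/V."""

    def __init__(self, d_model=1024, n_heads=8, d_latent=128):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.w_q = nn.Linear(d_model, d_model, bias=False)
        self.w_down_kv = nn.Linear(d_model, d_latent, bias=False)  # compress to latent
        self.w_up_k = nn.Linear(d_latent, d_model, bias=False)     # expand latent -> K
        self.w_up_v = nn.Linear(d_latent, d_model, bias=False)     # expand latent -> V
        self.w_o = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x, kv_cache=None):
        b, t, d = x.shape
        # Only this latent is cached: 128 floats per token here, versus
        # 2 * 1024 floats per token for a standard per-head K/V cache (16x smaller).
        latent = self.w_down_kv(x)
        if kv_cache is not None:
            latent = torch.cat([kv_cache, latent], dim=1)
        q = self.w_q(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k = self.w_up_k(latent).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        v = self.w_up_v(latent).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v)  # causal mask omitted for brevity
        return self.w_o(out.transpose(1, 2).reshape(b, t, d)), latent
```

At decode time the caller passes the returned latent back in as kv_cache, so memory per generated token grows with d_latent rather than with n_heads * d_head.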
For developers learning AI, Liang’s work is the clearest demonstration that the frontier is not owned. R1 and its distillations run locally on consumer hardware, power cheap API endpoints, and underpin a growing chunk of the open-source stack. Whether you see him as a serious rival to the US labs or a strategic asset of the Chinese state — and the honest answer is probably both — DeepSeek’s weights are on Hugging Face, and they work.
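As a concrete starting point, here is a sketch of running the smallest of the MIT-licensed R1 distills locally with the Hugging Face transformers library. The model ID is real; device placement and sampling settings are left at library defaults, and the larger distills swap in by changing the ID.

```python
# Sketch: run a small DeepSeek-R1 distill locally via Hugging Face transformers.
# Assumes `pip install transformers accelerate torch` and a few GB of free VRAM
# (or patience on CPU).
from transformers import pipeline

generate = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    device_map="auto",   # GPU if available, otherwise CPU
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "Prove that the square root of 2 is irrational."}]
result = generate(messages, max_new_tokens=1024)
# R1-family models emit their chain of thought in <think> tags before the answer.
print(result[0]["generated_text"][-1]["content"])
```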
Key Articles & Papers
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
- DeepSeek-V3 Technical Report
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
- An Interview With The Chinese CEO Behind DeepSeek
- DeepSeek FAQ
- DeepSeek-Coder: When the Large Language Model Meets Programming
Controversies
- Training data provenance. OpenAI and Microsoft have publicly suggested that DeepSeek distilled outputs from GPT-4 to bootstrap its models (see the sketch after this list for what that technique involves). DeepSeek has not directly addressed the claim, and no hard evidence has been published. Ironic, given OpenAI's own training-data history.
- National security and CCP ties. Multiple Western governments (US, Italy, Taiwan, Australia, South Korea) have restricted or banned the DeepSeek app on government devices over data residency and censorship concerns. The hosted models refuse questions about Tiananmen, Xi Jinping, and Taiwan — the open weights, running locally, do not.
- Export-control workarounds. Reporting suggests that High-Flyer's A100 stockpile, amassed before the October 2022 US export controls, plus later purchases of the export-compliant H800, is how DeepSeek trained frontier models despite sanctions. This has reopened the debate over whether US chip controls actually work.
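For context on the first allegation: "distilling" a model through its API just means sampling a teacher model's answers and using them as supervised fine-tuning data for a student. The following is a generic sketch of that technique, purely illustrative and not evidence of anything DeepSeek did; the prompt list and output file are placeholders.

```python
# Generic output-distillation sketch: collect teacher completions as SFT data.
import json
from openai import OpenAI

client = OpenAI()  # teacher API; any OpenAI-compatible endpoint works

prompts = ["Explain backpropagation in two sentences."]  # in practice, thousands

with open("distill_sft.jsonl", "w") as f:
    for p in prompts:
        reply = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": p}],
        )
        # Each (prompt, teacher answer) pair becomes one fine-tuning example
        # for the smaller student model.
        f.write(json.dumps({
            "messages": [
                {"role": "user", "content": p},
                {"role": "assistant", "content": reply.choices[0].message.content},
            ]
        }) + "\n")
```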