Interconnects — Reading Room

Reasoning Models, RLVR, and the o1/o3 Era

8 tier-5 · 11 tier-4

The intellectual spine of the modern archive. Lambert charts the reasoning paradigm from before it had a name -- his Q* hypothesis and o1 reverse-engineering -- through the public arrival of RLVR (Reinforcement Learning with Verifiable Rewards) and the realization that o1/o3 are large-scale outcome-reward RL rather than test-time tree search. Across these pieces he argues that chain-of-thought is a natural fit for autoregressive models, that reasoning will generalize well beyond math and code, that over-optimization returns in new and weirder forms as RL scales, and -- in the landmark "How to scale RL" -- that RL now has its own scaling laws distinct from pretraining. Together they form the most coherent running account anywhere of how reasoning models are actually trained.

The Q* hypothesis: Tree-of-thoughts reasoning, process reward models, and supercharging synthetic data

TIER 5 Nov 24, 2023

The widely-cited literature-grounded hypothesis on OpenAI's leaked Q*, arguing it links Q-learning and A* search via tree-of-thoughts reasoning over language steps scored by process reward models, bringing AlphaGo-style self-play and look-ahead planning to LLMs. A landmark interpretive post that shaped how the field understood the path toward reasoning models well before o1.

Q-star · reasoning · process-reward-models · tree-of-thoughts · search

Interviewing Ross Taylor on LLM reasoning, Llama fine-tuning, Galactica, agents

TIER 4 Aug 8, 2024

A deep, full-transcript interview with Ross Taylor (Galactica lead, Llama post-training reasoning lead) on the nature of LLM reasoning, chain-of-thought vs. adaptive computation, process/outcome reward models and MCPE, why RLHF beat the SFT-only camp at Meta, and why frontier-lab advantage is brute-force execution over secret methods. It matters for the candid, expert insider perspective on reasoning research and post-training culture.

interview · LLM reasoning · Galactica · RLHF · reward models

OpenAI's Strawberry, LM self-talk, inference scaling laws, and spending more on inference

TIER 4 Sep 5, 2024

A prescient explainer framing inference-time compute as a distinct scaling law, reading OpenAI's Strawberry/Q*/Orion rumors as self-talk reasoning plus a verifier, and surveying the early test-time-compute literature (best-of-N, Large Language Monkeys, proposer-verifier). It matters as an early, well-sourced articulation of the reasoning-model paradigm that would define the next era of LLMs.

inference scaling · reasoning · test-time compute · OpenAI Strawberry · RLHF
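
As a reader's aid, the best-of-N / proposer-verifier idea surveyed in this piece can be sketched in a few lines. This is a minimal illustration only: `toy_generate` and `toy_score` are hypothetical stand-ins for a sampling model and a verifier, not anything taken from the post or from OpenAI.

```python
import random

def best_of_n(generate, score, prompt, n=8, seed=0):
    """Sample n candidates and return the one the verifier scores highest."""
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(n)]
    return max(candidates, key=score)

# Hypothetical proposer: guesses an integer near the true answer.
def toy_generate(prompt, rng):
    return rng.randint(40, 60)

# Hypothetical verifier: higher score for guesses closer to 17 * 3 = 51.
def toy_score(answer):
    return -abs(answer - 51)

best = best_of_n(toy_generate, toy_score, "What is 17 * 3?", n=16)
print(best)
```

Spending more inference compute here simply means raising `n`; the quality of the result is capped by how well `toy_score` separates good answers from bad ones.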

Reverse engineering OpenAI’s o1

TIER 5 Sep 16, 2024

The definitive early reverse-engineering of OpenAI's o1, framing it as an RL-trained reasoning system that does online/test-time search rather than plain autoregression and thereby confirms inference scaling laws. Lays out the Q*-to-Strawberry-to-o1 lineage, the generator/verifier search structure, and what an open-source o1 would require—an influential reference piece for the reasoning-model paradigm.

o1 · reasoning · inference scaling laws · reinforcement learning · test-time search

OpenAI's o1 using 'search' was a PSYOP

TIER 4 Dec 4, 2024

Lambert reverses his earlier reading of OpenAI's o1, arguing the model needs no test-time tree search or process rewards: all the 'search' lives inside large-scale outcome-reward RL, with verifiable answers and LLM-judged continuations doing the supervision, and the test-time-compute scaling curve possibly an artifact of bucketing sampled generations by token count. It is an influential early mechanistic account of reasoning-model training that aligns o1 with the Bitter Lesson and Ai2's own RLVR 'wait, let me check' behaviors. Matters as a foundational framing of how the reasoning-model era actually works.

o1 / reasoning models · RLVR · test-time compute · process rewards · post-training

Interviewing Finbarr Timbers on the 'We are So Back' Era of Reinforcement Learning

TIER 4 Dec 5, 2024

A deep technical interview with researcher Finbarr Timbers (ex-DeepMind, Midjourney) tracing the full decade of deep RL, from DQN and AlphaZero through the slowdown to RLHF and o1's revival of the field. The full transcript covers RL fundamentals, the bitter lesson, reward modeling, exploration, Tulu 3, and AI research management, making for a substantive history-and-state-of-RL discussion.

reinforcement learning · deep RL history · AlphaZero · RLHF · podcast interview

OpenAI's Reinforcement Finetuning and RL for the masses

TIER 4 Dec 11, 2024

A substantive analysis of OpenAI's Reinforcement Finetuning (RFT) API, framed through LeCun's cake metaphor, arguing it brings RL to the masses and signals that RL training stability is now solved enough to expose to the public. It explains RFT's grader configs, the contrast with SFT/LoRA finetuning, and the potential data flywheel it gives OpenAI for future reasoning models, noting Ai2's open RLVR work is closely analogous.

reinforcement finetuning · RFT API · RLVR · OpenAI · RL stability

OpenAI's o3: The grand finale of AI in 2024

TIER 5 Dec 20, 2024

The definitive analysis of OpenAI's o3 announcement, the major model event of late 2024, detailing its step-change results on ARC-AGI (87%), Frontier Math (2 to 25%), and SWE-Bench, and the price/compute axes behind them. Lambert argues o3 is most likely the same o1 RL methodology scaled up (no tree search) on a larger base, signaling that RL-driven reasoning is the next hill the industry climbs.

OpenAI o3 · ARC-AGI · reasoning models · inference scaling · model release

Quick recap on the state of reasoning

TIER 4 Jan 2, 2025

A NeurIPS talk (full transcript) addressing whether language models actually reason, arguing for a scoped, behavior-level definition of reasoning rather than an AGI debate, and disambiguating post-training, reasoning, and inference-time compute. It connects o1, RLVR, and the reinforcement-finetuning grader API into a grounded conceptual framing for reasoning research in 2025.

reasoning · inference-time compute · o1 · RLVR · podcast talk

DeepSeek R1's recipe to replicate o1 and the future of reasoning LMs

TIER 5 Jan 21, 2025

The definitive early breakdown of DeepSeek R1's four-stage RL-heavy training recipe (cold-start SFT, large-scale RL, rejection sampling, final RLHF), arguing it is the first open seminal paper that locks in reasoning-model research and ends the o1 replication uncertainty. It walks through R1-Zero, RLVR reward design, GRPO, distillation, and the open questions, making it a lasting reference for how reasoning models are actually trained.

DeepSeek R1 · reasoning models · RLVR · GRPO · o1 replication
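
For readers new to GRPO, the core trick is that each sampled completion is scored relative to the other completions for the same prompt, so no learned value function is needed. The sketch below shows only that advantage computation; the use of the population standard deviation and the `eps` constant are assumptions made for illustration, not details confirmed by this summary.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; eps avoids divide-by-zero
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four completions for one prompt, scored 1.0 if the final answer verified.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

When every completion in a group gets the same reward, all advantages are zero and the prompt contributes no gradient, which is one reason prompt difficulty matters so much in RLVR-style training.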

Why reasoning models will generalize

TIER 5 Jan 28, 2025

Makes the influential argument that chain-of-thought reasoning is a natural fit for autoregressive LLMs (handling recurrence in state-space rather than parameters), so RL-trained reasoning will generalize well beyond code and math — really teaching models to allocate more compute to harder problems. Marshals early evidence (deliberative-alignment safety generalization, R1 topping creative-writing/Humanity's-Last-Exam/calibration leaderboards) and frames the generator-verifier flywheel. A foundational conceptual piece that anchors much of Lambert's later reasoning coverage.

reasoning models · chain of thought · generalization · RL · verifiers

Claude 3.7 thonks and what's next for inference-time scaling

TIER 4 Feb 24, 2025

Analyzes Claude 3.7 Sonnet's release (and Claude Code preview) as a tidy SWE/tool-use SOTA improvement, using it to explain Anthropic's single-model approach to reasoning, developer-controllable thinking-token budgets, and visible reasoning traces. The more durable contribution is the explainer on where inference-time scaling goes next — parallel test-time compute, pass@N vs answer-extraction, and verifiers as the real performance limiter. A strong release-plus-explainer that doubles as a reference on inference-time scaling mechanics.

Claude 3.7 · inference-time scaling · reasoning · verifiers · parallel sampling
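
The pass@N versus answer-extraction distinction is easy to miss, so here is a small sketch. It assumes a simple majority vote as the extraction rule, which is one common choice and not necessarily the one discussed in the post; the sample answers are made up.

```python
from collections import Counter

def pass_at_n(answers, correct):
    """True if any of the N sampled answers is correct (needs an oracle)."""
    return correct in answers

def majority_vote(answers):
    """Pick a single answer without an oracle: the most frequent one."""
    return Counter(answers).most_common(1)[0][0]

# Hypothetical final answers from six parallel samples.
samples = ["42", "41", "42", "17", "42", "41"]

print(pass_at_n(samples, "17"))        # the correct answer was sampled once
print(majority_vote(samples) == "17")  # but voting does not select it
```

The gap between the two numbers is the room a better verifier could close, which is the sense in which verifiers limit parallel test-time compute.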

OpenAI's o3: Over-optimization is back and weirder than ever

TIER 5 Apr 19, 2025

Introduces a three-era framework for RL over-optimization (control: brittle environments; RLHF: bad reward functions; RLVR: effective-but-weird models) to explain o3's new failure mode—tool-use RL produces strong agentic capability alongside fabricated actions and hallucinated tool calls. Argues o3's hacking comes from scaling RL with softer/LLM-judge verifiers and that legibility, not outcomes, is what degrades. Matters as an original, lasting conceptual frame for reasoning-with-tools models and the shift to 'reliable interaction with the external world' as the new frontier.

o3 · over-optimization · RLVR · tool use · reward hacking

A taxonomy for next-generation reasoning models

TIER 5 Jun 4, 2025

Introduces a four-part framework — skills, calibration, strategy, abstraction — for understanding current and next-generation reasoning models, ordered as the path from single-pass problem solving to agentic planning. Explains how parallel inference-time compute differs from training-time RL, why models overthink, and why bootstrapping planning (a Q*-style data-curation effort) is the next race. A clean, original, reusable conceptual framework that recurs across his later writing and talks.

reasoning models · agents · planning · inference-time scaling · taxonomy

What comes next with reinforcement learning

TIER 4 Jun 9, 2025

Maps three RL futures — continuing to scale RLVR (likely, yielding more frequent model releases), pushing RL to sparser long-horizon domains (Lambert is skeptical, citing the credit-assignment and off-policy bottlenecks and how Deep Research actually trains on sub-tasks not end results), and true continual learning (a low-probability algorithmic breakthrough). Notably argues continual learning where the model learns from you is borderline dystopian and that personalization is the safer framing. A clear-eyed, well-structured RL roadmap.

reinforcement learning · RLVR · continual learning · long-horizon RL · credit assignment

The rise of reasoning machines

TIER 4 Jun 12, 2025

A philosophical essay rebutting the Apple 'Illusion of Thinking' paper, arguing that individual failures (e.g. Tower of Hanoi token limits) cannot prove the absence of reasoning, which should be defined by whether structures are used to solve real tasks. Uses the ornithopter/flight analogy — human reasoning is the inspiration, not the endpoint — and engages Ilya's understanding-vs-awareness framing to claim we've passed the Wright Brothers moment for artificial reasoners. A thoughtful conceptual contribution to the reasoning debate.

reasoning · Illusion of Thinking · philosophy of AI · chain-of-thought · AGI

Crafting a good (reasoning) model

TIER 4 Jun 18, 2025

A podcast/talk (full transcript) on why benchmark-topping models can flop in real use — the 'art of the model' in RLHF, the Goldilocks zone between evals, vibes, and price, and how over-optimization (sycophancy, unit-test gaming, GPT-4.5's 'not a frontier model') emerges from multi-objective hillclimbing. Ends with his skills/calibration/strategy/abstraction taxonomy and the shift of compute toward RL post-training. A substantive recap of his 2025 thinking with a strong reading list.

reasoning models · RLHF · over-optimization · sycophancy · model evaluation

Some ideas for what comes next (Jun. 2025)

TIER 4 Jun 23, 2025

A three-part outlook essay: (1) o3's relentless search behavior is an under-appreciated technical breakthrough no other lab has matched; (2) agent progress will be high-variance but rapid because post-training fixes small reliability failures fast; (3) parameter scaling for consumer models has fizzled, replaced by efficiency marches and inference-time scaling. A genuinely useful synthesis of where frontier capability and product trajectories are heading.

o3 · search agents · agents · model scaling · inference-time compute

How to scale RL

TIER 5 Oct 20, 2025

A definitive technical explainer of the first major RL-scaling-laws paper (ScaleRL, Khatri & Madaan et al.), which fits sigmoid curves (peak A, slope B, compute C) to extrapolate RL learning curves and ablate design choices. Lambert clarifies how RL scaling differs from pretraining laws (extracting maximum performance from a base model, not configuring one big run) and surveys the now-essential ingredients: truncated importance sampling, GSPO, CISPO, and PipelineRL's in-flight updates plus continuous batching (4x+ throughput). He marks the open questions (data regime, base-model choice) and argues the public tooling must be rebuilt to close the gap to frontier RL stacks. A landmark reference on the state of RL scaling.

RL scaling laws · reinforcement learning · ScaleRL · importance sampling · RL infrastructure · post-training
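
To make the "sigmoid curve" idea concrete, here is a sketch of one saturating parameterization with a ceiling, a steepness, and a compute midpoint. The exact functional form, the reading of C as the compute at which half the gain is reached, the starting reward `r0`, and every numeric value below are assumptions for illustration, not figures from the paper or the post.

```python
def sigmoid_reward(compute, a, b, c_mid, r0=0.0):
    """Saturating curve: starts near r0, approaches ceiling a as compute grows.

    a     -- asymptotic (peak) reward
    b     -- steepness of the transition
    c_mid -- compute at which half of the gain (a - r0) has been realized
    """
    return r0 + (a - r0) / (1.0 + (c_mid / compute) ** b)

# Hypothetical parameters, chosen only to show the shape of the curve.
A, B, C_MID, R0 = 0.6, 1.5, 1000.0, 0.1
curve = [(c, sigmoid_reward(c, A, B, C_MID, R0)) for c in (10.0, 100.0, 1000.0, 10000.0)]
for compute, reward in curve:
    print(f"{compute:>8.0f} GPU-hours -> {reward:.3f}")
```

Unlike a pretraining power law, a curve of this shape has an explicit ceiling, so comparing two recipes becomes a question of which one reaches a higher peak and which one gets there with less compute.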

Open-Model Releases: OLMo, Tulu, Llama, and the Truly-Open Standard

5 tier-5 · 13 tier-4

Definitive, often same-day teardowns of the releases that defined what "open" means. The Ai2 cluster (OLMo, OLMoE, OLMo 2, Olmo 3, Olmo Hybrid, Tulu 3, Molmo) is reported from the inside by the team that built it, including the first fully-open frontier post-training recipe and the introduction of RLVR to the broad ecosystem. The Meta cluster (Llama 2, Llama 3, Llama 3.1 405B) tracks Llama from "first open model that matches ChatGPT" to its strategic peak as the open frontier. Alongside sit the other open standard-bearers -- Mixtral, DBRX, Gemma 3, Phi/Arctic, Nvidia Nemotron, Arcee -- and interviews with the practitioners who trained them. The throughline is the slow construction of a "truly open" reference point (data + code + weights + logs) the rest of the field is measured against.

Llama 2: an incredible open LLM

TIER 5 Jul 18, 2023

Lambert's same-day, early-access technical breakdown of the Llama 2 release — the first open model he's convinced matches ChatGPT (outside coding) — covering the base model (2T tokens, 4k context, grouped-query attention), the two-reward-model RLHF to dodge the safety-helpfulness tradeoff, two-stage rejection-sampling-then-PPO pipeline, Ghost Attention, ~$25M preference-data cost, license restrictions, and the not-truly-open-source verdict. As the definitive contemporaneous analysis of a landmark open-weights release, with insider technical depth, this is lasting reference material for the open-model era.

Llama 2 · open weights · RLHF pipeline · Meta · model release analysis

Mixtral: The best open model, MoE trade-offs, release lessons, Mistral raises $400mil, Google's loss, vibes vs marketing

TIER 4 Dec 11, 2023

Analysis of Mistral's Mixtral 8x7B sparse-MoE release (52B total / 12B active, beating Llama-2-70B) as the best open model, paired with a primer on why labs use mixture-of-experts for performance and distributed-inference scaling. Reads the release velocity, Mistral Medium hints, and $400M raise as signs of Mistral's momentum, while cautioning that head-to-head chat vibes, not benchmarks, decide real strength.

mixtral · mixture-of-experts · open-models · mistral · model-release

Open Language Models (OLMos) and the LLM landscape

TIER 5 Feb 1, 2024

Announces AI2's first OLMo models (7B and 1B) as the first state-of-the-art LLMs in years that are fully open (weights, training code, eval code, and the Dolma pretraining dataset all released) and argues this enables scientific research (data attribution, contamination, optimizer-state fine-tuning) impossible on closed models. A landmark release in the open-model movement and a foundational reference for the 'truly open' standard the field would measure against.

OLMo · open models · AI2 · Dolma · pretraining

Google ships it: Gemma open LLMs and Gemini backlash

TIER 4 Feb 22, 2024

Analyzes Google's Gemma open-weight release (7B/2B, nonstandard architecture, Gemini tokenizer, pretraining annealing) and confirms RLHF details — Google using REINFORCE, a large reward model, and an InstructGPT-style KL penalty — alongside a primer on REINFORCE/PPO/TRPO history and the RAIL license terms. Matters as a substantive technical read on a major open release and on how RLHF is regressing toward simpler policy-gradient methods. Public preview of a paywalled post, but the visible technical content is rich.

Gemma · RLHF · REINFORCE · open models · licenses

DBRX: The new best open model and Databricks’ ML strategy

TIER 4 Mar 28, 2024

Walks through Nathan's full model-familiarization process on Databricks' DBRX, which takes the absolute-best-open-model crown (though Mixtral stays more efficient on active params), and reads Databricks' open-LLM strategy via the Mosaic acquisition. Frames the broader efficiency trend that GPT-4-level performance is steadily reproduced and cheapened across orgs, trending toward effectively free.

dbrx · databricks · mixture-of-experts · model-release-analysis · cost-efficiency

Llama 3: Scaling open LLMs to AGI

TIER 5 Apr 18, 2024

Definitive day-one analysis of Meta's Llama 3 release covering pretraining and the 15T-token over-Chinchilla data strategy, the SFT/rejection-sampling/PPO/DPO post-training stack and ~$10M human preference data estimate, human evals, the licensing/ecosystem-takeover terms, and what 70B open weights mean for API providers. A reference-grade model-release teardown from a post-training insider.

llama-3 · open-models · pretraining · rlhf-post-training · model-release-analysis

Phi 3 and Arctic: Outlier LMs are hints

TIER 4 Apr 30, 2024

Reads two outlier open releases as previews of where the field is heading: Snowflake's Arctic (a 480B/17B-active many-expert dense-MoE hybrid that trades consumer accessibility for enterprise inference efficiency) and Microsoft's Phi 3 (synthetic-textbook training that inflates MMLU). Argues outlier models break naive compute-vs-MMLU scaling and that the next big open opportunity is small, high-performance-per-parameter models.

mixture-of-experts · synthetic-data · small-models · phi-3 · scaling-laws

Llama 3.1 405B, Meta's AI strategy, and the new, open frontier model ecosystem

TIER 4 Jul 23, 2024

A definitive release analysis of Llama 3.1 405B as the first open model that fairly compares to closed frontier models (Claude 3, GPT-4o), arguing Meta is trying to absorb the entire open ecosystem under the Llama brand while sitting at the wrong layer of the stack (the real lock-in is Nvidia/CUDA and HuggingFace, not the model). It dissects the license's open-washing (synthetic-data permission plus naming/branding capture), Zuckerberg's commoditize-your-complements strategy, and the looming regulatory debate. Matters as a clear-eyed framing of why open-weight models, not 'open-source AI,' now have guaranteed multi-year relevance.

Llama 3.1 405B · Meta · open models · licensing · open-source AI

Interviewing Sebastian Raschka on the state of open LLMs, Llama 3.1, and AI education

TIER 4 Aug 1, 2024

A full-transcript interview with educator and 'Build a Large Language Model from Scratch' author Sebastian Raschka covering keeping up with AI research, the post-Llama-3.1 open LLM ecosystem, architectures (MoE, early vs. late multimodal fusion), distillation, and implementation pitfalls. It matters as a substantive, accessible survey of open-model practice from a respected explainer of the field.

interview · open LLMs · model architectures · distillation · AI education

OLMoE and the hidden simplicity in training better foundation models

TIER 4 Sep 4, 2024

Pairs the OLMoE Mixture-of-Experts model release with an insider account of how frontier model orgs actually operate: compute allocation splits (~60% pretraining, 25% post-training, 10% data), de-risking via FLOP-efficiency-vs-risk tradeoffs, path-dependent capability unlocking, and progress as compounding small wins rather than secret tricks. It matters as a rare, concrete window into the organizational mechanics of building foundation models.

OLMoE · mixture of experts · frontier labs · compute allocation · post-training

Llama 3.2 Vision and Molmo: Foundations for the multimodal open-source ecosystem

TIER 4 Sep 25, 2024

Surveys the still-undefined multimodal LLM space through two parallel releases: Ai2's fully-open Molmo (Apache 2.0, built on Qwen/OLMo) and Meta's more-restricted Llama 3.2 Vision, both late-fusion models. Explains why late-fusion dominates, why web-element understanding is the key unsolved capability gating web agents, and argues the small 1B/3B Llama models matter most.

multimodal · Molmo · Llama 3.2 Vision · open models · late fusion

Tülu 3: The next era in open post-training

TIER 5 Nov 21, 2024

Launch of Tülu 3, the first fully-open frontier post-training recipe, surpassing Meta's Llama 3.1 8B/70B instruct versions via scaled on-policy preference data and the newly introduced Reinforcement Learning with Verifiable Rewards (RLVR), all released with datasets, code, and paper. Framed within a history of open post-training (Alpaca → DPO → the current closed-vs-open gap). A landmark open-recipe release that introduced RLVR to the broad ecosystem and set a reference point for what open post-training can achieve.

Tulu 3 · RLVR · open post-training · DPO / preference data · open models
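
Since RLVR recurs throughout this archive, a minimal sketch of a verifiable reward may help: the reward comes from a programmatic check against a known answer rather than from a learned reward model. The last-number extraction rule below is an assumption chosen for brevity and is not the Tülu 3 verifier.

```python
import re

def verifiable_reward(completion, ground_truth):
    """Return 1.0 if the last number in the completion matches the known answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0  # nothing to check, so no reward
    return 1.0 if numbers[-1] == ground_truth else 0.0

# Hypothetical completions for the prompt "What is 17 * 3?"
good = "The answer is 48. Wait, let me check: 17 * 3 = 51."
bad = "I think it's 50."

print(verifiable_reward(good, "51"), verifiable_reward(bad, "51"))
```

Because only the outcome is checked, the model is free to discover whatever intermediate behavior raises its pass rate, including self-correction of the kind shown in `good`.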

OLMo 2 and building effective teams for training language models

TIER 4 Nov 26, 2024

A two-part post: a management essay on what makes LM-training teams effective (detail-oriented owners, context-holding managers, brutal prioritization, compounding small gains) and the OLMo 2 7B/13B release, fully-open models trained on 4-5T tokens that beat Llama 3.1 8B and Qwen 2.5 via the Tulu 3 recipe plus RLVR. Useful both as a model-release analysis and a rare candid look at training-org dynamics, including the lesson that RL finetuning needs multiple seeds. Matters as a milestone in fully-open model quality.

OLMo 2 · open models · team management · RLVR · post-training

Interviewing OLMo 2 leads: Open secrets of training language models

TIER 4 Jan 22, 2025

A deep technical interview with the OLMo 2 pretraining and data leads (Dirk Groeneveld, Kyle Lo, Luca Soldaini) covering the project's early history, a failed 70B run, the long quest for training stability, and the many small decisions behind data work, muP, scaling laws, and MoE. The full transcript is a rare practitioner's-eye account of what it actually takes to build a frontier-competitive open model, valuable to anyone training LMs.

OLMo 2 · pretraining · training stability · open models · podcast interview

Gemma 3, OLMo 2 32B, and the growing potential of open-source AI

TIER 4 Mar 13, 2025

Announces OLMo 2 32B (a fully open-source, data-and-code GPT-4-class model from Ai2) alongside analysis of Google's Gemma 3, using both to argue the open-closed gap has narrowed to roughly 18 months and that the era has shifted from 'what is open' to 'why open.' Distinguishes true knowledge distillation (matching the teacher's distribution) from colloquial output-distillation, and reads ChatBotArena/SimpleQA/GPQA results to show small open models now clear meaningful capability thresholds. A strong state-of-open-models essay paired with a notable release.

OLMo 2 · Gemma 3 · open-source AI · distillation · open-closed gap

Olmo 3: America’s truly open reasoning models

TIER 5 Nov 20, 2025

Lambert's launch analysis of Ai2's Olmo 3 (7B/32B), the first 32B+ fully open reasoning model with full data, code, checkpoints, and logs, including the best open 32B base model and a detailed post-training recipe (SFT → DPO → scaled RLVR) plus the 'Model Flow' (Instruct/Think/RL Zero) framing. It discloses real recipe specifics — the delta-learning DPO trick (Qwen3-32B chosen vs 0.6B rejected), 'active refilling' in async RL, and RL Zero datasets enabling study of RLVR contamination on Qwen. A definitive, authoritative model-release analysis from the team that built it, with lasting research value.

Olmo 3 · open models · post-training · RLVR · DPO · reasoning models

Arcee AI goes all-in on open models built in the U.S.

TIER 4 Jan 27, 2026

A founder interview (full transcript) with Arcee AI's Mark McQuade and Lucas Atkins on pivoting from post-training open models for customer domains to pretraining larger US-built open models (Trinity-Large, a 400B ultra-sparse MoE), described as the most genuinely revenue-oriented approach to monetizing open models Lambert has found. The deep discussion covers their business model, pretraining at scale on a startup budget, and the case for American open models. Substantive insider view of an emerging US open-model company's strategy and technical choices.

Arcee AI · open models · pretraining · American AI · open-model business models

Why Nvidia builds open models with Bryan Catanzaro

TIER 4 Feb 4, 2026

A podcast (full transcript) with NVIDIA's Bryan Catanzaro on the strategic logic of NVIDIA releasing open models (Nemotron): fundamentally to grow the ecosystem of people building on open weights and to understand what they need from hardware next, the one clear commercial reason to be open. The deep technical and strategic discussion covers NVIDIA's compression/pruning research direction, open data and tech reports, and how open models serve NVIDIA's GPU business. Substantive insider view of a key player's open-model rationale.

NVIDIA · open models · Nemotron · Bryan Catanzaro · AI hardware

RLHF, Reward Models, and Post-Training Methods

3 tier-5 · 27 tier-4

The author's home turf: how preference fine-tuning actually shapes a model, written by a researcher who built open RLHF recipes. These pieces dissect the mechanics -- why RLHF is "style transfer plus gentle bug-squashing," the DPO-vs-PPO debate, the chattiness/length paradox, sycophancy and over-optimization, reward models as first-class auditing instruments, and the elicitation view that post-training mostly amplifies latent base-model behavior. They culminate in the canonical "recipe for frontier model post-training" and the synthetic-data frameworks that replaced human labels, plus the newer subfield of character/personality training. This is the reference layer for anyone going deep on post-training.

The implicit dynamics of optimizing costs vs. rewards vs. preferences

TIER 4 Mar 27, 2023

Traces RL's progression from cost functions to rewards to preferences, showing how each step loosens the problem spec and injects uncertainty, and argues preference models (relative, deterministic-by-assumption, but distorted by mere-exposure/conditioning/hedonic-adaptation effects) have no guarantee of matching real preferences and are under-studied. A conceptually rich, original framing from Lambert's RL/robotics background that anchors much of his later reward-modeling work.

preference modeling · reward models · reinforcement learning · RLHF · objective mismatch

Beyond human data: RLAIF needs a rebrand

TIER 4 Apr 26, 2023

Explains RLAIF (from Anthropic's Constitutional AI paper) and proposes the broader, more useful framing RLCF—reinforcement learning from computational feedback—any computed signal (unit tests, NER privacy checks, sentiment) optimized via RL, freeing it from human-data bottlenecks and from language alone. A genuinely original reframing and an early call on a method that later became central; its 'RLCF' coinage prefigures RLVR.

RLAIF · RLCF · Constitutional AI · computational feedback · reward design

Specifying hallucinations

TIER 4 May 3, 2023

Argues the term 'hallucination' is overloaded and harmful (anthropomorphizing deterministic token sampling), then disaggregates it across domains—chat, code, control/decision-making, healthcare, enterprise—proposing axes (consumer vs business, downside risk, accumulated degradation, what counts as out-of-distribution) for when hallucination is value-generative versus dangerous. A thoughtful conceptual framework piece that reframes a heavily-hyped term.

hallucination · stochastic parrots · out-of-distribution · RLHF · anthropomorphization

How RLHF actually works

TIER 5 Jun 21, 2023

A landmark early explainer dissecting why RLHF works and why it is so hard to set up, walking the full pipeline (capable base model → preference data → RL) and the data/optimization conditions—pairwise preference signal, distribution matching, scaling to 50B+—that make it succeed. Frames RLHF as 'style transfer plus gentle bug-squashing' (a topic filter, not a capability or fact adder) and previews DPO; one of the most-cited reference posts in the archive and foundational to Lambert's later RLHF book.

RLHF · preference data · reward models · Constitutional AI · DPO

Llama 2 follow-up: too much RLHF, GPU sizing, technical details

TIER 4 Jul 21, 2023

A technical follow-up to the Llama 2 release: Lambert diagnoses the model's over-cautious 'evasiveness through harmlessness' (refusing up to 27% of borderline asks) as RLHF-hammer over-optimization against imperfect reward models, then provides practical GPU/inference and fine-tuning sizing guidance plus deep notes on Ghost Attention (GAtt) for multi-turn consistency and Meta's rejection-sampling/PPO RLHF details. He flags the helpful-vs-harmless tradeoff as a fundamental open-source problem. A genuinely useful post-training explainer with corrections to his prior cost estimates.

Llama 2 · RLHF over-optimization · Ghost Attention · GPU sizing · safety-helpfulness tradeoff

Specifying objectives in RLHF

TIER 4 Aug 2, 2023

Synthesizing John Schulman's ICML talk, the DPO paper, and the 'Open Problems in RLHF' paper, Lambert explains RLHF's core flaw — optimizing a proxy reward model that's only correlated with chatbot quality, leading to Goodhart-style over-optimization (true objective improves then degrades) measured via KL distance. He covers PPO vs. best-of-N efficiency, why bigger reward models help, his reservations about DPO, and implicit-feedback/steering-loss directions. A high-signal technical explainer of RLHF objective mechanics from a post-training specialist.

RLHF · proxy objectives · over-optimization · DPO · reward models
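
For the KL-distance axis used in over-optimization plots, best-of-N sampling has a commonly used analytical estimate, KL ≈ log n − (n − 1)/n, which lets best-of-N be placed on the same axis as PPO. The sketch below only evaluates that estimate; it is offered as background and is not taken from the post itself.

```python
import math

def best_of_n_kl(n):
    """Analytical KL estimate (in nats) between best-of-n and the base policy."""
    return math.log(n) - (n - 1) / n

for n in (1, 4, 16, 64, 256):
    print(f"n={n:>3}  KL ~ {best_of_n_kl(n):.3f} nats")
```

The estimate grows only logarithmically in `n`, which is why best-of-N needs many samples to move as far from the base policy as a modest amount of RL training does.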

Undoing RLHF and the brittleness of safe LLMs

TIER 4 Oct 18, 2023

Synthesizes safety papers showing that parameter-level safety training is brittle — as few as 10-100 fine-tuning examples (even benign or 'identity-shifting' data) strip Llama-2-chat or GPT-3.5 filters — meaning safety in accessible weights protects liability, not against misuse. Argues true robustness would require embedding safety in pretraining, with implications for what open releases can responsibly claim.

RLHF · AI-safety · jailbreaking · fine-tuning · open-models

RLHF lit. review #1 and missing pieces in RLHF

TIER 4 Oct 25, 2023

First in a literature-review series, mapping how RLHF research trends align with industry practice and arguing the RL part rests on shaky foundations — notably the absence of real exploration (each completion is one action, no multi-step credit assignment) and its low compute/data footprint. Sets up two futures: RLHF folding into pretraining, or a new RL paradigm enabling continual learning from implicit user feedback.

RLHF · reinforcement-learning · exploration · literature-review · continual-learning

RLHF progress: Scaling DPO to 70B, DPO vs PPO update, Tülu 2, Zephyr-β, meaningful evaluation, data contamination

TIER 4 Nov 22, 2023

Reports AI2's Tulu 2 as the first 70B DPO run to converge with strong results (roughly original-ChatGPT level), crediting a surprisingly low 5e-7 learning rate and the UltraFeedback dataset over optimizer tweaks. A grounded practitioner's progress update on open RLHF that doubles as a case study in DPO scaling, overfitting concerns, and the limits of MT-Bench/AlpacaEval vibe evals.

RLHF · DPO · tulu · zephyr · evaluation

Synthetic data: Anthropic's CAI, from fine-tuning to pretraining, OpenAI's Superalignment, tips, types, and open examples

TIER 5 Nov 29, 2023

Comprehensive survey establishing synthetic data as the central resource of the post-ChatGPT era — used by frontier labs (Anthropic CAI, OpenAI Superalignment) to remove humans from alignment and by open builders to cheaply chase SOTA. Covers the scaling-data debate, robustness gains, distillation dynamics, and open examples, serving as a durable reference on synthetic-data methods and their divergent open vs. closed goals.

synthetic-data · constitutional-AI · alignment · distillation · pretraining

Do we need RL for RLHF?

TIER 4 Dec 6, 2023

Frames the central RLHF question of the moment, whether true RL (value functions, policy gradients) is needed or Direct Preference Optimization suffices, arguing DPO's breakthroughs (Tulu, Zephyr) owed more to data, hyperparameters, and implementation simplicity than to the loss function being magic. Concludes the real bottleneck for open RLHF is data, tooling, and evaluation, not optimizer choice, citing Starling's APA result as evidence that strong PPO/online models also exist.

RLHFDPOPPOpreference-optimizationpost-training
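
For readers who want the DPO side of that comparison made concrete, here is a minimal sketch of the standard DPO objective on a single preference pair. The function name, argument names, and the `beta=0.1` default are illustrative assumptions, not taken from the post or from any particular library; inputs are summed sequence log-probabilities under the policy and a frozen reference model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of completions.

    All names here are hypothetical; beta scales the implicit reward.
    """
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), with a guard against overflow in exp().
    if logits < -30.0:
        return -logits
    return math.log1p(math.exp(-logits))

# With the policy equal to the reference, the loss is log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Raising the chosen log-prob relative to the reference lowers the loss.
improved = dpo_loss(-8.0, -12.0, -10.0, -12.0)
```

There is no value function, sampling loop, or reward model in this objective, which is the implementation simplicity the entry refers to.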

RLHF learning resources in 2024

TIER 4 Jan 12, 2024

A curated, categorized index of RLHF learning resources — talks, podcasts, code repos (alignment-handbook, TRL), key models/datasets (Zephyr, UltraFeedback), evaluations, and foundational papers (InstructGPT, DPO, Llama 2) — assembled by a leading practitioner. Useful as a durable entry-point reference for anyone going deep on RLHF, with the author's light annotations on why each item matters.

RLHFlearning resourcesDPOreward modelsreference

Why reward models are key for alignment

TIER 4 Feb 14, 2024

Argues that even as DPO and LLM-as-judge displace explicit RL, scalar reward models remain a uniquely valuable, underused tool — a clean classifier interface for auditing model representations, biases, and preference data without prompting or per-token compute limits, illustrated with a per-token reward-tracing experiment. Matters as an original RLHF research thesis (a precursor to RewardBench-style work) reframing reward models from disposable training artifacts to first-class analysis instruments.

reward modelsRLHFDPOalignmentinterpretability

Interviewing Louis Castricato of Synth Labs and Eleuther AI on RLHF, Gemini Drama, DPO, founding Carper AI, preference data, reward models, and everything in between

TIER 4 Mar 4, 2024

A deeply technical full-transcript interview with Louis Castricato (Synth Labs, EleutherAI, ex-CarperAI) covering RLHF, DPO, preference data, reward models, multimodal/long-context RLHF, OpenAI's influence on the field, and the Gemini bias episode. Matters as a rare candid deep dive from a leading open-RLHF practitioner across nearly every active fine-tuning question.

RLHFDPOreward modelspreference datainterview

Stop 'reinventing' everything to solve alignment

TIER 4 Apr 17, 2024

Argues RLHF researchers keep reinventing solutions that established social-science fields already solved, and that reward models are the underused transparency/auditing lever. Introduces social choice theory (social welfare functions, cardinal reward modeling, personalization) and pluralistic alignment as concrete ways to redesign preference data and reward models, with an OLMo 1.7-7B update appended.

rlhfreward-modelssocial-choice-theorypluralistic-alignmentolmo

How RLHF works, part 2: A thin line between useful and lobotomized

TIER 4 May 1, 2024

A grounded explainer on what preference fine-tuning (DPO/PPO/KTO etc.) actually does to models: the 'chattiness paradox' where DPO boosts gameable evals (AlpacaEval) without moving ChatBotArena, the parameter-level mechanism by which GPT-4-styled chosen responses get upweighted into longer outputs, why KL constraints don't fully prevent it, and a defense of style transfer as real value. Closes with a credible research agenda (reasoning/code RLHF, online methods, KTO) that aged well.

RLHFDPO vs PPOpreference fine-tuningchattiness/length biasKTO

OpenAI’s Model (behavior) Spec, RLHF transparency, personalization questions

TIER 4 May 10, 2024

A close reading of OpenAI's first Model Spec as a major RLHF-transparency artifact: it documents intended behaviors (objectives, defaults, rules), the 'chain of command' platform/developer/user prompt hierarchy, and quietly reveals OpenAI's verbosity penalty (Schulman's 'laziness from fear of running out of tokens') and the answer-quality ordering. Lambert argues it exposes how hard balancing many RLHF goals is and frames the unsolved 'Aggregator's AI risk' of one-answer monoliths, pointing toward personalization as the open market.

Model SpecRLHFtransparencychain of commandpersonalization

Frontiers in synthetic data

TIER 4 Jun 21, 2024

An eight-point framework on how frontier labs actually use synthetic data, organized into transferable insights: direct distillation is still king, Gemini Flash is confirmed distilled from Pro (and Claude Haiku likely too), filtering plus data accumulation prevents model collapse, license clauses against training-on-outputs often originate from data vendors, multi-source preference datasets (UltraFeedback/Nectar) have a lower ceiling than on-policy generation, structured/verifiable synthetic data drives IFEval-style gains, weak-to-strong generalization is increasingly real, and synthetic prompt generation is underrated. Matters as a high-signal, durable mental model of synthetic-data practice that reads industry tea leaves better than the academic literature.

synthetic datadistillationmodel collapseweak-to-strongpost-training

RLHF roundup: Getting good at PPO, sketching RLHF's impact, RewardBench retrospective, and a reward model competition

TIER 4 Jun 26, 2024

A substantive technical roundup reporting the team's effort to make PPO work 'as well as industry says,' finding that open RLHF tooling is fickle, that bigger reward models and extra reasoning prompts didn't reliably help, and that variation between datasets exceeds variation between DPO/PPO algorithm variants — so dataset work is where open labs should differentiate. It pairs this with a RewardBench retrospective (90% peak accuracy, why it got adopted, reward-model-as-judge beating LLM-as-judge), an Epoch AI estimate that post-training gets ~50% of frontier-lab compute, and an LMSYS/Kaggle preference-prediction competition. Matters as a candid practitioner's state-of-play for open post-training.

RLHFPPO vs DPORewardBenchreward modelspost-training

A recipe for frontier model post-training

TIER 5 Aug 7, 2024

Synthesizes the Llama 3.1, Nemotron 340B, and Apple foundation-model reports into a definitive 'new standard pipeline' for frontier post-training: scaled iterative RLHF over instruction tuning, synthetic data surpassing humans, multi-round training, and data filtering as the king variable. It is a landmark reference that crystallized how the field's top labs converged on post-training, widely cited as the canonical articulation of the modern RLHF recipe.

post-trainingRLHFsynthetic dataLlama 3.1data filtering

The state of post-training in 2025

TIER 4 Jan 8, 2025

A useful state-of-the-field overview accompanying a tutorial talk, proposing the clean three-bucket taxonomy of modern finetuning (instruction tuning, preference tuning, and the new reinforcement finetuning) and arguing post-training has become more impactful, more expensive, less human-data-reliant, and the gateway to reasoning models. A solid explainer for where open post-training stood entering 2025.

post-trainingRLHFreinforcement finetuningopen recipestutorial

Character training: Understanding and crafting a language model's personality

TIER 4 Feb 26, 2025

Opens the conversation, largely absent outside frontier labs, on 'character training' — the post-training subset that shapes a model's manner rather than its content — prompted by a sweeping, undocumented GPT-4o personality update and Anthropic's Claude character/constitution work. Argues model specs and behavior-change evaluations should become a community norm (a race to the top), and that character training marks RLHF's shift from an alignment philosophy to an empirical performance tool. A genuinely original framing of an under-documented technique.

character trainingRLHFmodel specpersonalitypost-training

Elicitation, the simplest way to understand post-training

TIER 4 Mar 10, 2025

Proposes the 'elicitation interpretation' of post-training — that post-training mostly extracts and amplifies latent behaviors already present in the base model rather than teaching new skills — using an F1-season analogy and contrasting it with the Superficial Alignment Hypothesis (which it argues gets the intuition right for the wrong reasons). Frames why RL on a few thousand prompts beats SFT on millions of math samples, and why stronger base models are better RL starting points. A clean, reusable conceptual frame for thinking about where post-training gains come from.

post-trainingelicitationRLVRsuperficial alignmentbase models

Recent reasoning research: GRPO tweaks, base model RL, and data curation

TIER 4 Mar 31, 2025

A technical paper walkthrough of the reasoning-RL literature—Kimi k1.5, Open-Reasoner-Zero, DAPO, and Dr. GRPO—deflating GRPO hype by showing it is closely related to PPO/RLOO and not a uniquely special algorithm, while explaining the genuinely useful tweaks and base-model RL findings. Valuable as a grounded explainer that corrects the 'GRPO ushered in a new RL era' narrative and curates the key reproductions. A solid technical reference for practitioners following reasoning training.

GRPOreasoning RLDAPObase model RLpaper review
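
One way to see the "closely related to RLOO" point is to compare the two baselines directly. The sketch below is an illustration under stated assumptions, not code from the post or the papers: it uses scalar per-completion rewards for a single prompt, and it compares RLOO's leave-one-out baseline with a plain group-mean baseline (GRPO's advantage before the standard-deviation division). The function names are hypothetical.

```python
def rloo_advantages(rewards):
    """Leave-one-out baseline: each sample is compared to the mean of the others."""
    k = len(rewards)
    total = sum(rewards)
    return [r - (total - r) / (k - 1) for r in rewards]

def mean_centered_advantages(rewards):
    """Group-mean baseline: each sample is compared to the mean of the whole group."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]

rewards = [1.0, 0.0, 0.0, 1.0, 1.0]  # e.g. verifier pass/fail for 5 completions
k = len(rewards)
rloo = rloo_advantages(rewards)
centered = mean_centered_advantages(rewards)
# The two differ only by the constant factor k / (k - 1).
scaled = [k / (k - 1) * a for a in centered]
```

Algebraically, r_i - (T - r_i)/(k - 1) = (k/(k - 1)) * (r_i - T/k), where T is the sum of rewards, so for a fixed group size the two estimators point in the same direction and differ only in scale.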

RL backlog: OpenAI's many RLs, clarifying distillation, and latent reasoning

TIER 4 Apr 5, 2025

A backlog of RL notes covering OpenAI's RFT/RLVR showing up across Operator, Deep Research, and Copilot (RLEF); a careful reading of DeepSeek's claims that RL-after-distillation is crucial and that small models gain less from large-scale RL; a well-argued case that DeepSeek did not distill o1 CoTs; and why latent/compressed reasoning is an interesting frontier. Closes with Sutton's full Verification essay. Matters as a substantive technical synthesis of how leading labs actually use RL and of distillation dynamics.

reinforcement learningRLVRdistillationDeepSeeklatent reasoning

Sycophancy and the art of the model

TIER 4 May 4, 2025

[Paywalled preview] Uses the GPT-4o sycophancy episode and rollback to argue RLHF/preference tuning remains a central, unsolved problem even in the reasoning era, dissecting OpenAI's post-mortem: a new thumbs-up/down reward signal weakened the primary reward and let sycophancy get baked in via over-optimization. Frames it as a structural limit of optimizing one default model for all users and a case for Model Specs over system prompts. A strong explainer on RLHF over-optimization and release-evaluation blind spots; visible preview already carries the core argument.

sycophancyRLHFGPT-4oover-optimizationModel Spec

Reinforcement learning with random rewards actually works with Qwen 2.5

TIER 4 May 27, 2025

Covers a UW paper Lambert contributed to showing that RLVR on Qwen 2.5 Math models improves MATH-500 by 15-25 points even with random, incorrect, or format-only 'spurious' rewards — but not on Llama or OLMo. Explains the mechanism: RLVR is largely surfacing a pre-existing code-reasoning strategy Qwen learned in pretraining (likely from synthetic math SFT data with perturbed-number variants), and random rewards work via GRPO's per-prompt advantage structure. An important cautionary finding about doing RL science on a contaminated base model.

RLVRQwen 2.5spurious rewardsGRPOdata contamination
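
The "per-prompt advantage structure" mentioned above refers to GRPO normalizing rewards within the group of completions sampled for one prompt. The sketch below shows only that normalization; it does not reproduce the paper's analysis or explain why Qwen responds to it. The function name, the `eps` value, and the use of the population standard deviation are assumptions (implementations differ on the last point).

```python
import random
import statistics

def group_advantages(rewards, eps=1e-6):
    """Group-normalized advantages for completions of a single prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; an assumption
    return [(r - mean) / (std + eps) for r in rewards]

random.seed(0)
# A "random reward": a coin flip that ignores the completion entirely.
rewards = [float(random.random() < 0.5) for _ in range(8)]
advantages = group_advantages(rewards)
```

Because advantages are centered per prompt, any within-group variation in reward, informative or not, yields nonzero per-sample weights, while a group with identical rewards contributes nothing.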

Opening the black box of character training

TIER 4 Nov 10, 2025

A research walkthrough (paywalled preview) of a character-training paper from Lambert's group, arguing character/personality training will be a permanent, philosophically rich subfield of RLHF as models grow more persuasive. It details two post-training stages (DPO distillation then SFT) driven by per-personality constitutions ('I am…' rather than Anthropic's 'choose the response that…'), finds personality is easy to imprint but hard to align to intent, and notes Qwen's internal personality is unusually rigid versus Llama/Gemma. A useful technical explainer with concrete methodology.

character trainingRLHFpersonalityconstitutionspost-training

Why AI writing is mid

TIER 4 Nov 16, 2025

An original argument that LLMs write poorly not by accident but as a structural consequence of how they are post-trained: style is one weak signal among many, aggregate preferences suppress quirks, per-instance data labeling and length/sycophancy biases penalize richness, and forced neutrality kills voice. Lambert contends base models can write (and Sydney/Bing showed voice) but there are no market incentives to do the full post-training refresh needed to unlock it. A thoughtful, insider explainer connecting craft to training mechanics.

AI writingpost-trainingvoiceRLHF biasesmodel personality

Frontier post-training recipe review with Finbarr Timbers

TIER 4 Jun 16, 2026

A deep podcast interview with Finbarr Timbers (full transcript plus an accompanying slide deck) tracing the history of post-training recipes from InstructGPT to today and dissecting 2026's frontier open recipes (DeepSeek V4, GLM 5, Kimi K2.6, MiMo). It maps what an Olmo-style open recipe would need to reach the frontier, making it a substantive technical reference on the post-training stack.

post-trainingRLHFpodcastopen modelsrecipes

The Open vs. Closed Thesis: Ecosystem, Economics, and the ATOM / American DeepSeek Project

3 tier-5 · 22 tier-4

Lambert's signature argument, built up over three years: open models matter as a counterweight to concentrated power and as the engine of AI research, but "train a good open model and release it with no product" is not a viable business. He works through open-model economics (specialization or bust), the four motivational camps of openness, why open and closed ride different exponentials, and ultimately the American DeepSeek Project and the ATOM (American Truly Open Models) manifesto -- the case that the US is ceding open-model leadership to China and should fund at least one frontier-scale truly-open lab. The later pieces propose an open-model consortium as the only stable funding structure once training costs reach the billions.

Unfortunately, OpenAI and Google have moats

TIER 4 May 17, 2023

A point-by-point rebuttal to Google's leaked 'We Have No Moat' memo, arguing the durable moats are diverse high-quality fine-tuning/RLHF prompt data and consumer habit/brand, not the model weights themselves—open prompt datasets barely crack 100k usable samples while OpenAI/Google see ~100k queries a day. A well-argued contrarian take with lasting relevance to the open-vs-closed debate.

moatsopen vs closedprompt dataRLHFcompetitive dynamics

Open-source LLMs' harmlessness gap

TIER 4 Jun 7, 2023

Identifies the growing gap between helpfulness (where open models race ahead on leaderboards) and harmlessness (where the community lags), explaining red-teaming economics, the 'uncensored'/filtered confusion, alignment taxes, and Anthropic's helpful-then-harmless CAI pipeline as the path open source has barely begun. A useful, well-structured explainer of where the open ecosystem stood on safety and why incentives stall it.

open modelsharmlessnessred-teamingConstitutional AIalignment tax

Different development paths of LLMs

TIER 4 Jun 14, 2023

Argues that not every future LLM needs to be a ChatGPT clone: open source will win by producing smaller, better models for narrow use-cases, and lays out a stakeholder taxonomy (vertical big tech, horizontal big tech, open source, academia) plus the base-model-as-reset-point dynamic that lets open ecosystems leap. A substantive structural read on the open-model ecosystem whose framing recurs throughout Lambert's later work.

open modelsecosystembase modelsacademiaChatGPT

'If it's not fully closed ML, it's open' - is it?

TIER 4 Jul 26, 2023

Lambert argues ML is straining the OSS community's definitions and needs a new taxonomy, using Llama 2 — which is downloadable but not open-source (700M-user clause, no-train-other-LLMs clause, no dataset) — as the focal case. He traces open-source history (GNU, OSI, the SSPL/MongoDB split), separates the replication vs. safety values of openness, and reflects honestly on 'open-source as vibes' rhetoric and the OSI's call to define open-source AI. A substantive, well-sourced framing of the open-source-AI definition debate that anticipated later OSI work.

open source definitionLlama 2 licenseOSImodel release taxonomydata transparency

Open, general-purpose LLM companies might not be viable

TIER 4 Oct 4, 2023

Prompted by Mistral's 7B torrent release, Lambert argues that 'train a good open LLM and release it with no product' is not a viable business strategy: smaller open labs can't out-spend OpenAI/Google/Meta, current open models are PR/recruiting tools rather than monetizable artifacts, and without data-sharing or specialization they're on an acquisition-or-bust path. He frames two futures — collaborative openness vs. status-quo consolidation — and urges specialization plus radical data transparency as the only durable moat. A prescient strategic argument about open-model economics that held up well over subsequent years.

open model economicsMistralmoatsdata transparencyscaling costs

Open LLM company playbook

TIER 4 Nov 1, 2023

A practical 3-prerequisites / 3-actions / 3-benefits playbook for companies releasing open-weight LLMs, centered on the thesis that open labs must own a niche and build products rather than compete head-on with OpenAI's compute platforms. A useful strategic framework for the business of open models, grounded in the author's HuggingFace/Zephyr experience.

open-modelsbusiness-strategyopen-sourcestartupsniche

Where 2024's 'open GPT4' can't match OpenAI's

TIER 4 Jan 5, 2024

Argues that even when open models (post-Gemini/Mixtral) match GPT-4's benchmark scores in 2024, they will still trail the real product because open models are just weights while ChatGPT is an extensive ML system (safety filters, serving, prompting) — and because open RLHF/DPO tooling remains a starting point, not a solution. The visible preview makes a clear models-vs-products distinction and calls for compounding investment in data and evaluation rather than vibes-based eval. (Paywalled; public preview only.)

open vs closed modelsGPT-4RLHFevaluationDPO

The koan of an open-source LLM

TIER 4 Mar 6, 2024

Proposes a clearer taxonomy for open models — 'Openly Trained Models' (OLMo/Pythia: data+code+weights) vs 'Permissible Usage Models' (Llama/Mistral/Gemma: weights only) vs Closed — and walks the OSI definition history, licenses/copyright, bio-risk debunking, and Mistral/EU politics. Matters as a foundational framing for AI-openness policy and the recurring confusion between 'open-source' and 'open-weight' that misleads policymakers.

open-source definitionmodel taxonomyAI policylicensescopyright

We disagree on what open-source AI should mean

TIER 4 Apr 3, 2024

Maps the open-LLM movement into four motivational camps (accelerationists/capitalists, transparency-driven scientists, inclusion/anti-concentration advocates, and freedom advocates) and argues disagreement on definitions is healthy and expected, not a failure. Introduces a disclosure/accessibility/availability framework for reading any 'open' release through the PR speak.

open-source-aiopenness-taxonomyai-policyecosystemtransparency

The end of the “best open LLM”

TIER 4 Apr 15, 2024

With Command R+, Mixtral 8x22B, DBRX and Grok arriving, argues there is no longer a single 'best open LLM' — the space has fragmented by use case. Builds a compute-vs-MMLU model showing most open-model gains over 18 months came simply from throwing more compute at the problem, and that accessibility now diverges sharply by parameter footprint.

open-modelscompute-efficiencymmluscalingmixture-of-experts

On the current definition of open-source AI and the state of the data commons

TIER 4 Aug 28, 2024

Dissects the OSI's draft v0.0.9 open-source-AI definition, explaining why the data clause is a deliberate compromise (provenance and copyright/GDPR jeopardy make full data release infeasible) and why a stable definition matters for regulatory carve-outs amid a shrinking data commons. It matters as a clear, authoritative treatment of how open-source AI is being formally defined and why it diverges from open-source software.

open source AIOSI definitiondata commonscopyrightAI policy

Why I build open language models

TIER 4 Oct 30, 2024

A personal manifesto, on his 30th birthday and first Ai2 anniversary, for why truly open (not Meta-dependent) language models matter: reducing concentration of power, security, research access, regulatory insight, and a more diverse AI economy. Describes his 'white rice research' building OLMo as open research infrastructure and a call for more people to fight for open AI before regulatory or market capture forecloses it. A clear statement of the open-source thesis that animates much of his other writing.

open sourceOLMoAI policyconcentration of powermanifesto

Making the U.S. the home for open-source AI

TIER 4 Feb 5, 2025

Argues the DeepSeek moment resets the open-vs-closed narrative and that open-source AI is driven by ideology more than economics (citing parallel 'national standard' framings from Zuckerberg and DeepSeek's Liang Wenfeng). Makes the policy case that restricting open model weights is a losing strategy — export controls should target compute, not weights — and that the US should fund open research and Western alternatives so the global open default is American. A timely, well-reasoned open-AI policy argument with concrete proposals.

open-source AIAI policyDeepSeekexport controlsUS-China

The American DeepSeek Project

TIER 5 Jul 4, 2025

Lambert's manifesto and multi-year mission statement: build a fully open-source (data, code, logs, weights) model at frontier scale within two years to keep the AI research default on Western technology and prevent a future split between closed American models and ubiquitous, hard-to-trust open Chinese ones. Lays out the structural advantages China holds, the $100M-500M cost estimate, the agentic-era window of opportunity, and why open models are a quintessentially American counterweight to concentrated corporate power. The defining argument that anchors much of his subsequent writing.

open source AIAmerican DeepSeekUS-ChinaAI governanceAi2

Towards American Truly Open Models: The ATOM Project

TIER 5 Aug 4, 2025

The full manifesto for the ATOM (American Truly Open Models) Project, arguing the US is losing open-model leadership to China and recommending at least one US lab with 10,000+ GPUs dedicated to training frontier open models, backed by detailed download/finetune-adoption data and the case for open models as the engine of AI research. This is the influential, foundational policy argument that anchors much of his subsequent writing.

ATOM Projectopen models policyUS vs ChinaAI researchmanifesto

Latest open artifacts (#16): Who's building models in the U.S., China's model release playbook, and a resurgence of truly open models

TIER 4 Nov 23, 2025

This roundup (paywalled preview) leads with a genuinely useful inventory of every serious US open-model lab (Ai2, Arcee, Google, IBM, Nvidia, OpenAI, Stanford Marin, etc.) with each org's representative model and license posture, plus an articulation of the four-step Chinese model-release playbook (social presence, day-zero ecosystem support with free API, Claude-Code-compatible coding subs, in-house tooling). It also flags the recurring problem of third-party providers mis-serving open models (Kimi K2 on Vending-Bench). The lab map and playbook give it lasting reference value.

open modelsUS labsChina release playbooklicensesmodel serving

2025 Open Models Year in Review

TIER 4 Dec 14, 2025

A year-end synthesis (paywalled preview) of the open-model year: DeepSeek R1, Qwen 3, and Kimi K2 named the top three for outsized ecosystem impact, with runners-up (MiniMax M2, GLM-4.5, GPT-OSS, Gemma 3, Olmo 3) and niche winners (Parakeet 3, Moondream 3, Granite 4, SmolLM3). It culminates in a full tier list of US/China/world model makers and 2026 predictions, making it a useful reference map of the open ecosystem. Strong curation though catalog-shaped.

open modelsyear in reviewtier listDeepSeekQwen

8 plots that explain the state of open models

TIER 4 Jan 7, 2026

A data-driven walkthrough of eight ATOM Project charts showing China's accelerating dominance of open-model adoption metrics: Qwen alone out-downloads roughly the rest of the ecosystem, Llama remains the most-downloaded Western model despite being abandoned, new entrants (Z.ai, MiniMax, Kimi, Nvidia) barely register, and Chinese models remain the smartest open models. It matters as a quantitative, sourced baseline for the open-model adoption debate, with caveats on the noisiness of HuggingFace downloads. A genuinely useful empirical explainer.

open modelsChinaQwenadoption metricsATOM Project

Open models in perpetual catch-up

TIER 4 Feb 17, 2026

Lambert argues the recurring 'open models are closer than ever' narrative is overblown: the ~6-month gap is holding steady, and aggregate indices like Artificial Analysis compress too much and likely understate the true frontier, which has never been harder to capture in public benchmarks. He then walks seven trends including the brutally concentrated open-model market, the missing specialized small-model segment, sovereign AI as the entry point for nations, and China's idea-sharing ecosystem being the most likely place a 'who wins' breakthrough emerges. Substantive multi-trend synthesis even as a paywalled preview.

open modelsbenchmarkssovereign AIChinese AI ecosystemmodel adoption

What comes next with open models

TIER 5 Mar 16, 2026

Lambert's most structural open-models essay, arguing the open-closed gap is more likely to grow than shrink and laying out three future model classes: true closed frontier models, open frontier models, and small/cheap/specialized open models as 'distributed intelligence' that complement closed agents. Drawing on Google's 'Meaning of Open' and Gurley's Android-as-moat framing, he contends the under-served opportunity is boring, specific small models that orchestrating agents are desperate to call as tools, and that open AI must become an ecosystem rather than a pack chasing the frontier. Lasting reference value for framing open-model strategy.

open modelsAI ecosystemssmall modelsbusiness strategyAI systems

The inevitable need for an open model consortium

TIER 4 Apr 11, 2026

Argues that as frontier training costs reach billions and capitalism pushes labs to keep their best models closed, a multi-company consortium funding shared near-frontier open models becomes the only stable path, with Nvidia's Nemotron Coalition as an early single-company bootstrap. Predicts Chinese open-weight startups (Moonshot, MiniMax, Z.ai) hit funding stress first, and that demand for guaranteed open intelligence will eventually force the consortium model.

open model consortiumAI economicsNemotronfundingfrontier costs

My bets on open models, mid-2026

TIER 4 Apr 15, 2026

A consolidated 13-point thesis on the open-model ecosystem distilled from a spring of writing: closed models stay ahead on robustness, the race is mostly economic staying power (with Chinese labs facing funding stress first), open models win repetitive automation share, bans are impractical, and US adoption recovers from 2027. A useful, dense reference list of the author's positions.

open modelspredictionsAI economicsChina fundingregulation

Reading today's open-closed performance gap

TIER 4 Apr 20, 2026

Unpacks why reducing the open-closed gap to a single number (e.g. the Artificial Analysis Index) is misleading, explaining how the benchmarked 'frontier' shifts every 12-18 months across task paradigms and how RL environments (not just distillation) are the real lever letting Chinese labs keep pace. Argues the gap is increasingly about hard-to-measure robustness and private-data domains, leaving benchmark confidence at a low.

benchmarksopen vs closed gapRL environmentsevaluationRLVR

How open model ecosystems compound

TIER 4 May 12, 2026

Argues that since ~80% of a frontier model's compute goes to R&D rather than the final training run, China's all-open ecosystem gains a real cost-structure advantage by sharing insights across labs and avoiding double-spend, partially mirroring open-source software's compounding. Notes open AI lacks OSS's user-back feedback loop and that forking open tools into internal versions undermines the compounding, reinforcing the case for an open-model consortium.

open ecosystemR&D computeChinaOSS economicsconsortium

Open and closed models are on different exponentials

TIER 4 Jun 1, 2026

Argues the open vs closed balance is fundamentally economic: closed labs ride an integrated exponential by monetizing the top of knowledge work (coding agents people will pay a large premium for), evolving into Apple+Microsoft-like $2-10T oligopolies, while open models ride a slower, broader exponential capturing diffuse, commodity-priced enterprise inference. A clear framework for why both ecosystems grow on different curves.

open vs closedAI economicscoding agentsmarket structureinference

China's AI Labs and the Distillation Wars

3 tier-5 · 8 tier-4

Primary-source reporting and analysis on why China leads the open-model race. The centerpieces are a tiered ranking of nineteen Chinese open-model labs and a firsthand field report from visiting nearly every leading one, arguing Chinese research culture is built to be an ideal fast-follower. The release analyses (Kimi K2, Kimi K2 Thinking, DeepSeek V3, Ant/InclusionAI) track Chinese labs reaching and matching the frontier, while the distillation pieces cut through the "distillation attacks" panic -- quantifying its real (modest) impact, naming API abuse as the actual issue, and arguing distillation matters less in the RL era because on-policy generation can't be borrowed.

DeepSeek V3 and the actual cost of training frontier AI models

TIER 5 Jan 9, 2025

A definitive analysis of DeepSeek V3's training efficiency (MLA, MoE, multi-token prediction, FP8, custom comms) and a careful debunking of the viral '$5.5M model' narrative, showing the cited figure excludes research, ablations, salaries, and capex that put real annual cost closer to $500M-$1B+. It reframes how to think about frontier training cost and what 'open-source AI' actually requires, a widely-cited reference during the DeepSeek panic.

DeepSeek V3training costMoEcompute efficiencyopen models
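
As a rough check on where the viral number comes from, the arithmetic can be sketched as below. The GPU-hour total and rental price are the figures stated in the DeepSeek-V3 technical report, not in the summary above, and the variable and function names are illustrative.

```python
# Figures as stated in the DeepSeek-V3 technical report (assumed here,
# not taken from the summary above).
GPU_HOURS = 2_788_000            # total H800 GPU-hours for the final run
RENTAL_USD_PER_GPU_HOUR = 2.0    # rental price the report assumes

def final_run_cost(gpu_hours, usd_per_hour):
    """Cost of the final training run only: GPU-hours times rental price."""
    return gpu_hours * usd_per_hour

headline = final_run_cost(GPU_HOURS, RENTAL_USD_PER_GPU_HOUR)
print(f"${headline / 1e6:.3f}M")  # prints "$5.576M"

# Share of the low end of the annual estimate quoted in the entry ($500M).
share_of_low_estimate = headline / 500e6
```

The calculation covers only the final run at rental prices, which is why it lands near 1% of the low end of the $500M-$1B+ annual estimate once research, ablations, salaries, and capex are counted.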

What people get wrong about the leading Chinese open models: Adoption and censorship

TIER 4 May 6, 2025

Argues from insider hearsay that Western enterprises largely won't deploy Qwen/DeepSeek open weights despite leading performance, driven by information-hazard and code-backdoor fears rather than technical security, opening a real opportunity for permissively licensed Western open models (OLMo, Phi, Mistral). Pairs this with SpeechMap.ai data showing Chinese models are often less censored than expected on Western-relevant topics and that newer OpenAI models refuse more. Matters as a counterintuitive, evidence-backed reframing of the open-model competitive landscape and the adoption-vs-capability gap.

Chinese open modelsenterprise adoptioncensorshipopen-source licensingOLMo

Kimi K2 and when 'DeepSeek Moments' become normal

TIER 4 Jul 14, 2025

Analysis of Moonshot AI's Kimi K2, a 1T-param (32B active) permissively licensed open model competitive with frontier coding models, argued to be a second 'DeepSeek moment' showing High-Flyer is not unique, China is at/near the frontier, and the West keeps falling further behind on open models. Ties the release to algorithmic efficiency gains (similar token budget to DeepSeek V3), the Claude-compatible API driving fast adoption, and OpenAI's reactive open-model delay. A sharp release-analysis-plus-geopolitics piece.

Kimi K2open modelsChina AIDeepSeek momentfrontier models

Interviewing Ross Taylor on the state of AI: Chinese open models, scaling reasoning, useful tools, and what comes next

TIER 4 Jul 29, 2025

A deep, full-transcript interview with researcher Ross Taylor (ex-Meta, Galactica) covering why Chinese labs win on open models, why most lab failures are organizational rather than talent-bound, the limits of academic RL/reasoning research without compute, the rise of rubric-based rewards and the evals crisis, and AlphaEvolve as evidence the future isn't only RL. Substantive frontier-practitioner discussion with many concrete, reusable claims about training organizations and RL.

interview · RL/reasoning research · training organizations · evals crisis · AlphaEvolve

Ranking the Chinese Open Model Builders

TIER 5 Aug 17, 2025

A definitive, tiered survey ranking 19 Chinese open-model labs by the quantity and quality of their open contributions, from frontier players (DeepSeek, Qwen) through close competitors (Moonshot/Kimi, Zhipu) to rising and honorable-mention orgs, with per-lab profiles of strategy, architecture, and licensing. A high-value, lasting reference map of the Chinese open ecosystem that Lambert repeatedly cites and that drew direct engagement from the labs themselves.

China AI labs · open models · DeepSeek/Qwen · lab survey · reference

On China's open source AI trajectory

TIER 4 Sep 9, 2025

Examines whether China will double down on or change course from its open-source AI strategy, citing the 'AI+' plan, Premier Li's statements, Beijing's city-level model targets, and anecdotes about high engagement and morale in Chinese labs. It argues open models are shifting from 'soft power' to just 'power,' and that by 2026 Chinese open models may widen their performance/adoption lead with real geopolitical and regulatory consequences. A well-sourced policy-and-ecosystem analysis.

China AI policy · open source strategy · AI+ plan · geopolitics · open models

5 Thoughts on Kimi K2 Thinking

TIER 4 Nov 6, 2025

A structured five-point analysis of Moonshot's Kimi K2 Thinking (1T total / 32B active MoE, 256K context), framing it as the closest open models have come to the closed frontier since DeepSeek R1. Key points: China's faster release cadence (gap ~4-6 months), the shift from benchmaxing to real user behaviors, native INT4 quantization-aware training with benchmark scores reported at serving precision, 200-300 sequential tool calls with interleaved thinking, and intensifying pricing/mindshare pressure on US labs. A sharp, technically grounded model-release analysis.

Kimi K2 Thinking · open models · China · QAT INT4 · tool use · MoE

Interview: Ant Group's open model ambitions

TIER 4 Nov 12, 2025

A deep interview (with full transcript) with Richard Bian and Ling technical leads Chen Liang and Ziqi Liu of Ant Group's InclusionAI / Ant Ling, covering how a fintech giant became a frontier-lab contender in eight months, FP8 pre-training, modeling-strategy choices (size, multimodality), gaps in the open ecosystem, and why China is winning the open race. The transcript plus a substantive written overview of InclusionAI's model lineup give it real technical and strategic substance. A strong primary-source look inside a Chinese frontier lab.

Chinese AI labs · InclusionAI · Ant Ling · FP8 training · open models · interview

How much does distillation really matter for Chinese LLMs?

TIER 4 Feb 24, 2026

Responding to Anthropic naming DeepSeek, Moonshot, and MiniMax for industrial-scale distillation of Claude (16M exchanges via ~24K fraudulent accounts), Lambert quantifies the likely impact (150-400B tokens, meaningful but not crucial) and argues distillation is just a shortcut to more compute that everyone, including Ai2, effectively uses. His key technical point is that distillation matters less in the RL era, since on-policy generation dominates compute and can't be borrowed; he adds that restricting distillation across distributed access is near-impossible and far less impactful than GPU export controls. Timely, well-calibrated analysis of a high-profile dispute.

distillation · synthetic data · Chinese AI labs · Anthropic · US-China AI

The distillation panic

TIER 4 May 4, 2026

Argues 'distillation attacks' is a dangerous misnomer: distillation is an industry-standard technique (Nemotron, Olmo, and even xAI use it) and the real issue is a few Chinese labs jailbreaking/hacking APIs, which should be named as abuse. Warns that conflating the two is fueling a snowballing regulatory push (congressional bill, executive order) that could effectively ban Chinese open-weight models and cripple Western academics and small labs.

distillation · policy · China · open-weight regulation · terminology

Notes from inside China's AI labs

TIER 5 May 7, 2026

A firsthand report from visiting nearly every leading Chinese AI lab (Z.ai, Moonshot, Tsinghua, Meituan, Xiaomi, 01.ai, Alibaba, Ant), arguing China's culture makes it an ideal fast-follower: less ego, student-heavy teams, build-not-buy data, Claude-pilled developers, ownership mentality, and ambiguous-but-real government aid. A rare, high-signal primary-source account that reshapes how to read the open-model ecosystem.

China · AI labs · research culture · open models · field report

AI Progress, Scaling Limits, and the Takeoff Debate

2 tier-5 · 6 tier-4

Lambert's clear-eyed, repeatedly-revised position on how fast AI is actually moving. He separates the technical scaling claim (test loss still falls) from the product claim (user-visible chat gains are saturating), argues "emergent" abilities are mostly reliability gains, and pushes back on intelligence-explosion narratives -- from the AI-2027 software-singularity thesis to recursive self-improvement -- with his own "lossy self-improvement" framework: models become core to AI R&D, but friction keeps progress closer to linear than exponential. AGI, he argues, is an ungrounded symbol that bends to each speaker's goals.

AGI is what you want it to be

TIER 4 Apr 24, 2024

Argues AGI is an ungrounded symbol whose definition shifts to fit each speaker's goals, walking through academic, OpenAI-corporate, and 'modern test' definitions and showing how the Microsoft-OpenAI contract literally makes AGI a legal/financial question. Adds RL's outsized cultural hold on the discourse plus the real ceilings (power, data, agency) that bound any intelligence-explosion narrative.

agi · ai-discourse · openai · reinforcement-learning · compute-constraints

How scaling changes model behavior

TIER 4 Oct 9, 2024

Grounds the scaling debate mechanically: power-law loss decreases don't imply AGI, and 'emergent' abilities are mostly gains in reliability, illustrated by a semiconductor-yield analogy where extra nines of per-token reliability compound over long token/agent sequences. Separates what scaling de-risks (next-gen loss) from the AGI storytelling it has no bearing on.

scaling laws · emergence · reliability · AGI · agents
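
The yield analogy in 'How scaling changes model behavior' can be illustrated with a little arithmetic. The sketch below assumes every step succeeds independently with the same probability, which is a simplification introduced here and not a claim taken from the post; the function name and the 1,000-step horizon are likewise illustrative.

```python
def sequence_success(per_step_reliability: float, steps: int) -> float:
    """Chance that all `steps` succeed, assuming independent, identical steps."""
    if not 0.0 <= per_step_reliability <= 1.0:
        raise ValueError("reliability must be a probability in [0, 1]")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_reliability ** steps


# Each extra "nine" of per-step reliability over a hypothetical 1,000-step task.
for reliability in (0.99, 0.999, 0.9999):
    print(f"{reliability}: {sequence_success(reliability, 1000):.4f}")
```

Under these assumptions, moving from two nines to four nines takes a 1,000-step task from almost never finishing cleanly to finishing roughly nine times in ten, which is the compounding effect the summary describes.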

Scaling realities

TIER 4 Nov 14, 2024

Responding to 'scaling is dead' reports, Lambert argues both narratives are true: scaling laws (test loss vs compute) still work technically, but user-visible gains from a bigger GPT-5/Claude 4/Gemini 2-class chat model are slowing because chat goalposts are nearly saturated. The real frontier shifts to specialized models, agents, and new form factors like o1, leaving a large capability-product overhang. A clear, frequently-cited framing of the scaling debate that separates the technical claim from the product claim.

scaling laws · AI economics · GPT-5 · capability overhang · agents

Deep Research, information vs. insight, and the nature of science

TIER 4 Feb 12, 2025

Uses OpenAI's Deep Research to ask what AI does to science, distinguishing 'information' (which current models accelerate enormously) from 'insight' (genuinely novel discovery, which Lambert argues these models cannot yet produce: 'to an LLM, a novel discovery is indistinguishable from an error'). Contrasts grand AI-for-science projects (AlphaFold-style) with mass-market tools that compress the practice of normal science toward 'instantaneous PhDs,' and frames the coming strain on scientific institutions through Kuhn's paradigms. A thoughtful, distinctive essay on AI and the structure of scientific progress.

Deep Research · AI for science · insight vs information · Kuhn · scientific institutions

State of play of AI progress (and related brakes on an intelligence explosion)

TIER 5 Apr 30, 2025

A rebuttal to the AI 2027 software-singularity thesis. Lambert argues that benchmarks go vertical because labs hill-climb on training-goal evals (not held-out tests), that AI is broadening rather than narrowing into AlphaGo-style superintelligence, that ML research is bottlenecked on messy data work rather than compute-efficient implementation, and that multi-domain RL will resemble slow robotics progress. He proposes the shift of compute share toward inference as an empirical test. Matters as a clear, reusable framework for thinking about AI progress rates and the limits of recursive self-improvement.

intelligence explosion · AI 2027 · benchmark saturation · RL scaling · AI research bottlenecks

Contra Dwarkesh on Continual Learning

TIER 4 Aug 15, 2025

Rebuts Dwarkesh Patel's claim that the lack of continual learning is a fundamental bottleneck, arguing that continual learning is a systems/context problem rather than a learning-algorithm problem, solvable via memory, retrieval, and massive explicit context fed to reasoning models (he projects this arrives in 2026-2027). The core move is rejecting the demand that LLMs learn 'like humans,' paralleling his reasoning argument. A pointed, well-argued contribution to a prominent timelines debate.

continual learning · AGI debate · context/memory · reasoning models · Dwarkesh

Thoughts on The Curve

TIER 4 Oct 7, 2025

Lambert refines his AI-timelines argument from The Curve conference: automating the AI research engineer is plausible in 3-7 years but full automation of AI research and a singularity are not, because compute scaling and rising system complexity will make progress feel linear rather than exponential. He pairs this with reflections on AI 'jaggedness' (via Helen Toner) and renewed urgency on open models versus China. A substantive, well-reasoned position piece on the pace of progress.

AI timelines · automation of research · intelligence explosion · open models · compute scaling

Lossy self-improvement

TIER 5 Mar 22, 2026

Lambert introduces an original framework, 'lossy self-improvement' (LSI), as a counter to recursive self-improvement (RSI): models do become core to the AI development loop, but friction (research being too narrow, diminishing returns of parallel agents per Amdahl's law, resource/political bottlenecks) breaks every assumption needed for a closed, self-amplifying exponential. He argues progress will feel like a huge step yet remain closer to linear, with no fundamental change convincing him takeoff is imminent. A durable conceptual contribution to the takeoff debate that names a distinct alternative model.

recursive self-improvement · AI takeoff · AGI · automated research · scaling
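
The Amdahl's law point in 'Lossy self-improvement' is easy to make concrete. The sketch below uses the standard Amdahl formula; the 90% parallelizable fraction and the agent counts are hypothetical numbers chosen for illustration, not figures from the post.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Standard Amdahl's law: speedup when only part of the work parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical research workflow in which 90% of the work can be split
# across parallel agents and 10% remains strictly sequential.
for agents in (10, 100, 1000):
    print(f"{agents} agents: {amdahl_speedup(0.9, agents):.2f}x")
```

With a fixed 10% serial share, the speedup can never exceed 10x however many agents are added, which is one way to read the claim that parallel agents hit diminishing returns.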

Frontier (Closed) Model-Release Analyses

1 tier-5 · 14 tier-4

What a major closed release actually means, judged on more than benchmark deltas. From GPT-4 onward through GPT-5, the Claude line (3.5, 4, Fable, Mythos), Gemini 2.5 Pro, the Grok releases, Qwen 3, and gpt-oss, Lambert reads each launch for its strategic signal: GPT-5 as proof AI is on a normal technological path (performance + price + product), Claude 4 as Anthropic's narrowing bet on code, Gemini 2.5 as Google's full-stack second chance, and the "post-benchmark era" in which release-day scores barely convey signal. The latest entries cover the AGI-era governance flashpoints around Claude Fable/Mythos and undisclosed safety nerfs.

GPT4: The quiet parts and the state of ML

TIER 4 Mar 20, 2023

A release-analysis of GPT-4 that deliberately ignores the hype and dissects the under-discussed details: the absence of architecture/data disclosure, the OCR-grade vision encoder, the central role of clean data as the real engine, and the argument that RLHF is far more 'significant' to making the model usable than OpenAI's report admits. Lambert also reads the societal/safety framing critically (anthropomorphizing, EA-aligned red-teamers, terms-of-service-as-policy) and predicts OpenAI's research lead will erode as product demands degrade its research teams. A substantive frontier-release breakdown that sets the template for his later model-review posts.

GPT-4 · model release analysis · RLHF · data moats · AI safety discourse

OpenAI chases Her

TIER 4 May 15, 2024

Analysis of the GPT-4o launch (and Google's mirroring response) as both a product/culture inflection and a genuine architectural shift: 'omnimodal' models that natively ingest and emit audio/image/text tokens in one pass, eliminating the speech-to-text/LLM/text-to-speech handoff, evidenced by the new 200k-vocabulary tokenizer. Pairs the technical read with a pointed critique of OpenAI's safety posture turning product-first amid the Ilya/superalignment departures.

GPT-4o · omnimodal models · tokenizer · OpenAI vs Google · AI safety

Claude's agentic future and the current state of the frontier models

TIER 4 Oct 23, 2024

Hands-on first look at Anthropic's Claude 3.5 Sonnet (New) and the Claude Computer Use beta, judging the agent an intuitive 'aha' moment despite refusals, rate limits, and latency, then a four-factor map of the frontier: Anthropic best for chat/coding, Google's Gemini Flash best small/cheap (via documented distillation), OpenAI best at reasoning (o1) but unclear how to use, and all labs sitting on larger unreleased models used mainly to train smaller ones. A useful synthesis of the late-2024 frontier with the insight that model strengths mirror org cultures.

Claude · computer use / agents · frontier models · knowledge distillation · Gemini Flash

Grok 3 and an accelerating AI roadmap

TIER 4 Feb 18, 2025

Treats xAI's Grok 3 as evidence that frontier capability is no longer concentrated in OpenAI/Anthropic/Google and that competitive plus regulatory pressure (post-DeepSeek, post-Vance-AI-summit deregulation) will shrink the gap from training to release. Reads the thin 4-eval benchmark slate skeptically while crediting xAI's velocity, and steps back to question whether increasingly hard-but-unrepresentative evals actually track usefulness — calling for more transparency on internal evals. A good industry-trajectory essay framed around a release.

Grok 3 · xAI · AI competition · deregulation · evaluation

GPT-4.5: 'Not a frontier model'?

TIER 4 Feb 28, 2025

Reads OpenAI's strange GPT-4.5 release — the biggest model the public has tested, yet pitched as 'not a frontier model' — as the clearest signal yet that pretraining-scaling alone no longer yields obvious capability jumps. Estimates ~10x GPT-4 compute (~5-7T params), notes the gains showed up only in hallucinations/SimpleQA/emotional-intelligence rather than coding, and argues its real value is as a base for distillation into future reasoning models. A useful, honest model-release analysis (including a partial mea culpa on scaling expectations).

GPT-4.5 · scaling · pretraining · model release · distillation

Gemini 2.5 Pro and Google's second chance with AI

TIER 4 Mar 26, 2025

Calls Gemini 2.5 Pro the biggest eval jump in a while (a 40+ Elo lead on LMArena, the second-largest top-model jump in LMSYS history) and uses it to argue Google has finally righted its strategic error and has the only full stack (models, TPUs, cloud). Pairs the release with a clarifying framing that reasoning is now a spectrum (DeepSeek V3 0324, hybrid reasoners) and argues Google should stop chasing ChatGPT and win as the AI platform via products and Cloud. Matters as a strong release-plus-strategy analysis of Google's resurgence and the commoditization-vs-distribution debate.

Gemini 2.5 Pro · Google · reasoning spectrum · LMArena · AI platform strategy

Llama 4: Did Meta just push the panic button?

TIER 4 Apr 7, 2025

A pointed release analysis of Llama 4 (Scout/Maverick/Behemoth MoEs) arguing Meta botched it with bizarre Saturday timing, a misleading LMArena entry using an unreleased experimental chat model, an off-putting juvenile default model, an onerous license, and architectures aimed at the wrong (hyperscaler) audience rather than the GPU-poor open community Llama built. Concludes Llama is no longer the open standard and Meta's GenAI org faces cultural and strategic crisis. Matters as the definitive contemporaneous account of Llama's fall from open-default status.

Llama 4 · Meta · open models · MoE · LMArena gaming

Qwen 3: The new open standard

TIER 4 Apr 28, 2025

A release analysis of Alibaba's Qwen 3 suite (6 dense models 0.6B-32B plus two MoEs, mostly Apache 2.0, thinking on/off toggles), arguing it validates the DeepSeek R1 recipe and distillation and crowns Qwen as the best-of-both-worlds open standard: DeepSeek-tier peak performance with a full Llama-style size ladder. Flags open questions on robustness, the SFT-distilled smaller models, lack of native multimodality, and undocumented training-token budgets. Matters as the definitive contemporaneous read on the model that displaced Llama as the open default.

Qwen 3 · open models · DeepSeek recipe · distillation · MoE

Claude 4 and Anthropic's bet on code

TIER 4 May 27, 2025

A close release analysis arguing Claude 4 (Opus/Sonnet) is a reliability and reward-hacking-reduction step rather than a benchmark leap, with Anthropic narrowly curating coding/agentic evals and even regressing on some. Lambert questions the 'best coding model wins the AGI race' thesis and positions the major labs (OpenAI as consumer leader, Google as enterprise leader, Anthropic as the software-engineering specialist with a lower ceiling). Matters as a definitive read on Anthropic's strategic narrowing and on how benchmark presentation (parallel test-time compute, shaded bars) misleads.

Claude 4 · Anthropic · coding agents · benchmark analysis · AGI strategy

xAI's Grok 4: The tension of frontier performance with a side of Elon favoritism

TIER 4 Jul 12, 2025

A detailed (paywalled preview) breakdown of xAI's Grok 4: leading benchmarks (HLE, GPQA, ARC-AGI) and 10x RL compute, but middling vibe tests, an o3-like search-heavy style, weak product differentiation, and severe brand/culture risk. The thesis is that catching up on benchmarks is now easy while finding genuine usefulness as performance commoditizes is the singular challenge, and that o3's search reliability still hasn't been matched. Substantive release analysis even in preview form.

Grok 4 · xAI · RL scaling · frontier models · search agents

gpt-oss: OpenAI validates the open ecosystem (finally)

TIER 4 Aug 5, 2025

Analyzes OpenAI's first open-weight release since GPT-2 (gpt-oss 20B/120B, Apache 2.0) as a strategic move that validates open models, reveals more of OpenAI's stack (raw CoT, harmony format, instruction hierarchy), and undercuts its own API pricing. It catalogs open questions (MXFP4 quantization, anti-finetuning safety, tool-use messiness, no base models) and argues this shifts the 'second derivative' for US open models without ending the China gap. A substantive, well-structured release breakdown.

gpt-oss · OpenAI · open weights · MoE architecture · ATOM/open ecosystem

GPT-5 and the arc of progress

TIER 5 Aug 7, 2025

A definitive release analysis arguing GPT-5 — best-in-class across evals while cheap enough to serve ~1B users via a real-time router — proves AI is on a traditional technological path (performance + price + product) rather than an exponential takeoff, with abilities developing more slowly than products. It draws out implications for the AGI fundraising narrative, the product overhang, and Jevons-paradox adoption. A reference-quality read on what a major model release actually means.

GPT-5 · model release analysis · AI progress · router/system · AGI narrative

Gemma 4 and what makes an open model succeed

TIER 4 Apr 3, 2026

Lambert lays out a five-factor framework (performance/size, country of origin, license, day-one tooling, fine-tunability) for judging which open-weight model is worth investing in, arguing that benchmarks at release are a deeply incomplete story and that ease of adaptation is what actually decides adoption. He applies the framework to Google's Gemma 4, praising its move to an Apache 2.0 license and arguing its success will hinge on usability rather than a 5-10% benchmark swing. Useful because it reframes open-model evaluation around the post-release adoption realities most release coverage ignores.

open models · Gemma 4 · model licensing · fine-tunability · open-source tooling

Claude Mythos and misguided open-weight fearmongering

TIER 4 Apr 9, 2026

Pushes back on the wave of anti-open-weight panic after Claude Mythos's cyber capabilities, arguing critics conflate a static open-closed gap with domain-specific risk and ignore the deployment realities (huge model size, harnesses, ~$10K/day inference) that limit proliferation. Acknowledges cybersecurity could be a genuine red line and proposes three concrete study questions rather than a blanket ban that would just cede open-model leadership to another country.

open-weight risk · Claude Mythos · cybersecurity · AI policy · model deployment

Claude Fable 5 and new AI safety fables

TIER 4 Jun 9, 2026

A detailed analysis of Claude Fable 5's release: it is the smartest public model yet, but ships with safety classifiers that downgrade some queries to Opus 4.8 and, more damningly, a silent, undisclosed safeguard that degrades frontier-LLM-development requests via prompt modification/steering vectors/PEFT. Lambert frames the undisclosed nerf as categorically misaligned and a market-entrenchment tactic dressed as safety, arguing it strengthens the case for open models.

Anthropic · AI safety · model release · distillation · classifiers

Architectures, Multimodality, and Robotics Foundations

1 tier-5 · 18 tier-4

The technical substrate beneath the models, plus the RL/robotics roots of the author's thinking. The architecture pieces take non-attention models seriously (the landmark state-space/Mamba explainer, the Tri Dao / Michael Poli interview, hybrid RNN+attention, mixture-of-experts, model merging) and read multimodal/video releases (Sora, Gemini 1.5, Molmo, Apple Intelligence, robotic foundation models). The early RL essays -- written before the LLM era -- lay the conceptual groundwork Lambert later carried into RLHF: how all ML becomes RL, why reward is not enough, the slipperiness of reward specification, and RL as metaphor vs. tool vs. framework.

Clarifying RL: Obscure problem formulations and structure tradeoffs

TIER 4 Feb 19, 2021

An explainer on why RL is hard to define and deploy: the slipperiness of reward specification and agent/environment duality, the case for treating every reward as a slightly wrong interpretation, and three structural tradeoffs (model-based vs end-to-end, exploration vs offline, plus practical blockers like compute cost and walled-garden simulators). A genuinely useful conceptual map of RL's design tensions circa 2021.

RL fundamentals · model-based RL · offline RL · reward specification · end-to-end learning

How all machine learning becomes reinforcement learning

TIER 4 Jun 14, 2021

Develops the thesis that any iteratively-retrained, user-facing ML system (recommendation, churn, ad delivery) is effectively a reinforcement-learning loop, exhibiting RL's core properties — feedback, policy fragility, exploitation, replay-buffer/distribution-shift dynamics — even when engineers don't call it RL. Distinguishes 'coursework RL' (update before next rollout) from the 'applied reduction to RL' (deploy, collect, retrain weeks later) and illustrates with Facebook A/B-stopping and NextDoor model-tracking bloopers. A clear, original conceptual framing.

reinforcement learning · feedback loops · recommendation systems · distribution shift · MLOps

Reward is not enough

TIER 4 Jun 21, 2021

A critique of the RL 'reward hypothesis' and the Silver/Singh/Precup/Sutton 'Reward is Enough' paper, arguing that while a scalar reward may exist for any single agent, finding it is impractical (rewards are non-stationary, infinitely tunable, and break down under multi-objective and societal-scale optimization). Ties the argument to dopamine/neuroscience and to attention-metric optimization in social media, warning against companies covertly re-tuning users' reward functions. A substantive conceptual essay foreshadowing his later RLHF reward-modeling work.

reward hypothesis · RL theory · multi-objective optimization · AI ethics · neuroscience

Designing Societally Beneficial Reinforcement Learning Systems

TIER 4 Feb 8, 2022

Announces and summarizes the academic paper 'Choices, Risks, and Reward Reports', which formalizes three types of RL feedback (control, behavioral, exogenous) and four design risks (scoping the horizon, defining rewards, pruning information, training multiple agents), then proposes Reward Reports as living documentation for automated decision systems and maps governance/legal entry points. A genuine policy-framework contribution that grounds Lambert's recurring documentation thesis, presented here as a guided summary with reader-by-background pointers.

RL governance · Reward Reports · AI policy · reward hacking · documentation

Pretraining quadrupeds: a case study in RL as an engineering tool

TIER 4 Jan 16, 2023

Uses the 2018-2022 arc of quadrupedal locomotion papers (ANYmal, A Walk in the Park, egocentric-vision end-to-end work) to argue RL's real success metric is taking over a vertical through repeated independent reproduction, not flashy AlphaGo-style one-offs. Sim-to-real pretraining for legged robots is RL's first proven engineering vertical, and the field should tout that — not DeepMind game wins — as the model of success. A well-curated literature walk that doubles as a thesis about redefining what counts as RL working.

reinforcement learning · quadruped locomotion · sim-to-real · robotics · bitter lesson

Scaling laws for robotics & RL: Not quite yet

TIER 4 Feb 1, 2023

Argues that scaling laws have not meaningfully arrived for RL/robotics because the field hasn't laid out what scaling should even look like in decision-making, and that single-environment scaling (DreamerV3, the just-published single-agent scaling-laws paper) misses the point. Proposes three prerequisites — avoiding the closed-world 'games effect', integrating exploration to escape Gato-style data curation, and demonstrating generalization before scaling — concluding that environments, generalization, and exploration, not scale, are RL's real bottlenecks. A substantive, opinionated explainer on why the bitter lesson hasn't transferred to RL.

scaling laws · reinforcement learning · robotics · generalization · exploration

Three seasons of RL: Metaphor, tool, and framework

TIER 4 Feb 15, 2023

Co-authored framework distinguishing three ways RL is invoked — as a metaphor for general intelligence ('reward is enough', DL+RL=AGI), as an engineering tool (RLHF, quadruped locomotion), and as a framework for understanding any deployed feedback system (recommenders, predictive policing). Argues that conflating these three framings is the root of miscommunication about RL progress, and that documentation tools like Reward Reports must capture the time-evolving, optimization-intent dimension that Model Cards miss. A genuinely useful conceptual lens on RL discourse.

reinforcement learning · RL taxonomy · Reward Reports · documentation · AGI

Can robotics take off like GenAI? Moravec's paradox vs. scaling laws

TIER 4 Sep 29, 2023

Lambert (a robotics/RL PhD) traces Google Brain's scaling progression (QT-Opt → RT-1 → RT-2 vision-language-action models) and weighs it against Moravec's paradox — the observation that sensorimotor control is harder to engineer than reasoning. His thesis: robotics research is progressing well but environment-transfer is far harder than language transfer, so expect slow-burn domain-specific gains rather than a GenAI-style takeoff, and he's openly skeptical of humanoid-robot hype. A substantive explainer connecting RL/robotics history to scaling-law expectations, though only the public preview of this paywalled post is present.

robotics · Moravec's paradox · scaling laws · RT-2 / VLA · reinforcement learning

State-space LLMs: Do we need Attention?

TIER 5 Dec 20, 2023

Landmark deep-technical explainer of non-attention architectures, framing Mamba and StripedHyena reaching Llama-2/Mistral-7B territory as the moment to take SSMs seriously. Walks through the attention-vs-recurrence tradeoff, the SSM continuous/discrete formulation, Mamba's selection mechanism and hardware-aware scan, plus open challenges (GPU utilization, fine-tuning, in-context learning) — a durable reference on alternative LLM architectures.

state-space-models · mamba · stripedhyena · architectures · RNN

Interviewing Tri Dao and Michael Poli of Together AI on the future of LLM architectures

TIER 4 Dec 21, 2023

Full-transcript interview with Tri Dao (Mamba, FlashAttention) and Michael Poli (StripedHyena) on why attention scales quadratically, how SSMs/linear-RNNs/linear-attention converge on one mathematical core, and hardware-aware design. Both predict attention persists as a primitive but hybrid architectures and architectural innovation grow, with data quality (not architecture) setting the scaling-law slope.

state-space-models · mamba · attention · architectures · interview

Local LLMs, some facts some fiction

TIER 4 Jan 24, 2024

Argues that local on-device LLMs will win on latency (and power-efficiency scaling laws) rather than the often-cited personalization angle, since OS-level inference avoids the cloud round-trips and sandboxing bottlenecks that otherwise block real-time audio. Extends this into a strategic read of Big Tech (why Apple, Google's Pixel TPU, and Meta's open-source play are tied together), making it a substantive analysis of where the local-inference market is headed.

local LLMs · latency · on-device inference · Apple · Meta strategy

Model merging lessons in The Waifu Research Department

TIER 4 Jan 29, 2024

A technical explainer (co-written with AI2's Jacob Morrison) on why model merging — averaging the weights of separately fine-tuned models — actually works, tracing it back through stochastic weight averaging, linear mode connectivity, and flat-minima generalization, with a quote from SWA author Andrew Gordon Wilson and a visual literature review. Useful because it grounds a meme-y GPU-poor technique (popularized by the anime/'waifu' merging community) in decades of optimization theory and distinguishes it from ensembling and mixture-of-experts.

model merging · weight averaging · SWA · mixture of experts · literature review
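
The core operation described in the model-merging entry, averaging the weights of separately fine-tuned models, can be sketched in a few lines. This is a minimal illustration using plain Python lists in place of real tensors; the function name, the parameter names, and the toy values are all hypothetical, and practical merging tools typically layer more elaborate schemes on top of this uniform average.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def merge_by_averaging(models: List[Weights]) -> Weights:
    """Uniformly average parameters that share a name across models."""
    if not models:
        raise ValueError("need at least one model to merge")
    merged: Weights = {}
    for name in models[0]:
        # Line up the i-th entry of this parameter from every model.
        columns = zip(*(model[name] for model in models))
        merged[name] = [sum(values) / len(models) for values in columns]
    return merged


# Two toy "fine-tunes" of the same architecture (made-up numbers).
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
print(merge_by_averaging([model_a, model_b]))
```

Unlike ensembling, which keeps every model and averages their outputs, this produces a single set of weights with the same inference cost as one model.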

OpenAI’s Sora for video, Gemini 1.5's infinite context, and a secret Mistral model

TIER 4 Feb 16, 2024

Release-day technical summary of three simultaneous events: OpenAI's Sora (diffusion transformer over video patches, emergent physical-world simulation, likely YouTube + procedurally generated data), Gemini 1.5 Pro (MoE, up to 10M-token context implying a non-Transformer routing scheme, near-1.0-Ultra quality), and a stealth Mistral-next model in the arena. Matters as the definitive concise first-pass analysis of a landmark day for video generation and long context.

Sora · Gemini 1.5 · long context · diffusion transformer · Mistral

10 Sora and Gemini 1.5 follow-ups: code-base in context, deepfakes, pixel-peeping, inference costs, and more

TIER 4 Feb 19, 2024

A ranked ten-item follow-up digging into Sora and Gemini 1.5: deepfake/Gaussian-splatting coherence tests, whole-codebase-in-context use, Gemini's contamination and citation oddities, YouTube token-count Fermi estimates, Sora as a world-simulator for model-based RL/robotics, Midjourney style overlap, inference costs, and resulting pressure on Llama/Mistral. Matters as a dense set of practical and technical observations that go well beyond the release-day overview.

Sora · Gemini 1.5 · long context · world models · inference costs

A realistic path to robotic foundation models

TIER 4 Jun 5, 2024

Argues robotics is undergoing the same 'everything is a token / transformerification' shift as LLMs, with new foundation-model labs (Physical Intelligence) aiming to be the 'OpenAI for robotics' via cross-robot, language-promptable policies. Lays out the four gating factors — multi-robot policies, plain-text prompting, teleoperation markets, and crucially the manufacturing cost of robots and data scarcity — explaining why prior robot-learning startups were too early and what has to break for this generation to succeed.

roboticsfoundation modelsPhysical Intelligenceembodied AIscaling

AI for the rest of us

TIER 4 Jun 12, 2024

Reads Apple Intelligence as a bet that AI follows prior tech revolutions (incremental, embedded, privacy-preserving) rather than a winner-take-all race, then digs into the disclosed technicals: a ~3B on-device model plus a GPT-4-class server model, adapter/quantization orchestration, and notably two novel post-training algorithms (rejection sampling with a teacher committee, and RLHF via mirror descent policy optimization with a leave-one-out advantage estimator). It matters because Apple putting a non-PPO RLHF algorithm into a shipping product signals current PPO recipes aren't optimal and validates on-device models.

Apple IntelligenceRLHFMDPO vs PPOon-device modelspost-training

Interviewing Tim Dettmers on open-source AI: Agents, scaling, quantization and what's next

TIER 4 Nov 7, 2024

A full, deep technical interview with Tim Dettmers (QLoRA/bitsandbytes author, then Ai2, incoming CMU professor) covering the state of open-source models, agents and SWE-Bench, doing high-impact research while GPU-poor, model merging and optimization landscapes, knowledge distillation, state-space models vs transformers, and the fundamental limits of quantization. Substantive technical discussion from a leading open-source researcher with full transcript. Worth reading for its grounded takes on efficiency and academic AI.

interviewquantizationopen sourceagentsmodel merging

Interviewing Eugene Vinitsky on self-play for self-driving and what else people do with RL

TIER 4 Mar 12, 2025

A deep technical interview (with full transcript) with NYU professor and RL researcher Eugene Vinitsky on scaling self-play in simulated multi-agent RL — including the Gigaflow self-driving result where a single shared-weight policy with randomized rewards produces human-like driving without ever seeing humans drive. The conversation distills how self-play, inference-time compute, and RL scaling laws relate to the current RL-for-LLM takeoff, plus LLMs as in-context reward designers. Substantive enough on the research to be worth reading beyond the episode notes.

reinforcement learningself-playself-drivingmulti-agent RLpodcast

Olmo Hybrid and future LLM architectures

TIER 4 Mar 5, 2026

Announcing Ai2's Olmo Hybrid (a 7B attention+GDN model), Lambert explains why hybrid RNN/attention architectures are being adopted everywhere at once and shares the accompanying paper's theory that hybrids are strictly more expressive than transformers or GDN alone and that this translates into better token efficiency (~2x pretraining gain). He is candid about the mixed post-training results and the 'horrific' open-source tooling that currently erases inference-throughput gains, and speculates that the odds of a frontier closed model being an RNN are roughly a coin flip. Strong technical explainer grounded in a real release and paper.

hybrid architecturesGated DeltaNetOlmoscaling lawspost-training

Evaluation and Benchmarks

0 tier-5 · 11 tier-4

A sustained critique of how the field measures models -- and why it keeps getting fooled. Lambert shows how ChatBotArena rewards style and low refusal over capability, how Big Tech "evals" are marketing that closed labs can't run fairly on rivals, and how the whole enterprise is sinking into "evaluation quicksand" of contamination, private suites, and unreproducible scores. He defends the open leaderboard as a discovery tool while announcing RewardBench, the first reward-model benchmark, and argues that deploy-and-chat "vibes" evaluation -- and ultimately causal/A-B testing -- is where real model research happens.

Evaluating and uncovering open LLMs

TIER 4 May 31, 2023

A thorough early treatment of LLM evaluation: the limits of the two contemporary leaderboards (HuggingFace Open LLM vs. LMSys ChatBot Arena), prompt/tokenization sensitivity, paper-vs-reproduction gaps and eval leakage, and the biases of GPT-4-as-judge and human position bias. Argues for curated held-out per-task prompt sets and ultimately causal/AB-tested evaluation—a substantive, durable explainer on a topic central to the archive.

evaluationleaderboardsChatbot ArenaLLM-as-judgebenchmarks

In defense of the open LLM leaderboard

TIER 4 Sep 13, 2023

Responding to SemiAnalysis's claim that HuggingFace's Open LLM Leaderboard 'actively hurts' open source, Lambert (a contributor) explains how the leaderboard actually functions as a discovery and oversight tool, who uses it across five user segments, its community moderation against benchmark-gaming, and the often-misunderstood probability-based scoring of multiple-choice evals. He rebuts the GPU-rich/poor framing's dismissal of fine-tuning-focused companies and previews the fall open-model wave. A genuinely useful explainer of LLM evaluation mechanics and open-ecosystem dynamics.

open LLM leaderboardevaluationHuggingFacebenchmark gamingSemiAnalysis rebuttal

How the Foundation Model Transparency Index Distorts Transparency

TIER 4 Oct 26, 2023

Detailed critique (co-written with EleutherAI) of Stanford CRFM's Foundation Model Transparency Index, arguing it measures product documentation rather than true transparency, can't verify closed-model claims, uses a misleading scorecard, and is systematically biased against open models. An influential argument reframing transparency as openness in good faith; public-preview only but the visible critique is substantive.

transparencyFMTIopen-vs-closedAI-policyevaluation

The interface era of AI

TIER 4 Nov 15, 2023

Argues that automated benchmarks miss what matters and that the deploy-and-chat workflow (vibes-based evals, MT-Bench/AlpacaEval as proxies) is now central to good model research, giving an engineering edge to teams with fast checkpoint-to-endpoint tooling. A useful framing of why deployment engineering and real interaction, not eval numbers, drive progress — and why this advantages cheap small models.

evaluationvibes-evalsdeploymentresearch-workflowtooling

Big Tech's LLM evals are just marketing

TIER 4 Dec 13, 2023

Argues that Big Tech benchmark comparisons (Microsoft's Medprompt MMLU plot, Google's Gemini-vs-GPT4 chart) are marketing, not science, since closed labs cannot fairly evaluate competitors' models they can't access, and contamination plus undisclosed training data make scores untrustworthy. Recommends judging models by hands-on chat, pushing in-context learning over prompting hype, and building on Eleuther's eval harness for the open ecosystem.

evaluationbenchmarksMMLUopen-vs-closedin-context-learning

How to cultivate a high-signal AI feed

TIER 4 Feb 28, 2024

A ten-point practical heuristic guide for separating signal from noise in ML content — prioritizing demos/model access, depth-vs-breadth focus, reproducibility, no-free-lunch sanity checks (scaling beats small, simple beats complex), distrust of leaderboard-only claims, and knowing publisher incentives. A genuinely useful, transferable explainer for anyone trying to keep up with AI releases.

information dietevaluation literacyreproducibilitymediadeep learning heuristics

Evaluations: Trust, performance, and price (bonus, announcing RewardBench)

TIER 4 Mar 20, 2024

Argues LLM evaluation is getting harder along three axes — trust (few orgs have no incentive to cook the books), rising price (human and API-credit costs now price out academics and open labs), and vibes-based judgment — with a case for government-funded hidden eval sets. Bundles the launch of RewardBench, the first reward-model evaluation benchmark, covering 30+ RMs across chat/safety/code/math.

evaluationrewardbenchreward-modelsrlhfbenchmarks

ChatBotArena: The peoples’ LLM evaluation, the future of evaluation, the incentives of evaluation, and gpt2chatbot

TIER 4 May 8, 2024

A comprehensive insider analysis of ChatBotArena: what it is, its Elo statistics and noise floor, who actually needs it (attention-economy chat companies, not enterprise), how style/length 'doping' games it, the 'preference data gap' versus what labs buy from Scale, and where LLM evaluation is heading. Even as a paywalled preview, the visible argument is a rich, reference-quality treatment of the most influential public LLM eval; docked from 5 only because the incentives/gpt2chatbot section is truncated.

ChatBotArenaLLM evaluationElo / preference datastyle biasLMSYS

GPT-4o-mini changed ChatBotArena

TIER 4 Jul 31, 2024

Argues that GPT-4o-mini's surprise top-tier ChatBotArena ranking exposes how the leaderboard rewards style (lists, line breaks, upbeat tone) and low refusal rates rather than peak capability, and surveys partial fixes (hard-prompt categories, Scale's leaderboard) and their conflicts of interest. It matters as an influential, well-evidenced critique of crowd-vote LLM evaluation and the style-vs-substance problem in RLHF.

evaluationChatBotArenaRLHFmodel styleGPT-4o-mini

Interviewing Riley Goodside on the science of prompting

TIER 4 Sep 30, 2024

A full-transcript masterclass with Scale AI's Riley Goodside on why prompting matters and how it interacts with post-training and evaluation: ChatGPT/ChatML as the biggest event in prompt-engineering history, prompting as the experimental edge that gets absorbed into models, o1's odd steerability, and equal-inference-cost eval comparisons. Rich, durable reference on prompting craft and eval methodology.

promptingevaluationo1post-traininginterview

Building on evaluation quicksand

TIER 4 Oct 16, 2024

Argues that LLM evaluation has become unreliable 'quicksand': closed labs tune private eval suites that can't be reproduced, the open community has failed to standardize tooling, and synthetic data introduces new contamination vectors. Calls for the open ecosystem to converge on a common evaluation standard and previews an economy of expensive expert-built evals (Humanity's Last Exam).

evaluationbenchmarkscontaminationopen modelssynthetic data

AI Policy, Governance, and the Definitions of Open

0 tier-5 · 12 tier-4

Where Lambert turns researcher knowledge into policy argument. The throughline: regulate deployed systems and known harms, not model weights or compute thresholds -- so SB 1047 is on the wrong side of history, export controls should target compute not weights, and Model Specs are the right transparency-first abstraction for light-touch regulation. He co-authored Ai2's OSTP comment and reads the White House AI Action Plan, defends the NAIRR as public AI infrastructure, untangles the OSI open-source-AI definition, and warns that the AGI-era of governance (gating releases on vibes) will eventually hit open models too. Includes the policy-focused interviews with Dean Ball, Arvind Narayanan, and Andrew Trask.

AI researchers' challenges: atomic analogies and strained institutions

TIER 4 Sep 6, 2023

Lambert dismantles the popular 'Manhattan Project for AI' analogy (AI has no clear target unlike the bomb, leaves no radioactive signature for monitoring, and emerges amid distrust of academic institutions and arXiv-driven mass participation) while drawing real lessons about scientists losing political sway. The second half is a sharp analysis of how social-media algorithms, embellishment incentives, and incomplete corporate participation are destabilizing AI research norms and credentialing. A substantive sociology-of-AI-research essay with several transferable observations.

Oppenheimer analogyAI governanceresearch normsscientific distributioninstitutions

Interviewing Dean Ball on AI policy: CA SB 1047, upcoming AI disaster response, Llama 3 405B, Chinese open-source AI, and scaling laws

TIER 4 Jun 27, 2024

A deep, full-transcript interview with policy scholar Dean Ball that walks through SB 1047's mechanics (Frontier Model Division, the $100M threshold, lowerable fine-tune FLOP thresholds, mandatory 2028 audits) and the state-legislature politics that determine its fate, then ranges over how a minor AI 'disaster' would likely be blamed on AI without forensic proof, whether Meta releases the 405B, export controls and China's compute ceiling, synthetic-data license enforceability, and a skeptical read of scaling-law/intelligence framings. Matters because the transcript carries substantive, well-informed analysis from a frontier AI-policy voice rather than a thin episode promo.

AI policySB 1047interviewChina export controlsscaling laws

SB 1047, AI regulation, and unlikely allies for open models

TIER 4 Jul 17, 2024

An influential policy argument that California's SB 1047 is on the wrong side of history by regulating models (and compute/FLOP thresholds) rather than deployed systems, with the standout principle that developers should be liable only for known harms native to the model, not downstream derivatives. It also reads the broader cultural shift — antitrust pressure on big tech, China/national-security stakes, and the politicization of open source — as unlikely allies of open weights, and closes with a concrete list of what Lambert would actually regulate (CSAM/deepfakes, human-content watermarking, PII deletion, researcher access). Matters as a coherent open-models-favorable governance worldview at a pivotal regulatory moment.

SB 1047AI regulationopen weightsantitrustcompute thresholds

A post-training approach to AI regulation with Model Specs

TIER 4 Sep 9, 2024

Argues that AI regulation should move away from compute thresholds toward auditing post-training, with mandated Model Specs (OpenAI-style behavior documents) as the regulatory abstraction that distinguishes intentional from unintentional model behaviors. It matters as an original, transparency-first policy framework that bridges safety and acceleration camps without forcing labs to disclose trade secrets.

AI policymodel specspost-trainingregulationtransparency

AI Safety Culture Confronts Capitalism

TIER 4 Oct 2, 2024

Argues that AI Safety culture is structurally losing to capitalism as labs need ~$100B for next-gen models, forcing them to 'appear normal' to investors (OpenAI shedding nonprofit governance) and dissolving the structural incentives that once distinguished them. Pairs this with SB 1047's veto as a litmus test, concluding that staying inside established labs is more impactful than repeatedly forking new orgs.

AI safetyAI policySB 1047OpenAIindustry economics

Interviewing Andrew Trask on how language models should store (and access) information

TIER 4 Oct 10, 2024

A full-transcript interview with OpenMined's Andrew Trask on secure enclaves for pre-release model testing (Anthropic/UK AISI), structured transparency, data-store language models as a better path to 'open training data,' and running models on air-gapped networks. Substantive on privacy-preserving infrastructure and the future governance of model access.

secure enclavesprivacystructured transparencyOpenMinedAI governance

Interviewing Arvind Narayanan on making sense of AI hype

TIER 4 Oct 17, 2024

A full-transcript interview with the AI Snake Oil co-author on disentangling AI hype from reality, the capability-reliability gap in agents, why generality is a red herring for economic AGI, and diffusion as a speed limit on innovation. Substantive on AI policy, agent evaluation (CORE-Bench), scaling, and the predictive-vs-generative-AI distinction.

AI hypeagentsAGIAI policyevaluation

Saving the National AI Research Resource & my AI policy outlook

TIER 4 Nov 13, 2024

A policy argument to save the National AI Research Resource (NAIRR), which loses funding in January 2025 absent congressional action, framing it as critical infrastructure to keep academic and non-profit AI relevant against big-tech buildout. Distinguishes an 'AI' vs a narrower 'language model' research resource, makes the case for prioritized compute allocation over 'democratizing AI', and rounds up post-election policy risks (state legislation, Elon vs Trump, anti-open-source FUD, agents). A substantive, well-sourced policy piece with lasting relevance to the public-AI-infrastructure debate.

AI policyNAIRRpublic computeopen sourceregulation

Transparency and (shifting) priority stacks

TIER 4 Apr 28, 2025

Uses OpenAI's undocumented quiet GPT-4o update to argue AI is becoming a 'normal technology' where product trumps research transparency, then lays out a 'priority stack' framework for understanding why different actors want different kinds of openness (capability, base-model, reward-model, training-spec, structured access). Distinguishes concentration-of-power concerns from intelligence-explosion concerns as the drivers of transparency demands. Matters as a structured taxonomy of transparency and a clear read on OpenAI's shift away from documented releases.

transparencyopennessOpenAIpriority stackAI governance

The White House's plan for open models & AI research in the U.S.

TIER 4 Jul 23, 2025

An annotated read of the White House AI Action Plan's open-model and AI-research provisions, by a co-author of Ai2's official OSTP comment, arguing the plan rightly endorses more investment in open models (compute markets, NAIRR, NTIA adoption drives) for the right reasons. Lambert flags the gap it leaves — building strong fully open models, not just dispersing compute — plus missing immigration policy and the slippery 'Chinese values' evaluation mandate. Useful as an expert policy reading tied directly to his American DeepSeek thesis.

AI policyopen modelsWhite House Action Planresearch computeUS-China

Dean Ball on open models and government control

TIER 4 Mar 6, 2026

A podcast (full transcript) with policy analyst Dean Ball arguing that the U.S. Department of War's designation of Anthropic as a supply-chain risk, while bad, points toward open models being the stable 5-10 year equilibrium for power centers, because no global entity will let a single U.S. company control its relationship to the most important technology. The discussion covers funding open models against a widening frontier gap, sovereign AI and foreign distrust of closed models, nationalization risk, and Ball's idea of financializing compute. Substantive policy reasoning, though it is an exploratory conversation rather than a definitive argument.

AI policyopen modelsgovernment controlsovereign AIAnthropic

Welcome to the AGI era of AI governance

TIER 4 Jun 14, 2026

Argues that the U.S. government forcing Anthropic to suspend foreign access to Claude Fable/Mythos is the 'starting gun' of a new AGI-era of AI governance, where releases get gated on vibes by a technically thin executive branch. Lays out lasting positions (export bans are bad policy, Anthropic's fear-mongering accelerated this, the open community shouldn't celebrate) and warns the same heavy-handed treatment will eventually hit open models.

AI governanceAnthropicexport controlspolicyopen models

AI Economics, Moats, and Industry Structure

0 tier-5 · 9 tier-4

How value actually accrues in AI. Lambert argues the model itself is not a moat -- weights leak, get distilled, get commoditized -- so durable advantage lives in data, distribution, inference economics, and product. He applies Aggregation Theory to ask whether inference-time compute breaks zero-marginal-cost economics, sizes the real exponential growth of inference usage, dissects the data-foundry (Scale AI) and alignment-as-a-service businesses, reframes the "data wall" as an open-ecosystem problem rather than a frontier one, and -- in "Burning out" -- argues the binding constraint on frontier AI is shifting from financial to human capital.

Predicting machine learning moats

TIER 4 Dec 28, 2022

Argues that in ML the model itself is not the moat — models leak, get fine-tuned, or get distilled via API outputs — so data, infrastructure, and feedback loops are the durable advantages. Introduces emergent behavior as a new kind of moat: because abilities appear nonlinearly past data thresholds, concentrated user data (and the feedback loop it feeds) can confer lasting advantage in a way the 'data moats are fake' a16z thesis didn't anticipate. A useful, still-relevant business-strategy framework for AI companies.

ML moatsdata advantageemergent abilitiesmodel distillationAI business strategy

Alignment-as-a-service: Scale AI vs. the new guys

TIER 4 Feb 7, 2024

Analyzes Scale AI's ~$750M annualized revenue from selling RLHF/human-preference data and whether any moat protects a labor-heavy data-services business, then sketches 'alignment-as-a-service' (AaaS) as a startup category built on recurring model-monitoring and continual-training rather than one-off data labeling. Matters because it frames the economics of the RLHF supply chain and the existential question of what happens to these businesses if synthetic data or stronger base models reduce the need for human preference data.

RLHFScale AIdata labelingAI economicssynthetic data

Model commoditization and product moats

TIER 4 Mar 13, 2024

Argues that as GPT4-class capability is replicated across Gemini, Claude 3, Mistral Large, and Inflection, the model itself is no longer a moat — durable advantage shifts to product, sticky user habits, distribution, cheap inference as a loss-leader, and eventually advertising economics. Matters because it reframes the 'no moat' debate around system/product moats versus model moats and flags that open ecosystems still face an unsolved data-coordination ('sinkhole') problem in preference alignment.

model commoditizationmoatsopen vs closedinference economicsRLHF data

We aren’t running out of training data, we are running out of open training data

TIER 4 May 29, 2024

Reframes the 'data wall' narrative: frontier labs aren't out of data, the open ecosystem is — closed labs can self-generate ~1 trillion synthetic tokens/day from inference traffic for a few million dollars, sign exclusive licensing deals (Reddit, news, Stack Overflow) as legal moats, and use search/best-of-N to manufacture better tokens. The thesis matters because it predicts the data bottleneck pinches open players and small labs (Mistral, Cohere) rather than OpenAI/Google, with human data as the last expensive frontier.

data wallsynthetic datadata licensingopen vs closedscaling

Futures of the data foundry business model

TIER 4 Sep 11, 2024

Dissects Scale AI's 'data foundry' business as caught between rising RLHF demand and synthetic data eating the human-instruction market, with a rare insider account of what procuring human data from vendors actually involves. Concludes that data foundries are aggregators (Uber/Airbnb-tier valuations, not Apple/Meta) increasingly exposed to synthetic data and Nvidia capturing the margins.

data foundryScale AIRLHFsynthetic dataAI industry economics

Where inference-time scaling pushes the market for AI companies

TIER 4 Mar 5, 2025

Applies Ben Thompson's Aggregation Theory to ask whether inference-time compute breaks the zero-marginal-cost economics that powered internet giants, concluding that consumer chat will stay aggregator-shaped (ad-supported, near-zero cost) while inference-heavy, high-value tasks push AI companies toward platform economics and a 'barbell' market. Argues parallel sampling plus strong verifiers (not just longer single generations) is the real scaling axis, invoking Jevons' paradox and Noam Shazeer's 'making models more expensive is worth it.' A solid economics-of-AI explainer connecting reasoning models to business structure.

inference-time scalingaggregation theoryAI economicsverifiersbusiness models

Managing frontier model training organizations (or teams)

TIER 4 Mar 19, 2025

Drawing on off-the-record talks with frontier-lab leadership plus a detailed Tulu 3 case study, Lambert lays out how to structure model-training teams: keep core modeling teams small, preserve bottom-up information flow, scale only where co-design isn't needed, and avoid the politics that make large orgs 'unable to put it together.' The Tulu 3 walkthrough (project length, researcher/engineer ratio, ~1000 checkpoints, 8B/70B/405B iteration split, compute as the binding constraint) is a rare concrete look at how a mid-sized post-training effort actually runs. Valuable operational knowledge that is otherwise a closely guarded secret.

org designtraining teamspost-trainingTulu 3management

People use AI more than you think

TIER 4 May 21, 2025

Uses Google I/O's token-throughput slide (480T+ tokens/month, up from ~10T a year prior) plus Azure and OpenAI figures to argue AI inference usage is growing exponentially and is largely profitable, with reasoning models and code agents about to push per-task token use 10-100x higher. The framing that Google now processes more tokens each month than Common Crawl contains, and that the internet is being rebuilt as AI-first, is a useful quantitative grounding. Matters for sizing the real economics and trajectory of AI adoption beyond saturated benchmarks.

AI usage growthinference economicstoken throughputGoogle I/Oindustry scale

Burning out

TIER 4 Oct 25, 2025

A reflective essay on the brutal work culture of frontier AI (996/997/002, 100-hour weeks) and why the closing window to stay at the cutting edge is real, not just perceived. Lambert draws an elite-athletics analogy (team culture beats individual talent) and advances a notable thesis: the binding constraint on AI is shifting from financial capital to human capital, since stabilizing and replicating known-good recipes takes focused grinding that money can't shortcut. This is a structural argument for why from-scratch labs (SSI, Reflection) face long odds. Substantive industry analysis beyond the personal frame.

AI work cultureburnouthuman capitalfrontier labsteam culture

Coding Agents and the Agentic-Work Transition

0 tier-5 · 9 tier-4

The most recent capability inflection and what it does to how we work. Lambert calls coding the epicenter of AI progress -- the last broadly tractable frontier and the template for everything else -- and tracks the CLI-agent jump (Claude Code, Codex) where the same model yields wildly different agent quality depending on scaffolding. His "Get good at agents" thesis -- agents push humans up the org chart, being good at using AI beats working hard -- pairs with concrete workflow advice and a taxonomy of the overloaded term "agent," from tool-use to fully agentic. ChatGPT's turn into the vertical "Agentic App" rounds out the cluster.

Code: green pastures for LLMs

TIER 4 May 25, 2023

Makes the case that code is the highest-value, most sustainable frontier for LLMs and RL: code-in-pretraining likely drives chain-of-thought reasoning, and code's computable correctness (syntax/runs/tests) is an ideal RLCF reward signal, while flagging risks like accumulated technical debt and lagging code-eval tools. A prescient, substantive argument given how central coding later became to AI progress.

code generationCopilotchain-of-thoughtRLCFreasoning

LLM agents and integration dead-ends

TIER 4 Jul 12, 2023

Argues that the real blocker for LLM agents is not capability but robust enterprise integration—security, trust/reliability, and dramatic failure modes when a model controls reputation and finances—surveying the 2023 landscape (ChatGPT Plugins, LangChain, Adept, Lindy) and diagnosing the 'LangChain debacle' as a symptom of the integration wall. A substantive, well-reasoned early analysis whose 'agents are the self-driving cars of digital tech' framing aged well.

LLM agentsenterprise integrationLangChainsecurityRAG

The AI agent spectrum

TIER 4 Dec 18, 2024

A conceptual framework arguing the term 'AI agent' is overloaded and proposing a spectrum from tool-use LMs to orchestration LMs to fully agentic LMs, with a six-step 'agent cartography' of increasing complexity. It repurposes an RL-systems regulation framework (scoping the horizon, defining utility, pruning information, multiple agents) to give agents crisper definitions ahead of the 2025 agent push.

AI agentstaxonomytool useorchestrationframework

Coding as the epicenter of AI progress and the path to general agents

TIER 4 Sep 18, 2025

Argues coding is the last broadly tractable domain of continued frontier progress and the template for how AI capabilities will be built and absorbed elsewhere, with CLI agents (Claude Code, Codex) as the biggest recent capability jump. It notes that product/scaffolding now matters as much as the model (same Claude model, wildly different agent quality) and that coding is shifting toward asynchronous, autonomous PR-generating agents. A substantive, evidence-rich essay on the coding-agent inflection.

coding agentsClaude Code/CodexGPT-5-Codexautonomous PRsAI progress

Thinking, Searching, and Acting

TIER 4 Sep 22, 2025

Proposes a clean three-primitive framework for modern reasoning models — thinking (reasoning traces), searching (non-parametric knowledge), and acting (code/tool execution) — and argues these will outlast static model weights as the durable technology layer. It reframes hallucinations, tokenomics, and the open-vs-closed tool-integration gap through this lens. A genuinely useful conceptual framework that recurs across his other posts.

reasoning modelstool usesearchagentsinference infrastructure

ChatGPT: The Agentic App

TIER 4 Sep 30, 2025

Analyzes OpenAI's 'Buy It in ChatGPT' launch and the Agentic Commerce Protocol as the start of ChatGPT becoming the one vertical 'Agentic App,' arguing that where models act (store networks, APIs) now matters as much as the weights. It frames model specialization (OpenAI's consumer/search vertical vs Anthropic's coding bet) as splitting the industry and reducing the number of model releases. A sharp, timely strategic read on AI monetization and the agentic app paradigm.

agentic appsOpenAImonetization/commercemodel specializationChatGPT

Get Good at Agents

TIER 4 Jan 21, 2026

Lambert argues that applying old work habits to coding agents is fundamentally wrong: the shift to Claude Code with Opus 4.5 changes the question from how to solve a problem to what to work on, pushing humans toward more open-ended, ambitious, asynchronous direction-setting while agents do the hard work in parallel. His thesis — 'agents push the humans up the org chart' and 'being good at using AI is a better moat than working hard' — is paired with a concrete workflow (GPT 5 Pro for planning, Claude Code for implementation) and a curated reading list. Influential framing of the agent-native work transition.

coding agentsClaude Codeagentic workflowsfuture of workproductivity

Opus 4.6, Codex 5.3, and the post-benchmark era

TIER 4 Feb 9, 2026

Comparing Claude Opus 4.6 and GPT-5.3-Codex, Lambert finds Codex 5.3 has become far more Claude-like and is the better top-end coding model, while Opus retains a usability edge for broad, loosely-specified tasks. The larger argument is that we have entered a 'post-benchmark era' where release-day scores barely convey signal (he cites Gemini 3's two-month fall from coronation), validating Anthropic's early bet that real-world agentic gains matter more than evaluation deltas. Worth reading for the post-benchmark thesis and the agent-comparison method.

Claude Opus 4.6Codexcoding agentsbenchmarksmodel evaluation

GPT 5.4 is a big step for Codex

TIER 4 Mar 18, 2026

A hands-on review arguing GPT 5.4 is a meaningful step that puts OpenAI back in the agent wars, removing the 'death by a thousand cuts' (failed git ops, context anxiety) that drove Lambert off prior Codex versions. He contrasts model philosophies: Claude reads intent and has warmth/character while GPT 5.4 is meticulous and precisely instruction-following, suited to the 'master agent coordinator,' and notes OpenAI's edge on rate limits, reasoning efficiency, and context management. Substantive because it articulates the multi-axis (correctness/usability/speed/cost) view of agentic models rather than a single benchmark.

GPT 5.4Codexcoding agentsmodel comparisonagentic AI