BREAKING: Zuck just announced that Meta will be open sourcing Llama 2 with Microsoft
News & Digests
470 issues · 58 keepers · 12 tier-5 · 46 tier-4
4 tier-5 · 7 tier-4
The release cadence that defined the era, with each new flagship resetting the capability ceiling and the competitive map. Llama's open weights seeded a parallel ecosystem; Claude and Gemini turned a one-horse race into a three-lab frontier; o1 reframed the whole roadmap around test-time compute; and by GPT-5.2 the launches had become explicit moves in a head-to-head with Google. Read in order, these issues trace the arc from "OpenAI's lead is unassailable" to "the frontier is a contested, multi-lab board."
Meta's release of Llama 2 as freely licensed open weights — the foundational event for the entire open-source LLM ecosystem that followed.
Gemini 1.0 was Google's flagship multimodal model debut, the foundation of its entire Gemini product line.
The Bard-to-Gemini rebrand plus Ultra 1.0 and a dedicated app consolidated Google's consumer AI under one durable brand.
Claude 3 (Opus/Sonnet/Haiku) launch put Anthropic at the frontier and established the model family still iterated on today.
GPT-4o was a landmark omni-modal flagship that brought real-time voice/vision and became OpenAI's default model for more than a year.
Release of Claude 3.5 Sonnet, the model that established Anthropic as a frontier coding/reasoning competitor to GPT-4.
Meta's Llama 3.1 405B, the first freely downloadable GPT-4-class model, a landmark for open-weight AI.
Launch of o1-preview, the first reasoning/test-time-compute model, which redefined the frontier and the whole industry's roadmap.
Launch of GPT-4.5, OpenAI's largest pre-training-scaled model, a notable marker of the diminishing-returns turn that pushed the field toward reasoning models.
Google launches Gemini 3 powering both the Gemini app and AI Mode in Search, claiming 72% on a key benchmark, with 650M MAU closing on OpenAI's 700M weekly users and free Pro for US students, positioning it to convert search dominance into AI dominance amid bubble questions. Also covers Bezos co-leading the $6.2B 'physical AI' Project Prometheus, and Inception Point AI mass-producing ~3,000 AI podcast episodes/week at ~$1 each. The flagship Gemini 3 launch is one of the most consequential events in the batch.
OpenAI launches GPT-5.2 (Instant, Thinking, Pro tiers) with gains in tool use, long-context, coding, math, and agentic workflows, explicitly framed as a response to Gemini's surge amid an internal 'code red' and compute-cost pressure. Also covers Google's Disco tool generating web apps from browser tabs and Disney's three-year Sora licensing deal plus $1B equity stake covering 200+ characters. A meaningful frontier-model launch tied to the competitive-pressure narrative, which lifts it above routine coverage.
0 tier-5 · 7 tier-4
The shift from "model you chat with" to "model that acts." Devin named the AI-software-engineer category; Computer Use let a model drive a desktop; Operator and Deep Research productized autonomy for consumers; the unified ChatGPT Agent fused them into one virtual computer; and Apps-in-ChatGPT plus the Atlas browser pushed the agent into the place where work actually happens — third-party apps and the open web. Together they document the single biggest product-architecture change of the period, along with its recurring unsolved problem: prompt injection and irreversible actions.
Cognition's Devin debut defined the 'AI software engineer' category and kicked off the coding-agent race.
Anthropic's Computer Use debut, the first major model able to operate a desktop directly, a foundational moment for agentic AI.
Launch of Operator, OpenAI's first consumer computer-use/browser agent, an early landmark in the autonomous-agent product wave.
Debut of Deep Research, the agentic-research product category that became a standard offering across all major labs.
OpenAI's ChatGPT Agent merges Operator and Deep Research into one model with its own virtual computer that autonomously runs multi-step tasks (calendars, research, shopping), trading speed for capability with permission prompts on irreversible actions. Also: OpenAI and Google DeepMind models both win gold-level scores at the International Math Olympiad using general-purpose reasoning, and Decart's Mirage warps live video in real time.
OpenAI is embedding third-party interactive apps (Booking, Spotify, Figma, Coursera, Zillow, Canva) directly into ChatGPT via an MCP-based Apps SDK, turning the chatbot into a platform/OS with planned Instant Checkout monetization. Strong secondary item: Microsoft's red team showed generative AI can redesign toxins to evade DNA-synthesis screening, a biosecurity 'zero-day.' The platform shift and the bioweapons-screening finding both carry real weight.
OpenAI launched ChatGPT Atlas, a macOS AI-first browser with a page-aware 'sidecar' chatbot and browsing-history memory, positioning ChatGPT as the primary search surface against Chrome and rival AI browsers Comet and Dia. The issue flags prompt-injection as a major unresolved security risk and adds Amazon's AR smart glasses for delivery drivers. A meaningful product launch in the contested browser-as-agent battleground.
2 tier-5 · 6 tier-4
The human and structural drama behind the labs. The five-day November-2023 board crisis — Altman out, then back — remains the most consequential AI corporate-governance event of the era; the Musk suit reopened the nonprofit-vs-for-profit question that the crisis exposed; and the financing arc runs all the way to a planned $1T IPO. Around OpenAI, the same forces churn the rest of the industry: Meta's failed bid for Sutskever's SSI, LeCun's departure to bet against the LLM playbook, and the "code red" that admitted Google had caught up.
Captures the peak of the November 2023 OpenAI board crisis — the most consequential AI corporate-governance event of the era, when the board ousted Altman before reversing days later.
The resolution of the OpenAI board coup — Altman reinstated with a reconstituted board — a defining moment that shaped OpenAI's structure and the AI industry's power dynamics.
Altman's official reinstatement closed the November 2023 board crisis, a defining governance event in AI history.
Musk's lawsuit against OpenAI over its for-profit shift opened a multi-year legal and governance saga still unfolding.
Meta's $32B bid for Sutskever's Safe Superintelligence was rebuffed, so Zuckerberg recruited SSI's CEO Daniel Gross, along with Nat Friedman, and took a stake in their NFDG fund instead. The strong companion piece is Andrej Karpathy's 'Software 3.0' talk (LLMs as the new code authored in English, autonomy sliders, agent-friendly infra), plus OpenAI warning its next reasoning models could give novices bioweapon uplift.
OpenAI is reportedly laying groundwork for a late-2026/2027 IPO that could value it up to $1T, post-restructuring with the Foundation holding 26% plus a milestone warrant, off a ~$20B revenue run rate with rising losses. Secondary coverage includes Pluribus mastering six-player no-limit poker via self-play and a surge of AI-generated fake receipts driving expense fraud. The IPO and AI-fraud items both have real downstream stakes.
FT reports Turing winner Yann LeCun is preparing to leave Meta to found a startup built on his non-LLM JEPA world-model approach, after friction over publication controls and a product-led shift at FAIR. A high-profile bet against the LLM-centric playbook that could redirect capital and talent toward alternative architectures. Lead also covers SoftBank dumping its entire $5.8B Nvidia stake to go all-in on AI infrastructure, plus an AI-resolved 25-year Crohn's NOD2-girdin mechanism.
Details OpenAI's 'code red' to refocus on core ChatGPT quality (delaying ads, health/shopping agents, and Pulse) while building a specialized model codenamed Garlic, as Gemini's MAU jumped from 450M (July) to 650M (Oct) and OpenAI faces ~$200B-by-2030 revenue pressure. Also covers Google's data-advantage-driven personalization and Anthropic tapping IPO lawyers toward a 2026 listing at a $300-350B valuation. The clearest articulation of the competitive-and-financial inflection in the batch.
1 tier-5 · 6 tier-4
Where the capability story stops being about chat and starts being about discovery. The 2024 Nobel in Physics recognized the field's scientific roots; then came a run of concrete applied results — diagnostic models beating physician panels, predicting two decades of disease risk, catching pancreatic cancer years early, an original math proof, and gold-medal performance at the world's hardest programming and math contests. The throughline is AI moving from remixing known knowledge toward generating new findings.
The 2024 Nobel Prize in Physics awarded to Hopfield and Hinton for foundational neural-network work, a historic recognition of AI's scientific roots.
Microsoft's MAI-DxO orchestrator, paired with OpenAI's o3, correctly diagnosed 85.5% of the toughest NEJM cases versus 20% for 21 practicing physicians, while also lowering testing costs via a virtual panel of AI agents. Also: China's first fully autonomous AI robot football match (humanoids stumbling and needing stretchers), and Meta's new Superintelligence Labs under Alexandr Wang and Nat Friedman.
OpenAI's GPT-5-led ensemble solved all 12 ICPC World Finals 2025 problems under standard five-hour rules, beating top human teams (best 11/12) and DeepMind's Gemini (10/12), continuing its IMO/IOI gold-level streak. The issue pairs this with Meta's Ray-Ban Display glasses plus an sEMG Neural Band for silent air-typing, and Gemini's integration into Chrome with agentic browsing. A genuine capability milestone alongside two notable hardware/browser launches.
Delphi-2M, a modified GPT model trained on 400,000 UK Biobank participants, estimates 20-year risk across 1,258 diseases, often matching or beating single-disease predictors and validating on 1.9M Danes with only slight accuracy loss. Secondary items: ManticAI placing 8th in the Metaculus forecasting cup and Stargate's expansion to ~7GW and >$400B. The disease-prediction model and the forecasting result are both substantive applied-AI findings.
Mayo Clinic's REDMOD model detected hidden pancreatic cancer on routine CT scans up to three years before diagnosis (73% of prediagnostic cases at a median ~16 months out, nearly tripling specialist detection >2 years prior), now moving into the prospective AI-PACED trial. Also covers a journalist stood up by an AI-run company's agent (tracking the owner to China) and Google Photos' AI digital closet. The early-detection result is a concrete, high-impact medical-AI finding that lifts this above the daily-news baseline.
OpenAI claims a general-purpose reasoning model produced an original proof disproving a 1946 Erdős geometry conjecture, this time backed by named mathematicians (Noga Alon, Melanie Wood, Thomas Bloom) after an earlier debunked claim. Also covers Altman's $2M-OpenAI-credits-for-equity offer to every YC startup and OpenAI's imminent confidential IPO filing. If the proof holds, it is a notable signal of AI expanding rather than remixing knowledge, earning the higher tier.
Hassabis says AGI could realistically arrive by 2029, citing 'soft self-improvement' from coding and research agents, and warns policymakers and economists are badly underestimating the pace. Paired with a substantive piece on Google's Co-Scientist and ERA systems generating and ranking hypotheses for drug repurposing and cancer detection. The combination of a named frontier-lab timeline plus concrete AI-for-science results lifts this above a routine roundup.
1 tier-5 · 3 tier-4
The economics underneath the capability story. Stargate marked the start of the half-trillion-dollar compute era; by late 2025 the buildout had become an explicitly national-strategy, trillion-dollar question — even reaching for solar data centers in space — while bubble fears crystallized around circular vendor financing and concentration risk. These four issues are the financial through-narrative that the model and agent launches ride on top of.
Announcement of the $500B Stargate AI-infrastructure initiative (OpenAI/SoftBank/Oracle), a defining marker of the compute-buildout era.
A synthesis of mounting AI-bubble fears: circular 'vendor financing' among OpenAI, Nvidia, AMD, Microsoft and Oracle, ~$1.5T projected 2025 AI spend, AI names driving ~80% of US market gains, and warnings from BoE, IMF and Jamie Dimon. Secondary items include a Jony Ive-Altman interface conversation and Thinking Machines co-founder Andrew Tulloch decamping to Meta. The bubble analysis is the most analytically useful piece in this window.
Google's Project Suncatcher proposes TPU-equipped satellite constellations running AI workloads on near-continuous solar power, with two test satellites targeted for 2027 and possible economic viability by 2035. The piece pairs it with Michael Burry's ~$1.1B bet against Palantir/Nvidia and an Amazon vs Perplexity cease-and-desist over agentic browsing that previews a 'robots.txt for agents.' The space-compute angle plus the agentic-web precedent give this issue above-average signal.
Altman frames OpenAI's ~$1.4T, 8-year compute buildout as national strategy: rejecting taxpayer backstops, backing a government-owned strategic compute reserve and loan guarantees only for US chip fabs, against a >$20B annualized revenue run rate. Notable secondary items: Microsoft's new MAI Superintelligence Team under Suleyman, and Moonshot's open-source Kimi K2 Thinking claiming near-frontier agentic performance at <$5M training cost. The China open-source angle and the trillion-dollar financing question make this a substantive issue.
2 tier-5 · 12 tier-4
How AI reached a billion people — and what showed up when it did. The product milestones (ChatGPT iOS, Copilot across Windows/Office, multimodal voice and vision, DevDay's GPTs platform, the GPT Store, Apple Intelligence, SearchGPT) put generative AI into mainstream software; then the empirical and human-cost stories arrive — the largest real-world usage study, the calibration explanation for why chatbots still hallucinate, the productivity gap between what CEOs claim and what workers feel, and the public-health disclosure that over a million people a week discuss suicide with ChatGPT. This is the theme where the technology meets ordinary life.
Launch of OpenAI's first official ChatGPT mobile app, a durable product-history milestone in ChatGPT's expansion beyond the web.
Microsoft Build 2023 unveiled Windows Copilot (and Adobe shipped Photoshop Generative Fill), both durable launches that embedded generative AI into mainstream software.
Microsoft's unveiling of Copilot across Windows and Office — the landmark mass-productization of generative AI into mainstream consumer and enterprise software.
ChatGPT's rollout of vision (GPT-4V) and voice — the multimodal milestone that turned ChatGPT from a text tool into a see/hear/speak assistant.
OpenAI's first-ever DevDay launched GPTs, the Assistants API, and GPT-4 Turbo and announced the GPT Store, a landmark platform moment that reshaped how developers build on LLMs.
The GPT Store launch marked OpenAI's first attempt at a custom-GPT app-store ecosystem and arrived alongside the ChatGPT Team tier.
Sora's unveiling was the defining text-to-video moment of 2024 and reset expectations for generative video.
I/O 2024 introduced Project Astra and Veo, marking Google's full pivot to an AI-first product strategy.
Apple's WWDC 2024 unveiling of Apple Intelligence, the platform shift that brought generative AI to a billion-plus devices and introduced the OpenAI/Siri partnership.
SearchGPT, OpenAI's first direct challenge to Google Search, marking the start of AI-native search competition.
OpenAI argues hallucinations persist because accuracy-only evals reward confident guessing and penalize uncertainty, proposes redesigning benchmarks to reward calibrated abstention, and contrasts this with Claude's more cautious style. Secondary stories: Monumental Labs scaling robotic stone carving toward structural building blocks, and OpenAI backing the AI-made animated film 'Critterz' for a Cannes-bound sub-$30M production. The eval-incentives explanation is a genuinely useful conceptual framing.
OpenAI's 62-page analysis of 1.1M+ chats shows ChatGPT is used mostly for practical guidance (28.3%) and writing, with non-work chats at 73%, a younger and now majority-feminine user base, faster growth in poorer countries, and companionship a niche 1.9%. Secondary items include Thinking Machines' work on defeating LLM nondeterminism for reproducible inference and OpenAI ramping humanoid robotics. The usage-data study is a rare empirical look at real-world adoption.
OpenAI disclosed that ~0.15% of ChatGPT's 800M weekly users (over a million people) show explicit suicidal planning, alongside GPT-5 safety gains (91% compliance vs 77%) and new youth protections, amid lawsuits and a plan to loosen adult-content rules. Also covers a permission-aware 'company knowledge' enterprise mode and Bill Gates predicting a 2-day workweek within a decade. The mental-health disclosure is a genuine public-health data point that elevates this issue.
Surfaces a real perception gap: executives claim AI saves 8+ hours/week while most nonmanagers report under 2 hours or none, citing an 'AI tax' of fixing errors and redoing work, with even CEOs admitting limited financial payoff so far. Also covers Anthropic having to redesign its technical interview as Claude beats top candidates, and startup Humans& betting that coordination (not chat) is the next frontier. The productivity-gap data is genuinely useful and transferable beyond daily AI churn.
0 tier-5 · 4 tier-4
The fights over rules, rights, and red lines. The NYT copyright suit set the terms for whether training on news is fair use; the DeepSeek distillation accusation became a flashpoint in US-China competition; the Trump AI Action Plan swung federal policy hard toward deregulated, national-security-driven deployment; and Anthropic's quiet swap of its binding Responsible Scaling Policy for nonbinding "public goals" — mid-fight with the Pentagon — showed how fragile self-imposed safety commitments can be under commercial and political pressure.
OpenAI's formal response in the NYT copyright suit is part of the landmark case defining whether AI training on news is fair use.
OpenAI's distillation accusation against DeepSeek, a defining episode in the US-China AI competition and the distillation-IP debate.
The White House AI Action Plan aggressively rolls back regulation, ties federal contracts to models 'free from ideological bias,' calls for nuclear/geothermal grid upgrades for data centers, and tightens chip export controls against China, a major policy pivot toward growth and national-security-driven deployment. Secondary items: an OpenAI investor's apparent ChatGPT-related mental-health crisis, and AI devising counterintuitive physics experiments (LIGO redesign) that work.
Anthropic replaced its binding Responsible Scaling Policy — removing the commitment to pause training if capabilities outpace safety — with nonbinding 'public goals,' even as it resists Pentagon pressure to drop red lines on AI weapons and surveillance. Secondary items: Jack Dorsey cutting Block's headcount nearly in half citing AI, and Uber engineers building a 'Dara AI' clone of their CEO to rehearse pitches (~90% of Uber engineers use AI).
2 tier-5 · 1 tier-4
The bets that sit outside the language-model mainline. Neuralink's first-in-human approval pushed brain-computer interfaces into the clinic; Apple's Vision Pro defined the spatial-computing era; and Microsoft's Majorana 1 claimed a topological-qubit breakthrough beyond the AI news churn. Small in count but the widest in scope — these are the issues that look past the current paradigm.
Neuralink's FDA clearance for its first human brain-implant trial — a milestone regulatory step in brain-computer interfaces still referenced years later.
Day-of reveal of Apple's Vision Pro mixed-reality headset at WWDC 2023 — a landmark hardware launch that defined the spatial-computing era.
Microsoft's Majorana 1, claimed to be the first topological-qubit quantum chip, a landmark (if contested) hardware milestone beyond the AI news churn.