todai — Saturday, 25 April

TODAY'S ITEMS

1. Anthropic Publishes Full Engineering Postmortem on 7-Week Quality Regression

Three bugs are now documented: (1) default reasoning effort silently downgraded from high→medium on March 4, (2) a caching bug that wiped thinking history every single turn after an idle session instead of just once (March 26), (3) a verbosity-reduction system prompt instruction added April 16 that hurt coding quality. All three are fixed in v2.1.116+; usage limits reset for all subscribers.
Source: Anthropic Engineering
Why it matters: A single system-prompt line capping verbosity dropped coding evals 3%, confirming that the harness layer — not model weights — is the variable most under your control when Claude Code quality slips.
Verified

2. Google Commits Up to $40B in Anthropic — $10B Now, $30B on Targets, 5GW Compute

Alphabet announced this morning: $10B immediate cash at a $350B valuation, up to $30B more tied to performance milestones, and at least 5 gigawatts of Google TPU compute committed over 5 years. Bloomberg, Reuters, and CNBC confirmed simultaneously. This follows Amazon's $25B deal confirmed earlier this month.
Source: CNBC/Bloomberg/Reuters
Why it matters: Anthropic now has both Amazon and Google locked into multi-year infrastructure deals, securing the compute runway to build and serve frontier models regardless of what OpenAI does.
Verified

3. DeepSeek V4 Drops: Cheapest Frontier-Level Pricing, MIT License, 1M Context

DeepSeek released V4-Pro (1.6T params, 49B active) and V4-Flash (284B params, 13B active), both MIT-licensed with a 1M-token context and available on OpenRouter now. Flash is priced at $0.14/$0.28 per M tokens, the cheapest of the small models and cheaper than GPT-5.4 Nano. Pro is priced at $1.74/$3.48 per M tokens, the cheapest of the large-model options. Simon Willison benchmarked both today; the release is #1 on HN with 1,639 points and 1,281 comments.
Source: Simon Willison
Why it matters: DeepSeek V4-Flash is now the lowest-cost inference option worth benchmarking against Haiku 4.5 for any batch or cost-sensitive API workflow you can test today via OpenRouter.
Verified
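To put the pricing in perspective, here is a rough cost sketch in Python. It assumes the quoted pairs are input/output prices per million tokens, uses the GPT-5.5 price from Trending This Week as a reference point, and runs a made-up batch workload. The model keys are illustrative labels, not official API identifiers, and Haiku 4.5 is left out because its price is not quoted in this issue.

```python
# USD per million tokens as (input, output), copied from this issue.
# Keys are illustrative labels, not official API model identifiers.
PRICES = {
    "deepseek-v4-flash": (0.14, 0.28),
    "deepseek-v4-pro": (1.74, 3.48),
    "gpt-5.5": (5.00, 30.00),
}


def job_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a job at the listed per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical batch workload: 50M input tokens, 10M output tokens.
    for model in PRICES:
        cost = job_cost(model, 50_000_000, 10_000_000)
        print(f"{model}: ${cost:,.2f}")
```

Swap in your own token counts, and add a Haiku 4.5 entry from the current pricing page before drawing conclusions.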

4. Claude Code v2.1.119: /config Settings Persist, GitLab/Bitbucket PR Support, Hook Timing

Released last night. Key changes: /config settings (theme, editor mode, verbose) now persist to ~/.claude/settings.json and survive restarts. --from-pr now accepts GitLab merge-request, Bitbucket pull-request, and GitHub Enterprise PR URLs. PostToolUse and PostToolUseFailure hooks now include duration_ms (tool execution time). A new prUrlTemplate setting points the footer PR badge at a custom review URL. CLAUDE_CODE_HIDE_CWD hides the working directory from the startup logo. MCP servers for subagents are now reconfigured in parallel.
Source: GitHub anthropics/claude-code
Why it matters: If you're running Claude Code against a GitLab or Bitbucket codebase, claude --from-pr <MR-URL> now works natively — eliminating the manual diff-copy workaround.
Verified

PROMPT OF THE DAY

You are reviewing a Claude Code session configuration for quality regressions.

Audit this project's harness setup and flag any settings that could silently
reduce output quality. Check for:

1. Effort level — what is the current reasoning effort setting? Flag if
   "medium" or lower; optimal is "high" or "xhigh".
2. Verbosity controls — does any system prompt or CLAUDE.md instruction
   limit response length, tone, or output detail in a way that could hurt
   code quality? Flag the exact line.
3. Thinking preservation — are thinking blocks being cleared, compacted,
   or summarised mid-session in a way that causes Claude to lose its own
   reasoning history?
4. Context trimming — are any hooks or tools truncating conversation history
   before Claude completes a task?

For each issue found, state: (a) the likely quality impact, and (b) the
one-line config change that fixes it.

A direct response to today's engineering postmortem: use it in Claude Chat to audit your own Claude Code project setup for the three classes of issue Anthropic just documented (checks 1 to 3), plus context trimming as a related risk (check 4). Original, not community-sourced.

LANDSCAPE NOTES

r/ClaudeCode conspiracy theory — "Did Anthropic simply unnerf Opus to compete with GPT-5.5?" — 243 upvotes, 89 comments; community questioning whether the postmortem bugs were real or cover for competitive unnerfing. Anthropic's documented evidence is detailed enough to be credible, but the debate is live. https://www.reddit.com/r/ClaudeCode/comments/1sucn83/the_postmortem_did_anthropic_simply_unnerf_opus/
r/LocalLLaMA postmortem response — 808 upvotes, 242 comments; community framing the postmortem as proof that closed-source API providers can silently change quality, arguing for open-weight self-hosted models as the only reliable alternative. https://www.reddit.com/r/LocalLLaMA/comments/1suef7t/anthropic_admits_to_have_made_hosted_models_more/
Claude-Monitor — Swift macOS menu bar app that tracks Claude API usage in real time; prevents mid-session surprise limit hits for anyone running API-heavy workflows. No brew install yet, build from source. https://github.com/balochshees/Claude-Monitor
The Anthropic Stack this morning — "AI agents need to get paid" links Zapier's new AutomationBench (open benchmark for real business workflow completion, better signal than maths olympiad benchmarks for evaluating AI tools) and the n8n resume-screening workflow. https://theanthropicstack.substack.com/p/ai-agents-need-to-get-paid

TRY THIS WEEKEND

Item 1 (required): Now that v2.1.119 ships duration_ms on PostToolUse events, wire up a simple hook that appends the tool name and duration to a local CSV log. Run a real session, then open the CSV to see which tools eat the most time in your workflow. Estimated time: 30 minutes. Start with the release notes at https://github.com/anthropics/claude-code/releases/tag/v2.1.119 and look for the hooks key in ~/.claude/settings.json.
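A minimal sketch of such a hook script in Python. It assumes the hook command receives the event as a JSON object on stdin with top-level tool_name and duration_ms fields; check that payload shape against the hook docs before relying on it. The log path and function names are hypothetical.

```python
#!/usr/bin/env python3
"""Append tool name and duration from a PostToolUse hook event to a CSV log."""
import csv
import json
import sys
from pathlib import Path

# Hypothetical log location; change to taste.
LOG_PATH = Path.home() / ".claude" / "tool_durations.csv"


def extract_row(event):
    """Pull [tool name, duration in ms] out of a hook event dict.

    Assumes top-level "tool_name" and "duration_ms" keys.
    """
    return [event.get("tool_name", "unknown"), event.get("duration_ms", "")]


def append_row(row, path=LOG_PATH):
    """Append one row to the CSV, writing a header if the file is new."""
    path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["tool_name", "duration_ms"])
        writer.writerow(row)


def total_by_tool(path=LOG_PATH):
    """Return total milliseconds per tool, slowest first."""
    totals = {}
    with path.open(newline="") as f:
        for row in csv.DictReader(f):
            if row["duration_ms"]:  # skip events that carried no duration
                name = row["tool_name"]
                totals[name] = totals.get(name, 0.0) + float(row["duration_ms"])
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


def main():
    raw = sys.stdin.read()
    if not raw.strip():
        return  # nothing piped in; do not fail the hook
    append_row(extract_row(json.loads(raw)))


if __name__ == "__main__":
    main()
```

Register the script as the command for a PostToolUse hook in your settings file, run a session, then call total_by_tool() from a Python shell to rank tools by total time.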

TOOL OF THE WEEK

mimirs — local MCP server that gives Claude Code (and Cursor, Windsurf, Copilot) persistent, searchable memory of your codebase across sessions. One command sets it up: bunx mimirs init --ide claude. Benchmarked 90–98% Recall@10 across real codebases from 97 to 8,553 files; reduces a typical 380K-token prompt to ~91K. Fully local, free, no accounts. https://github.com/TheWinci/mimirs
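For readers unfamiliar with the metric: Recall@10 is the fraction of relevant items that show up in the top 10 retrieved results. The sketch below uses that standard definition with made-up file paths. It does not reproduce how the mimirs benchmark was actually run, which is not described here.

```python
def recall_at_k(ranked_results, relevant, k=10):
    """Fraction of the relevant items that appear in the top k results."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(ranked_results[:k])
    return len(top_k & relevant) / len(relevant)


if __name__ == "__main__":
    # Hypothetical retrieval output for one query, best match first.
    ranked = ["auth/session.py", "auth/tokens.py", "db/models.py", "api/routes.py"]
    # Hypothetical ground truth: the files a correct answer needs.
    relevant = {"auth/session.py", "auth/tokens.py", "auth/middleware.py", "api/routes.py"}

    print(recall_at_k(ranked, relevant, k=3))   # 0.5
    print(recall_at_k(ranked, relevant, k=10))  # 0.75
```

A 90–98% score means nearly every file needed to answer a query was among the first ten results returned.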

TRENDING THIS WEEK

Anthropic's engineering postmortem — three documented bugs behind the 7-week quality regression, all fixed; the harness layer turned out to be the culprit, not the models. https://www.anthropic.com/engineering/april-23-postmortem
Google $40B Anthropic investment confirmed this morning — $10B now, $30B on targets, 5GW TPU compute over 5 years. https://www.cnbc.com/2026/04/24/google-to-invest-up-to-40-billion-in-anthropic-as-search-giant-spreads-its-ai-bets.html
GPT-5.5 launched Thursday — in ChatGPT and Codex now, API access "coming soon," priced at $5/$30 per M tokens. https://openai.com/index/introducing-gpt-5-5/
DeepSeek V4 dropped today: MIT-licensed, 1M context, Flash at $0.14/$0.28 per M tokens, the cheapest of the small models. https://simonwillison.net/2026/Apr/24/deepseek-v4/
free-claude-code hit 8,143★ this week — routes Claude Code through NVIDIA NIM and other free tiers via two env vars with zero client changes. https://github.com/Alishahryar1/free-claude-code

GITHUB TRENDING