GITHUB TRENDING
- huggingface/ml-intern — 2,981★ today (5,015★ total) — Autonomous open-source ML engineer agent that reads papers, trains models, and ships ML code using the HuggingFace ecosystem; full agentic loop with doom-loop detection, HF tools, and MCP server integration |
git clone + uv tool install -e .| https://github.com/huggingface/ml-intern
TODAY'S ITEMS
1. Anthropic Publishes Full Engineering Postmortem on 7-Week Quality Regression
- Three bugs are now documented: (1) default reasoning effort silently downgraded from high→medium on March 4, (2) a caching bug that wiped thinking history every single turn after an idle session instead of just once (March 26), (3) a verbosity-reduction system prompt instruction added April 16 that hurt coding quality. All three are fixed in v2.1.116+; usage limits reset for all subscribers.
- Source: Anthropic Engineering
- Why it matters: A single system-prompt line capping verbosity dropped coding evals 3%, confirming that the harness layer — not model weights — is the variable most under your control when Claude Code quality slips.
- Verified
2. Google Commits Up to $40B in Anthropic — $10B Now, $30B on Targets, 5GW Compute
- Alphabet announced this morning: $10B immediate cash at a $350B valuation, up to $30B more tied to performance milestones, and at least 5 gigawatts of Google TPU compute committed over 5 years. Bloomberg, Reuters, and CNBC confirmed simultaneously. This follows Amazon's $25B deal confirmed earlier this month.
- Source: CNBC/Bloomberg/Reuters
- Why it matters: Anthropic now has both Amazon and Google locked into multi-year infrastructure deals, securing the compute runway to build and serve frontier models regardless of what OpenAI does.
- Verified
3. DeepSeek V4 Drops: Cheapest Frontier-Level Pricing, MIT License, 1M Context
- DeepSeek released V4-Pro (1.6T params, 49B active) and V4-Flash (284B, 13B active), both with 1M token context, MIT licensed, on OpenRouter now. Flash prices at $0.14/$0.28 per M tokens — cheapest of the small models, cheaper than GPT-5.4 Nano. Pro at $1.74/$3.48 — cheapest of the large-model options. Simon Willison benchmarked both today; HN #1 with 1,639 points and 1,281 comments.
- Source: Simon Willison
- Why it matters: DeepSeek V4-Flash is now the lowest-cost inference option worth benchmarking against Haiku 4.5 for any batch or cost-sensitive API workflow you can test today via OpenRouter.
- Verified
4. Claude Code v2.1.119: /config Settings Persist, GitLab/Bitbucket PR Support, Hook Timing
- Released last night. Key changes:
/configsettings (theme, editor mode, verbose) now persist to~/.claude/settings.jsonand survive restarts.--from-prnow accepts GitLab merge-request, Bitbucket pull-request, and GitHub Enterprise PR URLs.PostToolUseandPostToolUseFailurehooks now includeduration_ms(tool execution time). NewprUrlTemplatesetting points the footer PR badge at a custom review URL.CLAUDE_CODE_HIDE_CWDhides the working directory from the startup logo. Parallel MCP server reconfiguration for subagents. - Source: GitHub anthropics/claude-code
- Why it matters: If you're running Claude Code against a GitLab or Bitbucket codebase,
claude --from-pr <MR-URL>now works natively — eliminating the manual diff-copy workaround. - Verified
PROMPT OF THE DAY
You are reviewing a Claude Code session configuration for quality regressions.
Audit this project's harness setup and flag any settings that could silently
reduce output quality. Check for:
1. Effort level — what is the current reasoning effort setting? Flag if
"medium" or lower; optimal is "high" or "xhigh".
2. Verbosity controls — does any system prompt or CLAUDE.md instruction
limit response length, tone, or output detail in a way that could hurt
code quality? Flag the exact line.
3. Thinking preservation — are thinking blocks being cleared, compacted,
or summarised mid-session in a way that causes Claude to lose its own
reasoning history?
4. Context trimming — are any hooks or tools truncating conversation history
before Claude completes a task?
For each issue found, state: (a) the likely quality impact, and (b) the
one-line config change that fixes it.
Direct response to today's engineering postmortem — use in Claude Chat to audit your own Claude Code project setup for the three classes of issue Anthropic just documented. Original — not community-sourced.
LANDSCAPE NOTES
r/ClaudeCode conspiracy theory — "Did Anthropic simply unnerf Opus to compete with GPT-5.5?" — 243 upvotes, 89 comments; community questioning whether the postmortem bugs were real or cover for competitive unnerfing. Anthropic's documented evidence is detailed enough to be credible, but the debate is live. https://www.reddit.com/r/ClaudeCode/comments/1sucn83/the_postmortem_did_anthropic_simply_unnerf_opus/
r/LocalLLaMA postmortem response — 808 upvotes, 242 comments; community framing the postmortem as proof that closed-source API providers can silently change quality, arguing for open-weight self-hosted models as the only reliable alternative. https://www.reddit.com/r/LocalLLaMA/comments/1suef7t/anthropic_admits_to_have_made_hosted_models_more/
Claude-Monitor — Swift macOS menu bar app that tracks Claude API usage in real time; prevents mid-session surprise limit hits for anyone running API-heavy workflows. No brew install yet, build from source. https://github.com/balochshees/Claude-Monitor
The Anthropic Stack this morning — "AI agents need to get paid" links Zapier's new AutomationBench (open benchmark for real business workflow completion, better signal than maths olympiad benchmarks for evaluating AI tools) and the n8n resume-screening workflow. https://theanthropicstack.substack.com/p/ai-agents-need-to-get-paid
TRY THIS WEEKEND
Item 1 (required): Now that v2.1.119 ships duration_ms on PostToolUse events, wire up a simple hook that appends tool name + duration to a local CSV log. Run a real session, then open the CSV and see which tools are eating the most time in your workflow. Estimated 30 minutes. Start with the hook docs at https://github.com/anthropics/claude-code/releases/tag/v2.1.119 and check ~/.claude/settings.json for the hooks config key.
TOOL OF THE WEEK
- mimirs — local MCP server that gives Claude Code (and Cursor, Windsurf, Copilot) persistent, searchable memory of your codebase across sessions. One command sets it up:
bunx mimirs init --ide claude. Benchmarked 90–98% Recall@10 across real codebases from 97 to 8,553 files; reduces a typical 380K-token prompt to ~91K. Fully local, free, no accounts. https://github.com/TheWinci/mimirs
TRENDING THIS WEEK
- Anthropic's engineering postmortem — three documented bugs behind the 7-week quality regression, all fixed; the harness layer turned out to be the culprit, not the models. https://www.anthropic.com/engineering/april-23-postmortem
- Google $40B Anthropic investment confirmed this morning — $10B now, $30B on targets, 5GW TPU compute over 5 years. https://www.cnbc.com/2026/04/24/google-to-invest-up-to-40-billion-in-anthropic-as-search-giant-spreads-its-ai-bets.html
- GPT-5.5 launched Thursday — in ChatGPT and Codex now, API access "coming soon," priced at $5/$30 per M tokens. https://openai.com/index/introducing-gpt-5-5/
- DeepSeek V4 dropped today — MIT-licensed, 1M context, Flash at $0.14/$0.28/M tokens, lowest cost on the market. https://simonwillison.net/2026/Apr/24/deepseek-v4/
- free-claude-code hit 8,143★ this week — routes Claude Code through NVIDIA NIM and other free tiers via two env vars with zero client changes. https://github.com/Alishahryar1/free-claude-code