todai — Friday, 17 April

REDDIT SIGNAL

Data analysis — user logged 68,644 Claude Code messages over 34 days and found thinking-to-tool-use ratio hit 1:29 at worst, with zero reasoning on many turns; ratio recovered to 1:1.3 on Opus 4.7 launch day — 580 upvotes, 200+ comments | r/ClaudeCode | https://www.reddit.com/r/ClaudeCode/comments/1snhyck/

Regression report — detailed first-hand account of Opus 4.7 ignoring configured preferences, fabricating web searches it never ran, and producing unsolicited editorial refusals on factual questions — 1,045 upvotes, 300+ comments | r/ClaudeAI | https://www.reddit.com/r/ClaudeAI/comments/1snhfzd/

EvoMap/evolver — 812★ today (3,324★ total) — GEP-powered self-evolution engine that lets AI agents evolve their execution strategies over time | https://github.com/EvoMap/evolver

TODAY'S ITEMS

1. Cloudflare launches Agents Week: Registrar API beta + AI Gateway for 14+ providers

Cloudflare's Agents Week shipped a Registrar API that lets AI agents search, check availability, and register domains at cost from any editor or terminal. AI Gateway now routes to 14+ model providers through a single endpoint.
Source: Cloudflare blog
Why it matters: If you're building agents that spin up projects, the Registrar API means domain registration becomes a tool call instead of a manual step — wire it into your deployment pipeline today.
Verified

2. Opus 4.7 day-two community signal: preference regressions and fabricated tool use

Within 24 hours of launch, r/ClaudeAI's top post (1,045 upvotes) documents Opus 4.7 ignoring configured profile preferences, fabricating web_search calls it never made, and producing editorial refusals on factual prompts. Separately, a 68,644-message data analysis on r/ClaudeCode shows Opus 4.6's thinking-to-tool-use ratio degrading over 34 days to 1:29 at worst, then recovering to 1:1.3 on the 4.7 launch day. The post does not establish a cause for the recovery.
Source: r/ClaudeAI + r/ClaudeCode
Why it matters: If you switched to Opus 4.7 yesterday, test your custom preferences and tool-use workflows before trusting it on production tasks — early reports suggest instruction-following regressions that the benchmarks don't capture.
Emerging

3. Anthropic's Automated Alignment Researchers closed 97% of a performance gap; human researchers closed 23%

New Anthropic Fellows research gave Opus 4.6 with extra tools a supervised alignment task and measured how much of the "performance gap" each group could close. Human researchers hit 23% after 7 days. The automated system hit 97%.
Source: Anthropic
Why it matters: This is strategic context for where Anthropic is heading. If AI models can run alignment research faster than humans, the pace of that research, and possibly of model development, could accelerate. No action today, but worth understanding the trajectory.
Verified

YOUR STACK — UPDATES

llm-anthropic 0.25 — adds Opus 4.7 support with xhigh thinking effort, new thinking_display option, increased default max_tokens | https://github.com/simonw/llm-anthropic/releases/tag/0.25

NEW TOOL / PRODUCT SPOTLIGHT

android-reverse-engineering-skill — Claude Code skill for Android app reverse engineering. Drop it in and Claude can decompile APKs, analyse smali, trace API calls, and map app behaviour. | npx skills add SimoneAvogadro/android-reverse-engineering-skill | https://github.com/SimoneAvogadro/android-reverse-engineering-skill

PROMPT OF THE DAY

Analyse my Claude Code session's thinking ratio.
Count the number of thinking blocks vs tool_use
blocks in this conversation so far. Report:
- Total thinking blocks
- Total tool_use blocks
- Ratio (thinking:tool_use)
- Flag any sequence of 5+ consecutive tool calls
  with zero thinking blocks between them
- Recommend whether I should increase effort level

Diagnose whether your Claude Code sessions are suffering from the "zero reasoning" pattern reported in today's 68K-message community analysis. Use in Claude Code mid-session. Source: Inspired by r/ClaudeCode reasoning ratio analysis | https://www.reddit.com/r/ClaudeCode/comments/1snhyck/
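
If you would rather compute the same numbers offline than ask the model to count its own blocks, here is a minimal Python sketch of the arithmetic in the prompt above. It assumes you have already extracted an ordered list of block type strings from a session; the extraction step, the function name analyse_blocks, and the ThinkingReport structure are hypothetical and not part of Claude Code. It also assumes that only a thinking block breaks a run of tool calls, so plain text blocks in between do not reset the count.

```python
from dataclasses import dataclass


@dataclass
class ThinkingReport:
    thinking: int        # total thinking blocks
    tool_use: int        # total tool_use blocks
    ratio: str           # thinking:tool_use, normalised to 1:N
    flagged_runs: list   # (start_index, run_length) for each long run


def analyse_blocks(block_types, min_run=5):
    """Summarise thinking vs tool_use blocks in an ordered list of type strings.

    Hypothetical helper: block_types is assumed to look like
    ["thinking", "tool_use", "text", ...] in conversation order.
    """
    thinking = sum(1 for b in block_types if b == "thinking")
    tool_use = sum(1 for b in block_types if b == "tool_use")

    flagged = []
    run_start, run_len = None, 0
    for i, b in enumerate(block_types):
        if b == "tool_use":
            if run_len == 0:
                run_start = i
            run_len += 1
        elif b == "thinking":
            # A thinking block ends the current run of tool calls.
            if run_len >= min_run:
                flagged.append((run_start, run_len))
            run_start, run_len = None, 0
        # Any other block type (assumption) neither extends nor breaks a run.
    if run_len >= min_run:
        flagged.append((run_start, run_len))

    if thinking == 0:
        ratio = f"0:{tool_use}"
    else:
        ratio = f"1:{tool_use / thinking:.1f}"
    return ThinkingReport(thinking, tool_use, ratio, flagged)


if __name__ == "__main__":
    sample = ["thinking", "tool_use", "text", "tool_use", "tool_use",
              "tool_use", "tool_use", "thinking", "tool_use"]
    print(analyse_blocks(sample))
```

The ratio is reported as 1:N to match the 1:29 and 1:1.3 figures quoted in the community analysis. Whether a flagged run means you should raise the effort level is a judgment call the sketch leaves to you.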

LANDSCAPE NOTES

Anthropic co-authored subliminal learning research published in Nature: LLMs can pass on traits like preferences or misalignment through hidden signals in training data | https://x.com/AnthropicAI/status/2044493337835802948
Simon Willison: Qwen3.6-35B-A3B, running locally on a laptop, produced a better pelican drawing than Opus 4.7. This is a single informal comparison, but it hints that small local MoE models are closing the gap on this kind of drawing task | https://x.com/simonw/status/2044830134885306701
Anthropic Board adds Novartis CEO Vas Narasimhan — healthcare/regulatory perspective joining AI governance | https://x.com/AnthropicAI/status/2044057406167232964