Tuesday, 21 April
4 min read · 748 words

1. Opus 4.7 tokenizer inflates API costs ~40% — same price per token, more tokens per request

  • Simon Willison measured Opus 4.7's new tokenizer against 4.6 and found 1.46x more tokens for the same text input — at identical per-token pricing ($5/M in, $25/M out), that's a real 40%+ cost increase per request. High-res images are unaffected for small images (same token count at equivalent resolution) but 4.7's higher-res support means large images cost proportionally more.
  • Source: simonwillison.net
  • Why it matters: If you're calling Opus 4.7 via API in production, your token budget from 4.6 will not hold — re-benchmark against real payloads now before your next invoice surprises you. Use Willison's token counter tool to compare: https://tools.simonwillison.net/claude-token-counter
  • Verified

2. Claude Code (Apr 17): /recap, /ultrareview, and /less-permission-prompts shipped

  • Three new commands worth enabling today: /recap auto-briefs you (and Claude) on what you were doing when you return to a session after 5+ minutes — eliminates the re-orientation cost of context switching. /ultrareview runs parallel multi-agent code review in the cloud for your current branch or a GitHub PR. /less-permission-prompts scans your transcript history and auto-proposes an allowlist for .claude/settings.json to reduce prompt interruptions.
  • Source: GitHub claude-code releases
  • Why it matters: /less-permission-prompts alone saves significant friction in established projects — run it once per project and Claude builds the allowlist from your actual usage patterns, not a generic template. /ultrareview is the fastest path to catching what you missed before merging.
  • Verified

3. Kimi K2.6 drops on HuggingFace: 80.2% SWE-bench Verified, beats Opus 4.6 on coding benchmarks

  • Moonshot AI released Kimi K2.6 directly to HuggingFace — trillion-parameter MoE architecture, 256K context, native multimodal, Modified MIT licence. It leads on SWE-bench Pro (58.6%) vs competing models, and multiple reviewers are benchmarking it as comparable to Claude Sonnet 4.6 for long research tasks with many tool calls, at a fraction of the cost.
  • Source: HuggingFace + r/LocalLLaMA | https://www.reddit.com/r/LocalLLaMA/comments/1sqscao/kimi_k26_released_huggingface/
  • Why it matters: This is the strongest open-weight coding model yet on standard benchmarks — if you're evaluating alternatives to the Claude API for cost reduction or on-prem deployment, this warrants a two-week trial on your actual codebase.
  • Emerging

You are a confident, direct collaborator. For this task, trust
your initial analysis and proceed with implementation when you
have enough context. Uncertainty is expected — state it briefly
and pick the most reasonable path rather than asking for
clarification. Do not re-verify each step before acting. I will
correct you if you go off-track. Minimise self-checking loops
and work with the assumption that I trust your judgment.

Paste this at the top of your CLAUDE.md or any complex Opus 4.7 session — addresses the community-confirmed pattern where 4.7's internal self-checking degrades output quality. Source: r/ClaudeCode thread https://www.reddit.com/r/ClaudeCode/comments/1sqtdqe/

  • Salesforce Headless 360 (TDX 2026): entire Salesforce platform now exposed as 60+ MCP tools and CLI commands — Claude Code, Cursor, and Codex all have live access to Salesforce data and workflows with no browser required. If your team builds on Salesforce, this is a significant change for agentic automation. https://venturebeat.com/technology/salesforce-launches-headless-360-to-turn-its-entire-platform-into-infrastructure-for-ai-agents
  • Anthropic subliminal learning paper in Nature (co-authored): LLMs trained on distilled data can inherit behavioral traits — including misalignments — through semantically unrelated content, even when screened. Affects synthetic data and fine-tuning strategies if you're training on Claude-generated outputs. https://www.nature.com/articles/s41586-026-10319-8
  • Anthropic Automated Alignment Researcher: Claude Opus 4.6 with tools closed 97% of a key alignment performance gap in 7 days vs 23% for human researchers — published as a research milestone, not a product launch, but signals how Anthropic is applying Claude to its own safety work. https://www.anthropic.com/research