Thursday, 4 June
4 min read · 735 words

1. Uber Caps Claude Code Spending at $1,500/Month Per Tool

Uber is limiting every employee to $1,500 in monthly token spending per agentic coding tool, including Claude Code and Cursor, according to Bloomberg. Simon Willison notes the cap works out to roughly 11% of the median $330K engineer compensation package. It is a pragmatic response after Uber burned through its 2026 AI budget in four months.

Source: Simon Willison / Bloomberg | https://simonwillison.net/2026/Jun/3/uber-caps-usage/
Why it matters: First public per-tool token spending cap from a major enterprise; expect other large engineering orgs to follow.
Confidence: Verified
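
The 11% figure is easiest to check with the numbers written out. The sketch below assumes the cap is spent in full every month and that an engineer maxes out two tools (Claude Code and Cursor, the two named above). Both are assumptions for illustration, not details confirmed by the Bloomberg report.

```python
# Back-of-the-envelope check of the cap against median compensation.
# Assumptions (not from the source): the cap is fully spent every month,
# and the two-tool case means both named tools are maxed out.
MONTHLY_CAP_PER_TOOL = 1_500      # USD, from the article
MEDIAN_COMPENSATION = 330_000     # USD per year, from the article


def annual_share(tools: int) -> float:
    """Return yearly capped spend as a fraction of median compensation."""
    annual_spend = MONTHLY_CAP_PER_TOOL * 12 * tools
    return annual_spend / MEDIAN_COMPENSATION


one_tool = annual_share(1)    # 18,000 / 330,000
two_tools = annual_share(2)   # 36,000 / 330,000
print(f"one tool: {one_tool:.1%}, two tools: {two_tools:.1%}")
# prints "one tool: 5.5%, two tools: 10.9%"
```

Under these assumptions a single tool comes to about 5.5% of compensation, and the roughly 11% figure matches two tools at the full cap.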

2. VS Code Zero-Day Steals GitHub Tokens in One Click (Unpatched)

A VS Code vulnerability lets attackers steal GitHub OAuth tokens, including read/write access to private repos, when a user clicks a single crafted github.dev link. Researcher Ammar Askar published a working exploit without going through Microsoft's security response process; the bug is currently unpatched.

Source: HN (607 points) | https://blog.ammaraskar.com/github-token-stealing/
Why it matters: Anyone who opens PRs or reviews code on github.dev is exposed; avoid unfamiliar github.dev links until Microsoft patches.
Confidence: Verified

3. Google Releases Gemma 4 12B: Open Multimodal AI on Your Laptop

Google DeepMind released Gemma 4 12B, an encoder-free open-weight model that handles text, images, and audio in a single architecture. It runs on one laptop GPU (16 GB of VRAM or unified memory) and targets local agentic workflows.

Source: Google Blog | https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12b/
Why it matters: A 12B multimodal model that runs locally fills the gap between the 4B models and the 26B+ models that need dedicated hardware.
Confidence: Verified

  • CC v2.1.161: a failed Bash command in parallel tool calls no longer cancels the whole batch — each tool returns its result independently. Also: claude mcp now redacts credential headers and secrets, OpenTelemetry labels flow into per-team/repo usage metrics, and claude agents rows show done/total progress | https://github.com/anthropics/claude-code/releases/tag/v2.1.161
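
The batch behavior in the release note above can be illustrated in general terms. The sketch below is not Claude Code's implementation. It uses Python's asyncio with hypothetical `run_tool` and `run_batch` helpers to show a parallel batch in which one failing command is reported alongside the others instead of cancelling them.

```python
import asyncio


async def run_tool(name: str, fail: bool = False) -> str:
    """Hypothetical stand-in for one tool call in a parallel batch."""
    await asyncio.sleep(0)  # yield control, as a real I/O-bound call would
    if fail:
        raise RuntimeError(f"{name} exited with a non-zero status")
    return f"{name}: ok"


async def run_batch() -> list[str]:
    # return_exceptions=True collects each failure as a value, so the
    # remaining calls still run to completion and report their own result.
    results = await asyncio.gather(
        run_tool("read_file"),
        run_tool("bash", fail=True),
        run_tool("grep"),
        return_exceptions=True,
    )
    return [
        f"error: {r}" if isinstance(r, Exception) else r
        for r in results
    ]


batch_results = asyncio.run(run_batch())
print(batch_results)
```

Results come back in call order, so the failed `bash` entry sits between the two successful ones rather than replacing the whole batch.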

  • Workflow — keep a parent folder Claude always accesses, then ask it to scaffold each new project inside — no manual setup, consistent structure across everything — 166 upvotes, 32 comments | r/ClaudeAI | https://old.reddit.com/r/ClaudeAI/comments/1tv63j9/
  • Signal — Graphify (CC-native knowledge-graph tool) accepted into Y Combinator — community debating whether the CC skills/plugins ecosystem is becoming a VC-funded market — 77 upvotes, 39 comments | r/ClaudeCode | https://old.reddit.com/r/ClaudeCode/comments/1tv2wv5/
  • herdr — 1,327★ this week (4,067★ total) — Rust terminal agent multiplexer; run Claude Code, Codex, Gemini CLI side-by-side from one terminal | brew install herdr | https://github.com/ogulcancelik/herdr

  • Supermemory — Fast semantic memory engine for AI applications. Personal or organisational knowledge search, works as a standalone app or plugs into any agent pipeline via API or MCP server. Self-hostable, MIT licensed. 25,071★ | npm install supermemory | https://github.com/supermemoryai/supermemory

You are building an AI agent for me that can complete ONE job reliably.
Job: <one sentence>  Inputs: <list>  Outputs: <list>
Tools: <firecrawl | browser-use | composio | other>
Environment: <Claude Code | Cursor | CLI>
Constraints: <time, cost, no external calls, etc.>

1. Ask clarifying questions. Propose defaults for ambiguity.
2. Minimal architecture: tool calls, data flow, failure modes.
3. Minimal working version: files, config, one test case.
4. Setup commands, config, code, verification.
5. Three expansion improvements with reasons and tests.

No speculative code. Every function testable. Handle tool failures.

Force a structured plan before any code: agents that ship, not agents that "almost work."
Source: The Anthropic Stack | https://theanthropicstack.substack.com/p/build-an-ai-agent-that-actually-works
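
For anyone reusing the template from a script, the header fields can be filled programmatically. This is a minimal sketch: the `fill_header` helper and every example value are hypothetical and are not part of the original prompt.

```python
# Header lines copied from the template above, with the angle-bracket
# placeholders turned into named format fields.
HEADER = (
    "You are building an AI agent for me that can complete ONE job reliably.\n"
    "Job: {job}  Inputs: {inputs}  Outputs: {outputs}\n"
    "Tools: {tools}\n"
    "Environment: {environment}\n"
    "Constraints: {constraints}\n"
)


def fill_header(job: str, inputs: list[str], outputs: list[str],
                tools: str, environment: str, constraints: str) -> str:
    """Fill the template's placeholders; list fields are comma-joined."""
    return HEADER.format(
        job=job,
        inputs=", ".join(inputs),
        outputs=", ".join(outputs),
        tools=tools,
        environment=environment,
        constraints=constraints,
    )


# Hypothetical values, for illustration only.
prompt_header = fill_header(
    job="Summarize new issues in one repository each morning",
    inputs=["repository URL"],
    outputs=["Markdown summary"],
    tools="other",
    environment="CLI",
    constraints="no external calls beyond the repository host",
)
print(prompt_header)
```

The numbered steps and the closing rules of the template would be appended unchanged after this header.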