arXiv began enforcing a new policy this week: submit a paper with AI-hallucinated citations and you're banned from the platform for a year. After the ban, any future preprint must be accepted through peer review before it can be posted. Fabricated citations have risen tenfold since 2023 and now appear in 1 in 277 papers. arXiv's response repurposes peer review, a gate most researchers treat as optional for preprints, into a punitive instrument.
Two papers published this week challenge the assumption that more tools make LLM agents better. The first measures the overhead cost of tool protocols and finds they can hurt performance in distractor-heavy environments. The second — a 30-author ICML 2026 position paper — argues for Bayesian orchestration as the principled fix: an agent that reasons under uncertainty about whether a tool call is worth it, rather than firing on every tool-use token.
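To make the "is this tool call worth it" idea concrete, here is a minimal sketch of one way such a decision could look. It is an illustration under stated assumptions, not the position paper's method: the names `BetaBelief`, `expected_gain`, and `should_call_tool` are hypothetical, the agent's no-tool accuracy estimate is taken as given, and the tool's cost is assumed to be expressed in the same units as the reward for a correct answer.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta posterior over the chance that a tool call leads to a correct answer.

    Hypothetical helper for illustration; not taken from either paper.
    """
    alpha: float = 1.0  # pseudo-count of helpful calls (1, 1 is a uniform prior)
    beta: float = 1.0   # pseudo-count of unhelpful calls

    def mean(self) -> float:
        # Posterior mean of the tool's success probability.
        return self.alpha / (self.alpha + self.beta)

    def update(self, helped: bool) -> None:
        # Conjugate Beta-Bernoulli update: one outcome adds one pseudo-count.
        if helped:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def expected_gain(belief: BetaBelief, p_correct_without_tool: float,
                  reward: float, tool_cost: float) -> float:
    """Expected benefit of calling the tool, net of its cost.

    Assumes tool_cost (latency, tokens, distraction) is in reward units.
    """
    return (belief.mean() - p_correct_without_tool) * reward - tool_cost


def should_call_tool(belief: BetaBelief, p_correct_without_tool: float,
                     reward: float, tool_cost: float) -> bool:
    # Call only when the expected improvement outweighs the overhead.
    return expected_gain(belief, p_correct_without_tool, reward, tool_cost) > 0.0


if __name__ == "__main__":
    belief = BetaBelief()
    print(should_call_tool(belief, 0.6, 1.0, 0.05))  # uninformed prior
    for _ in range(8):
        belief.update(helped=True)                   # tool keeps paying off
    print(should_call_tool(belief, 0.6, 1.0, 0.05))
```

With a uniform prior the tool's estimated success rate (0.5) is below the agent's assumed no-tool accuracy (0.6), so the sketch skips the call. After eight helpful outcomes the posterior mean rises to 0.9 and the call clears the cost threshold. A fuller treatment would also account for uncertainty in the no-tool estimate and for distractor tools, which this sketch ignores.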
A new paper from a mix of academic and industry researchers identifies why diffusion language models consistently trail their autoregressive counterparts despite strong theoretical properties: they don't agree with what they generate. The proposed fix — Introspective Strided Decoding — lets an 8B DLM match same-scale AR quality while running 2.9–4.1x faster at high concurrency.