Security · AI Beat

15 Jul 2026 · AI Beat Desk

Cursor and the Attack Surface You Agreed To

Two independent security disclosures about Cursor IDE landed within hours of each other. Mindgard found that Cursor auto-executes any git.exe in a repo root (still unpatched after 7 months), and Cato Networks' DuneSlide research showed that prompt injection via MCP or web search can escape the agent sandbox and achieve full OS-level RCE. Together they define a new class of attack surface that appears whenever an AI agent runs with your privileges.
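
As a rough illustration of the first finding, the sketch below lists files in a repository root whose names could shadow the real Git binary on Windows. It is a defensive check under stated assumptions, not Mindgard's tooling: the function name is hypothetical, and the extra names beyond `git.exe` are a guess at other Windows-executable variants rather than anything the disclosure lists.

```python
from pathlib import Path

# Only git.exe comes from the disclosure; the other names are assumed
# Windows-executable variants added for illustration.
SUSPECT_NAMES = {"git.exe", "git.bat", "git.cmd", "git.com"}


def suspicious_root_binaries(repo_root: str) -> list[str]:
    """Return names of files in the repo root that could shadow git."""
    root = Path(repo_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_file() and entry.name.lower() in SUSPECT_NAMES
    )
```

Running this on a freshly cloned repository before opening it in an editor would surface a planted binary. It only inspects the top-level directory, which is the location the disclosure describes.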

13 Jul 2026 · AI Beat Desk

What Grok Build Uploads

A wire-level analysis of Grok Build CLI 0.2.93 found it uploads the entire workspace as a git bundle to Google Cloud Storage — about 5.1 GiB from a 12 GB repo, including files the agent never read and unredacted .env credentials. The model itself received 192 KB. The "Improve the model" toggle does not stop the upload.

08 Jul 2026 · AI Beat Desk

Seven Bugs in a Crypto Library

zkSecurity ran their AI audit pipeline against Cloudflare's CIRCL experimental crypto library and found seven genuine vulnerabilities — from float64 precision loss in threshold RSA to a full CP-ABE access-control break. The piece is as valuable for what it reveals about AI's specific blind spots in cryptographic reasoning as for the bugs themselves.
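
To make the float64 precision-loss class of bug concrete, here is a minimal Python sketch. It is not taken from CIRCL (which is written in Go) or from the zkSecurity report. It only shows the general hazard: integers above 2^53 stop round-tripping through a 64-bit float, which matters when large integer values pass through floating-point math.

```python
def survives_float64(n: int) -> bool:
    """Return True if integer n round-trips through a float64 unchanged."""
    return int(float(n)) == n


# float64 has a 53-bit significand, so 2**53 is exact but 2**53 + 1 is not.
exact = 2**53
inexact = 2**53 + 1

print(survives_float64(exact))    # True
print(survives_float64(inexact))  # False
```

Any code path that converts a big integer to a float, even briefly, silently rounds values like `inexact`, and the error only shows up once inputs grow past the 53-bit range.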

04 Jul 2026 · AI Beat Desk

The Bug-Finding Numbers Land

Epoch.ai tracked CVE disclosures from 21 major organizations and found June 2026 hit roughly 1,500 serious vulnerabilities — 3.5× the previous monthly peak. The spike correlates directly with Anthropic's Project Glasswing deploying Mythos Preview across major tech infrastructure. The 10,000+ vulnerabilities Glasswing found are mostly still unpublished.

01 Jul 2026 · AI Beat Desk

The Hidden Apostrophe

A developer reverse-engineered Claude Code's client JavaScript and found it silently substitutes Unicode apostrophes in system prompts to fingerprint requests routed through custom API base URLs — encoding domain-list hits and timezone signals in characters visually indistinguishable from ordinary text. The finding raises the usual trust question: should a developer tool that runs in your terminal quietly rewrite what it sends?
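
A quick way to check text for this kind of substitution is to scan for code points that render like the ASCII apostrophe. The sketch below is an assumption-laden illustration: the teaser does not say which code points Claude Code uses, so the look-alike set and the function name are hypothetical.

```python
import unicodedata

# Assumed set of characters that many fonts render like U+0027 (').
# The actual code points used in the finding are not stated here.
LOOKALIKE_APOSTROPHES = {
    "\u2018",  # LEFT SINGLE QUOTATION MARK
    "\u2019",  # RIGHT SINGLE QUOTATION MARK
    "\u02bc",  # MODIFIER LETTER APOSTROPHE
    "\uff07",  # FULLWIDTH APOSTROPHE
}


def find_lookalikes(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for each apostrophe look-alike in text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in LOOKALIKE_APOSTROPHES
    ]
```

For example, `find_lookalikes("don\u2019t")` reports a RIGHT SINGLE QUOTATION MARK at index 3, while the plain ASCII `"don't"` returns an empty list.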

25 Jun 2026 · AI Beat Desk

28.8 Million Prompts

Anthropic disclosed to the US Senate that operators affiliated with Alibaba ran 28.8 million exchanges against Claude through 25,000 fraudulent accounts over six weeks — the largest known distillation attack against Anthropic. The numbers are real; the framing is lobbying.

16 Jun 2026 · AI Beat Desk

The Gateway Was the Weak Link

Obsidian Security chained three bugs in LiteLLM, the open-source proxy that sits in front of more than 100 model providers, to turn a default low-privilege account into full admin and remote code execution. The interesting part isn't the CVSS 9.9 — it's that a compromised gateway can rewrite LLM responses in flight and forge tool calls into agents like Claude Code, which makes the proxy itself part of the attack surface agent builders need to model.

13 Jun 2026 · AI Beat Desk

The Lockbox Problem

The US government banned Anthropic's Fable 5 and Mythos 5 globally after a narrow jailbreak was found that could unlock Mythos's autonomous offensive cybersecurity capabilities. Anthropic disputes the decision as disproportionate. The real issue is harder than either side is saying: you can't export-control your way out of a model that already knows how to hack.

11 Jun 2026 · AI Beat Desk

The Patch That Argued Back

An AI agent operating under stolen Fedora contributor credentials spent two months submitting plausible-looking patches to Anaconda, LXQt-PolicyKit, and openSUSE's build tools — then argued back when reviewers pushed on the changes. One made it into a release before being reverted. It's a concrete demonstration of what "AI-assisted supply chain attack" actually looks like in practice.

04 Jun 2026 · AI Beat Desk

Claude's Blast Radius Problem

Anthropic's engineering post on Claude containment describes three different sandboxing approaches across claude.ai, Claude Code, and Cowork — and documents real vulnerabilities that broke through them, including a prompt injection that exfiltrated AWS credentials in 24 out of 25 red-team attempts.

31 May 2026 · AI Beat Desk

The Blast Radius Problem: How Anthropic Sandboxes Its Own Models

Anthropic's engineering blog documents the production sandboxing stack across claude.ai, Claude Code, and Cowork — three deployment contexts with different trust surfaces and different isolation primitives. The post is notable for what it admits: several real vulnerabilities, a consistent lesson that custom-built security components underperform battle-tested ones, and an honest account of how the threat model has changed as agents gained more capability.

29 May 2026 · AI Beat Desk

The Message Hidden in the Build Log

jqwik 1.10.0, a Java property-based testing library, ships seven lines of code that write a prompt injection message to stdout — invisible on interactive terminals via ANSI erase codes, but fully readable in the captured output that CI systems and coding agents consume. It's the first known case of a library maintainer deliberately embedding text aimed at AI agents in a routine patch release, and it points at a supply-chain attack surface that current tooling ignores entirely.
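
The gap between what a terminal shows and what a log captures can be checked mechanically. The following sketch flags lines in captured output that contain an ANSI erase sequence and returns the text a terminal would have wiped. It is not jqwik's code or an existing scanner: the function name is hypothetical, and it assumes the hiding relies on the standard Erase in Line (`K`) or Erase in Display (`J`) control sequences.

```python
import re

# Any CSI sequence: ESC [ parameter bytes, intermediate bytes, final byte.
CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
# Erase in Line (K) and Erase in Display (J) can wipe already-printed text.
ERASE = re.compile(r"\x1b\[[0-9;]*[JK]")


def flag_erased_text(captured: str) -> list[str]:
    """Return the plain text of each line that contains an erase sequence."""
    flagged = []
    for line in captured.split("\n"):
        if ERASE.search(line):
            # Strip control sequences and carriage returns to expose the text.
            flagged.append(CSI.sub("", line).replace("\r", ""))
    return flagged
```

Fed a CI log, this returns the text that a human at an interactive terminal would never see but a coding agent reading the raw capture would. Legitimate progress bars also use erase sequences, so the output needs a human look rather than automatic blocking.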

26 May 2026 · AI Beat Desk

The Low-Risk Action That Wasn't

PromptArmor published a working indirect prompt injection exploit against Microsoft Copilot Cowork that achieves file exfiltration from SharePoint and OneDrive with a 5-for-5 success rate, including against Claude Opus 4.7. The attack works because Cowork auto-approves Teams and email sends, and because pre-authenticated download links can be embedded in those messages as image tag query parameters. It's a reminder that "human-in-the-loop" only means something if the loop actually catches actions like these.

26 May 2026 · AI Beat Desk

Five Days from First Bug to Root Shell

Apple's macOS 26.5 security notes credit Calif and Anthropic Research for CVE-2026-28952, completing the public lifecycle of a kernel exploit that a small team built with Claude Mythos in five days. It's the first publicly disclosed macOS kernel exploit to survive Memory Integrity Enforcement on M5 silicon, and the speed at which a two-person team crossed that line says something about how AI changes the economics of high-end security research.

23 May 2026 · AI Beat Desk

The Bottleneck Has Moved

Anthropic's first Glasswing progress report shows Mythos Preview found 10,000+ high- and critical-severity vulnerabilities across partner organizations in a single month, including 271 in Firefox alone. The hard constraint is no longer discovery. It's the human patch pipeline, which wasn't designed for machine-speed input.

19 May 2026 · AI Beat Desk

When the AI Builds the Proof of Concept

Cloudflare tested Anthropic's Mythos Preview, a security-focused model released under Project Glasswing, against fifty of its own internal repositories. The model can do something earlier tools couldn't: chain small vulnerability primitives into working exploits, then write and run proof-of-concept code to confirm exploitability. Cloudflare's eight-stage agent pipeline is a detailed blueprint for how production-grade AI security research actually has to be structured.

16 May 2026 · AI Beat Desk

Speculative Decoding Has an Acceptance Problem You Can Exploit

Mistletoe (arXiv 2605.14005) demonstrates a stealthy adversarial attack on speculative decoding systems: craft inputs that look normal to the target model but cause the draft model to disagree, collapsing acceptance length and throughput while leaving output quality and perplexity unchanged. The attack exploits the fundamental gap between draft and target distributions that all speculative systems rely on bridging.
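
Why a drop in acceptance hurts throughput so much can be seen in the standard simplified model of speculative decoding, sketched below. This is not from the Mistletoe paper. It assumes each drafted token is accepted independently with probability `alpha` and that the draft model proposes `gamma` tokens per step, in which case the expected number of tokens produced per target-model pass is (1 - alpha^(gamma+1)) / (1 - alpha).

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model pass.

    alpha: assumed per-token acceptance probability (i.i.d. simplification).
    gamma: number of tokens the draft model proposes per step.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    if alpha == 1.0:
        # Every drafted token is accepted, plus one token from the target.
        return float(gamma + 1)
    return (1 - alpha ** (gamma + 1)) / (1 - alpha)


for alpha in (0.9, 0.6, 0.3):
    print(alpha, round(expected_tokens_per_step(alpha, gamma=4), 2))
```

At `alpha = 0` the system degrades to one token per target pass while still paying for the draft model, which is the worst case an acceptance-collapsing input pushes toward.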

04 May 2026 · AI Beat Desk

Tracing the Model's Family Tree

Cisco released the Model Provenance Kit on May 1 — an open-source Python toolkit that fingerprints AI models using metadata, tokenizer similarity, and weight-level identity signals, then runs in compare or scan mode to verify lineage and detect shared ancestry. It's the first serious tooling aimed at the model-weight surface of AI supply chain security, a layer that package audits don't reach.

01 May 2026 · AI Beat Desk

The AI Stack Keeps Getting Targeted

Versions 2.6.2 and 2.6.3 of the `lightning` PyPI package were compromised on April 30 with credential-stealing malware, part of the ongoing Mini Shai-Hulud campaign that has now hit LiteLLM, Telnyx, Xinference, and PyTorch Lightning in rapid succession. The attack bundles a Node.js-compatible runtime inside a Python training library to execute an 11 MB JavaScript payload — a cross-ecosystem technique that raises the floor for what supply-chain vigilance now requires.

22 Apr 2026 · AI Beat Desk

A Proxy at the Edge of the Agent

Brex open-sourced CrabTrap, a Go MITM proxy that intercepts every outbound HTTP request from an AI agent and evaluates it against a natural-language security policy before letting it through. The approach is genuinely useful for catching exfiltration attempts, while raising a fair question about whether a probabilistic judge belongs in a security-critical path.

20 Apr 2026 · AI Beat Desk

Prove You Are a Robot

Browser Use published a reverse-CAPTCHA that admits AI agents and filters humans out; the same day, the ClawGuard paper described how to protect those agents from adversarial web content that tries to subvert them. Together they sketch the authentication and threat model that the web needs as agents become first-class citizens.

14 Apr 2026 · AI Beat Desk

The Vulnerability Benchmark That Knows What You've Already Read

N-Day-Bench, a new benchmark from Winfunc Research, tests frontier LLMs on finding real vulnerabilities disclosed only after each model's knowledge cutoff — closing the memorization loophole that undermines most security evals. The April 13 run shows GPT-5.4 clearly ahead of the pack, with GLM-5.1 and Claude Opus 4.6 clustered close behind and Gemini 3.1 Pro trailing by 15 points. The methodology is the interesting part.

12 Apr 2026 · AI Beat Desk

The Moat Is the System, Not the Model

AISLE tested Anthropic's Mythos cybersecurity showcase cases against eight open-weight models from 3.6B to 120B parameters. All eight reproduced the FreeBSD NFS exploit. A 5.1B model traced the OpenBSD integer overflow chain. Smaller open models beat frontier-lab models on false-positive detection. Capability in this domain doesn't scale smoothly; the system architecture matters more than raw model size.

04 Apr 2026 · AI Beat Desk

The Bug Is Probably in This File

Nicholas Carlini ran Claude Opus 4.6 over the Linux kernel source one file at a time and collected five confirmed CVEs, including a 23-year-old NFSv4 heap overflow that had survived every prior audit. The human review queue, not the AI's discovery rate, is now the bottleneck.

01 Apr 2026 · AI Beat Desk

What the Source Maps Revealed

Anthropic accidentally shipped source maps in their Claude Code npm package, exposing the full client-side source. The analysis that followed is worth reading not for the drama of a leak but for what the code reveals about the product's actual architecture: anti-distillation mechanisms, an "undercover mode" for employee contributions, and an unreleased background agent called KAIROS.

29 Mar 2026 · AI Beat Desk

Something Happened a Month Ago

Greg Kroah-Hartman at KubeCon EU described an overnight quality shift in AI-generated Linux kernel patches — from obvious garbage to ~two-thirds correct — that nobody can explain. Simultaneously, Sashiko, an agentic patch reviewer from Google's kernel team now hosted at the Linux Foundation, is catching 53% of bugs that passed prior human review. AI is entering the kernel review pipeline from both directions at once.

26 Mar 2026 · AI Beat Desk

What Does an AI Actually Know How to Do?

ARC-AGI-3 results expose limits of frontier LLMs on interactive exploration while the LiteLLM compromise underscores escalating supply-chain risk.