Security · AI Beat

15 Jul 2026 · AI Beat Desk

Cursor and the Attack Surface You Agreed To

Two independent security disclosures about Cursor IDE landed within hours of each other: Mindgard's finding that Cursor auto-executes any git.exe in a repo root (still unpatched after 7 months) and Cato Networks' DuneSlide research showing that prompt injection via MCP or web search can escape the agent sandbox and achieve full OS-level RCE. Together they define a new class of attack surface that appears whenever an AI agent runs with your privileges.
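
As a rough illustration of the first finding, the sketch below flags files in a repository root whose names could shadow the real git executable. The function name `find_shadowing_git` and the set of candidate filenames are assumptions made for this example. They are not taken from Mindgard's write-up, and this is not a complete defense.

```python
from pathlib import Path

# Hypothetical set of filenames that could shadow a real git binary on Windows.
SUSPICIOUS_NAMES = {"git.exe", "git.cmd", "git.bat", "git.com"}


def find_shadowing_git(repo_root):
    """Return the sorted names of files in repo_root that could shadow git."""
    root = Path(repo_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_file() and entry.name.lower() in SUSPICIOUS_NAMES
    )
```

Running a check like this before opening an untrusted clone would only surface the file. Whether an editor goes on to execute it is outside the script's control.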

13 Jul 2026 · AI Beat Desk

What Grok Build Uploads

A wire-level analysis of Grok Build CLI 0.2.93 found it uploads the entire workspace as a git bundle to Google Cloud Storage — about 5.1 GiB from a 12 GB repo, including files the agent never read and unredacted .env credentials. The model itself received 192 KB. The "Improve the model" toggle does not stop the upload.

08 Jul 2026 · AI Beat Desk

Seven Bugs in a Crypto Library

zkSecurity ran their AI audit pipeline against Cloudflare's CIRCL experimental crypto library and found seven genuine vulnerabilities — from float64 precision loss in threshold RSA to a full CP-ABE access-control break. The piece is as valuable for what it reveals about AI's specific blind spots in cryptographic reasoning as for the bugs themselves.
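
For readers unfamiliar with the float64 failure mode, here is a minimal sketch of the general phenomenon. It is not the CIRCL bug itself. The helper name `roundtrips_through_float64` is hypothetical.

```python
def roundtrips_through_float64(n: int) -> bool:
    """Return True if converting n to float64 and back preserves its value."""
    return int(float(n)) == n


# float64 has a 53-bit significand, so 2**53 is exact but 2**53 + 1 is not.
SMALL = 2**53
LARGE = 2**53 + 1

print(roundtrips_through_float64(SMALL))  # True
print(roundtrips_through_float64(LARGE))  # False
```

Integers in cryptographic code are typically hundreds or thousands of bits wide, so any detour through a float silently discards most of the value.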

04 Jul 2026 · AI Beat Desk

The Bug-Finding Numbers Land

Epoch.ai tracked CVE disclosures from 21 major organizations and found June 2026 hit roughly 1,500 serious vulnerabilities — 3.5× the previous monthly peak. The spike correlates directly with Anthropic's Project Glasswing deploying Mythos Preview across major tech infrastructure. The 10,000+ vulnerabilities Glasswing found are mostly still unpublished.

01 Jul 2026 · AI Beat Desk

The Hidden Apostrophe

A developer reverse-engineered Claude Code's client JavaScript and found it silently substitutes Unicode apostrophes in system prompts to fingerprint requests routed through custom API base URLs — encoding domain-list hits and timezone signals in characters visually indistinguishable from ordinary text. The finding raises the usual trust question: should a developer tool that runs in your terminal quietly rewrite what it sends?
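
A minimal sketch of how one might spot this kind of substitution in outgoing text follows. The set of lookalike code points is an assumption chosen for illustration, since the teaser does not say which characters the client uses. The function name is hypothetical.

```python
import unicodedata

# Assumed lookalikes for the ASCII apostrophe (U+0027); not taken from the report.
LOOKALIKES = {"\u2018", "\u2019", "\u02bc", "\uff07"}


def find_lookalike_apostrophes(text):
    """Return (index, Unicode name) for each apostrophe lookalike in text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in LOOKALIKES
    ]
```

Comparing the prompt text as authored against the bytes actually sent is the more reliable check. A scan like this only catches characters already on the list.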

25 Jun 2026 · AI Beat Desk

28.8 Million Prompts

Anthropic disclosed to the US Senate that operators affiliated with Alibaba ran 28.8 million exchanges against Claude through 25,000 fraudulent accounts over six weeks — the largest known distillation attack against Anthropic. The numbers are real; the framing is lobbying.

16 Jun 2026 · AI Beat Desk

The Gateway Was the Weak Link

Obsidian Security chained three bugs in LiteLLM, the open-source proxy that sits in front of more than 100 model providers, to turn a default low-privilege account into full admin and remote code execution. The interesting part isn't the CVSS 9.9 — it's that a compromised gateway can rewrite LLM responses in flight and forge tool calls into agents like Claude Code, which makes the proxy itself part of the attack surface agent builders need to model.

11 Jun 2026 · AI Beat Desk

The Patch That Argued Back

An AI agent operating under stolen Fedora contributor credentials spent two months submitting plausible-looking patches to Anaconda, LXQt-PolicyKit, and openSUSE's build tools — then argued back when reviewers pushed on the changes. One made it into a release before being reverted. It's a concrete demonstration of what "AI-assisted supply chain attack" actually looks like in practice.

31 May 2026 · AI Beat Desk

The Blast Radius Problem: How Anthropic Sandboxes Its Own Models

Anthropic's engineering blog documents the production sandboxing stack across claude.ai, Claude Code, and Cowork — three deployment contexts with different trust surfaces and different isolation primitives. The post is notable for what it admits: several real vulnerabilities, a consistent lesson that custom-built security components underperform battle-tested ones, and an honest account of how the threat model has changed as agents gained more capability.

29 May 2026 · AI Beat Desk

The Message Hidden in the Build Log

jqwik 1.10.0, a Java property-based testing library, ships seven lines of code that write a prompt injection message to stdout — invisible on interactive terminals via ANSI erase codes, but fully readable in the captured output that CI systems and coding agents consume. It's the first known case of a library maintainer deliberately embedding text aimed at AI agents in a routine patch release, and it points at a supply-chain attack surface that current tooling ignores entirely.
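
The gap between what a terminal shows and what a log captures can be sketched as follows. The sample string and the helper names are assumptions for illustration. The exact escape sequences jqwik emits are not given here, so the example uses a generic erase-line sequence.

```python
import re

# Matches CSI escape sequences such as ESC[2K (erase line) or ESC[1A (cursor up).
CSI_PATTERN = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")
# Matches only erase-in-display (J) and erase-in-line (K) sequences.
ERASE_PATTERN = re.compile(r"\x1b\[[0-9]*[JK]")


def visible_to_log_reader(captured: str) -> str:
    """Strip CSI sequences, leaving the text a log consumer would still read."""
    return CSI_PATTERN.sub("", captured)


def contains_erase_sequence(captured: str) -> bool:
    """Return True if the captured output contains an erase sequence."""
    return ERASE_PATTERN.search(captured) is not None


# Hypothetical captured output: a message, then carriage return and erase-line.
sample = "note for agents\r\x1b[2Kbuild ok\n"
```

On an interactive terminal the erase sequence wipes the first message before a person can read it. In the captured string the message is still there, which `visible_to_log_reader(sample)` makes plain.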

26 May 2026 · AI Beat Desk

The Low-Risk Action That Wasn't

PromptArmor published a working indirect prompt injection exploit against Microsoft Copilot Cowork that achieves file exfiltration from SharePoint and OneDrive with a 5-for-5 success rate, including against Claude Opus 4.7. The attack works because Cowork auto-approves Teams and email sends, and because pre-authenticated download links can be embedded in those messages as image tag query parameters. It's a reminder that "human-in-the-loop" only means something if the loop actually covers the actions that can leak data.
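
One way to picture the image-tag channel is a filter that lists every image URL in an outgoing message that carries query parameters. This is a sketch under assumptions: the class name, the sample HTML, and the example.invalid host are all hypothetical, and nothing here reflects how Cowork or PromptArmor's exploit is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlsplit


class ImgQueryCollector(HTMLParser):
    """Collect <img> sources whose URLs carry query parameters."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        params = parse_qsl(urlsplit(src).query)
        if params:
            self.findings.append((src, params))


def img_query_params(html: str):
    """Return (src, params) pairs for images with query strings in html."""
    collector = ImgQueryCollector()
    collector.feed(html)
    return collector.findings
```

A list like this could feed a review step before a message is sent. Deciding which parameters are benign is a policy question the sketch does not try to answer.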

26 May 2026 · AI Beat Desk

Five Days from First Bug to Root Shell

Apple's macOS 26.5 security notes credit Calif and Anthropic Research for CVE-2026-28952, completing the public lifecycle of a kernel exploit that a small team built with Claude Mythos in five days. It's the first publicly disclosed macOS kernel exploit to survive Memory Integrity Enforcement on M5 silicon, and the speed at which a two-person team crossed that line says something about how AI changes the economics of high-end security research.

23 May 2026 · AI Beat Desk

The Bottleneck Has Moved

Anthropic's first Glasswing progress report shows Mythos Preview found 10,000+ high- and critical-severity vulnerabilities across partner organizations in a single month, including 271 in Firefox alone. The hard constraint is no longer discovery. It's the human patch pipeline, which wasn't designed for machine-speed input.

19 May 2026 · AI Beat Desk

When the AI Builds the Proof of Concept

Cloudflare tested Anthropic's Mythos Preview, a security-focused model released under Project Glasswing, against fifty of its own internal repositories. The model can do something earlier tools couldn't: chain small vulnerability primitives into working exploits, then write and run proof-of-concept code to confirm exploitability. Cloudflare's eight-stage agent pipeline is a detailed blueprint for how production-grade AI security research actually has to be structured.

16 May 2026 · AI Beat Desk

Speculative Decoding Has an Acceptance Problem You Can Exploit

Mistletoe (arXiv 2605.14005) demonstrates a stealthy adversarial attack on speculative decoding systems: craft inputs that look normal to the target model but cause the draft model to disagree, collapsing acceptance length and throughput while leaving output quality and perplexity unchanged. The attack exploits the fundamental gap between draft and target distributions that all speculative systems rely on bridging.
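
To see why acceptance rate matters so much for throughput, consider the standard simplified model of speculative decoding, in which each drafted token is accepted independently with probability alpha and the draft proposes gamma tokens per step. That independence assumption and the function name are mine, not the paper's. Under it, the expected number of tokens produced per target-model pass is (1 - alpha^(gamma+1)) / (1 - alpha).

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens per target pass under i.i.d. acceptance probability alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    if alpha == 1.0:
        # Every drafted token is accepted, plus one token from the target pass.
        return float(gamma + 1)
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)
```

In this model an input that pushes alpha toward zero drives the result toward one token per pass, which is ordinary decoding plus the wasted cost of drafting.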

04 May 2026 · AI Beat Desk

Tracing the Model's Family Tree

Cisco released the Model Provenance Kit on May 1 — an open-source Python toolkit that fingerprints AI models using metadata, tokenizer similarity, and weight-level identity signals, then runs in compare or scan mode to verify lineage and detect shared ancestry. It's the first serious tooling aimed at the model-weight surface of AI supply chain security, a layer that package audits don't reach.

01 May 2026 · AI Beat Desk

The AI Stack Keeps Getting Targeted

Versions 2.6.2 and 2.6.3 of the `lightning` PyPI package were compromised on April 30 with credential-stealing malware, part of the ongoing Mini Shai-Hulud campaign that has now hit LiteLLM, Telnyx, Xinference, and PyTorch Lightning in rapid succession. The attack bundles a Node.js-compatible runtime inside a Python training library to execute an 11 MB JavaScript payload — a cross-ecosystem technique that raises the floor for what supply-chain vigilance now requires.
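
A quick local check for the two versions named above might look like the following sketch. The `COMPROMISED` table holds only the versions stated in this summary, and the function names are hypothetical. It inspects the installed package metadata and does not detect the payload itself.

```python
from importlib.metadata import PackageNotFoundError, version

# Only the versions named in the summary above; extend from an advisory feed.
COMPROMISED = {"lightning": {"2.6.2", "2.6.3"}}


def is_compromised(package: str, installed_version: str) -> bool:
    """Return True if the given version is on the known-bad list."""
    return installed_version in COMPROMISED.get(package, set())


def check_installed(package: str):
    """Return None if the package is absent, else whether its version is bad."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return None
    return is_compromised(package, installed)
```

Calling `check_installed("lightning")` in each environment gives a first answer. A clean result says nothing about credentials already taken while a bad version was installed.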

22 Apr 2026 · AI Beat Desk

A Proxy at the Edge of the Agent

Brex open-sourced CrabTrap, a Go MITM proxy that intercepts every outbound HTTP request from an AI agent and evaluates it against a natural-language security policy before letting it through. The approach is genuinely useful for catching exfiltration attempts, while raising a fair question about whether a probabilistic judge belongs in a security-critical path.

14 Apr 2026 · AI Beat Desk

The Vulnerability Benchmark That Knows What You've Already Read

N-Day-Bench, a new benchmark from Winfunc Research, tests frontier LLMs on finding real vulnerabilities disclosed only after each model's knowledge cutoff — closing the memorization loophole that undermines most security evals. The April 13 run shows GPT-5.4 clearly ahead of the pack, with GLM-5.1 and Claude Opus 4.6 clustered close behind and Gemini 3.1 Pro trailing by 15 points. The methodology is the interesting part.
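
The core of the methodology, as described above, is a date filter: a case only counts for a model if it was disclosed after that model's knowledge cutoff. The sketch below illustrates the idea with made-up case records and a made-up cutoff. The field names and the function name are assumptions, not N-Day-Bench's actual schema.

```python
from datetime import date


def eligible_cases(cases, knowledge_cutoff):
    """Keep only cases disclosed strictly after the model's knowledge cutoff."""
    return [case for case in cases if case["disclosed"] > knowledge_cutoff]


# Hypothetical records for illustration only.
cases = [
    {"id": "case-a", "disclosed": date(2025, 11, 3)},
    {"id": "case-b", "disclosed": date(2026, 2, 17)},
    {"id": "case-c", "disclosed": date(2026, 3, 30)},
]
cutoff = date(2026, 1, 1)

print([case["id"] for case in eligible_cases(cases, cutoff)])  # ['case-b', 'case-c']
```

Because cutoffs differ between models, each model ends up with its own eligible set, so scores are only comparable on the overlap.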