MiniMax open-sourced M2.7, a 229B sparse MoE model for coding and agentic work. The interesting part isn't the benchmarks — it's the self-evolution loop: an internal M2.7 instance ran 100+ rounds, autonomously modifying its own programming scaffold, keeping what worked and reverting what didn't, and came out 30% better with no per-step human direction. That's a different kind of claim than standard RL post-training.
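MiniMax hasn't published the loop's internals, so treat this as a minimal sketch of the keep-or-revert dynamic only: a hill-climbing loop over a stand-in "scaffold," with a toy fitness function in place of a real benchmark run.

```python
import random

def evaluate(scaffold):
    """Toy fitness stand-in for a real benchmark run; higher is better."""
    return -sum((x - 0.5) ** 2 for x in scaffold)

def propose_mutation(scaffold):
    """Randomly perturb one 'scaffold' parameter."""
    candidate = list(scaffold)
    i = random.randrange(len(candidate))
    candidate[i] += random.uniform(-0.2, 0.2)
    return candidate

def self_evolve(scaffold, rounds=100):
    """Keep mutations that score better; silently revert the rest."""
    best_score = evaluate(scaffold)
    for _ in range(rounds):
        candidate = propose_mutation(scaffold)
        score = evaluate(candidate)
        if score > best_score:       # keep what worked
            scaffold, best_score = candidate, score
        # otherwise: revert (the candidate is simply dropped)
    return scaffold, best_score
```

The real system presumably mutates its own prompting and tool scaffold and evaluates against coding tasks; the accept-or-revert skeleton is the part the announcement describes.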
SkyPilot published an experiment where giving Claude Code research papers to read before it optimized llama.cpp's CPU backend yielded 15% faster text generation on x86 for about $29. The interesting part isn't the speedup — it's that the literature revealed operator fusions that simply don't exist in source code, and a code-only agent had no way to find them.
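To make "fusion that isn't in the source" concrete: llama.cpp's LLaMA-style FFN applies a SiLU activation and then an elementwise gate multiply. A hypothetical Python sketch (not the actual patch) of the unfused two-pass form versus the fused single-pass form:

```python
import math

def silu(v):
    return v / (1.0 + math.exp(-v))

def swiglu_unfused(x, gate):
    """Two passes and one intermediate buffer -- like two separate graph ops."""
    activated = [silu(v) for v in x]                 # pass 1: SiLU(x)
    return [a * g for a, g in zip(activated, gate)]  # pass 2: * gate

def swiglu_fused(x, gate):
    """One pass, no intermediate: each element is read and written once.
    In a C kernel this is a single vectorizable loop -- the kind of
    rewrite that per-operator source code never spells out."""
    return [silu(v) * g for v, g in zip(x, gate)]
```

Both return identical values; the win is memory traffic, which is exactly the kind of opportunity the literature documents and a code-only reading misses.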
Z.AI released GLM-5.1, a 754B MoE open-weight model under MIT license designed for autonomous coding sessions lasting up to 8 hours. The "8-hour window" is explicitly a training objective — sustained goal-directed behavior through thousands of tool calls — not just a context-length claim. GLM-5.1 claims the top spot on SWE-Bench Pro with a score of 58.4, ahead of GPT-5.4 and Claude Opus 4.6.
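The release doesn't document an agent interface, but structurally, "thousands of tool calls under one goal" is a loop like the following (all names and the message schema are hypothetical stand-ins, not GLM-5.1's API):

```python
import json

def run_agent(model_step, tools, goal, max_steps=4000):
    """Minimal goal-directed agent loop: each step the model either
    names a tool to call or declares the goal done."""
    history = [{"role": "user", "content": goal}]
    for step in range(max_steps):
        action = model_step(history)
        if "done" in action:                      # goal declared finished
            return action["done"], step + 1
        result = tools[action["tool"]](**action["args"])
        history.append({"role": "assistant", "content": json.dumps(action)})
        history.append({"role": "tool", "content": json.dumps(result)})
    return None, max_steps                        # budget exhausted
```

The training claim is about staying coherent across this loop for thousands of iterations — that, not raw context length, is what the 8-hour figure measures.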
A preprint from the DeepReinforce Team claims their GrandCode system placed first in three consecutive live Codeforces rounds in March 2026, defeating all human participants. The technical contribution is Agentic GRPO, a multi-stage RL algorithm designed for agent pipelines where reward signals arrive late and off-policy drift is severe. Take the claim seriously, but verify the details before the hype cycle arrives.
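Agentic GRPO's multi-stage details aren't public yet, but the base GRPO building block the name implies is well known: score each sampled trajectory against its own group's mean and standard deviation rather than a learned critic, then apply a clipped surrogate to limit off-policy drift. A sketch under those assumptions:

```python
import statistics

def grpo_advantages(rewards):
    """Base GRPO step: normalize each trajectory's reward against its
    own sampling group, so no learned value function is needed."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0   # guard zero-variance groups
    return [(r - mean) / std for r in rewards]

def clipped_surrogate(ratio, advantage, eps=0.2):
    """PPO-style clipped objective, the standard guard against drift
    between the sampling policy and the policy being updated."""
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return min(ratio * advantage, clipped * advantage)
```

The late-reward and multi-stage handling are the preprint's claimed novelty; this shows only the group-relative baseline and clipping that the name suggests it builds on.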
Cursor 3, released April 2, reframes the IDE as a multi-agent orchestration platform. Parallel agents initiated from mobile, Slack, GitHub, and Linear all surface in a unified sidebar. Cursor is also shipping Composer 2, an in-house frontier coding model. The shift is from "AI assistant inside an editor" to "editor inside an agent coordination system."