Fifty Nanoseconds to Decide

CERN has been running AI models on FPGAs at the LHC for years, but a Register piece this week described the system in unusual detail. The Level-1 Trigger filters 40 million collision events per second down to roughly 100,000, making each keep-or-discard decision in under 50 nanoseconds with models small enough to fit in precomputed lookup tables. The tool making this possible is hls4ml, an open-source transpiler that converts PyTorch models into synthesizable FPGA firmware. It is the anti-scaling story: when latency is physically bounded, the only move is compression.

Read more →
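The lookup-table trick in the blurb is easy to see in miniature. This is an illustrative sketch in plain NumPy, not hls4ml code or the trigger's actual firmware: a transcendental activation (tanh here) is precomputed over an 8-bit quantized input range, so inference spends memory once and then only indexes a table.

```python
import numpy as np

# Illustrative sketch (not hls4ml's implementation): replace a runtime tanh
# with a precomputed 256-entry lookup table over a fixed-point input range,
# the same trade the trigger firmware makes -- pay memory once, decide fast.

BITS = 8                      # quantized input width
LO, HI = -4.0, 4.0            # clipping range; tanh saturates outside it
STEP = (HI - LO) / (2**BITS - 1)

# Built once, offline.
TANH_LUT = np.tanh(LO + STEP * np.arange(2**BITS))

def tanh_lut(x: np.ndarray) -> np.ndarray:
    """Approximate tanh via table lookup: no transcendental math at inference."""
    idx = np.clip(np.round((x - LO) / STEP), 0, 2**BITS - 1).astype(np.intp)
    return TANH_LUT[idx]

x = np.linspace(-6, 6, 1001)
max_err = np.max(np.abs(tanh_lut(x) - np.tanh(x)))
print(f"max abs error vs. np.tanh: {max_err:.4f}")
```

With 8 bits of input resolution the worst-case error stays below about 0.02; the constraint in the real system is that the whole model, not just one activation, must fit in on-chip tables and arithmetic.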

Arm Bets the Model

Arm's first production AI CPU, Google's TurboQuant, and Hypura's NVMe-first runtime all point to the same inference bottleneck: memory bandwidth.

Read more →
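The memory-bandwidth framing reduces to simple arithmetic for autoregressive decode: each generated token must stream the full weight set through the memory system, so tokens per second is bounded by bandwidth divided by model size. A back-of-envelope sketch with hypothetical numbers (a 7B-parameter model in 16-bit weights on a 400 GB/s memory system; these figures are assumptions, not from the articles):

```python
# Roofline-style ceiling for memory-bound decode (illustrative numbers):
# every token streams all weights through the memory system once.

params = 7e9            # hypothetical 7B-parameter model
bytes_per_param = 2     # 16-bit weights
bandwidth = 400e9       # hypothetical 400 GB/s memory system

model_bytes = params * bytes_per_param
tokens_per_s = bandwidth / model_bytes   # upper bound; ignores KV cache traffic
print(f"decode ceiling: {tokens_per_s:.1f} tokens/s")  # ~28.6
```

Quantization (TurboQuant's angle) raises the ceiling by shrinking `model_bytes`; faster storage tiers (Hypura's angle) attack where those bytes live.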