Edge-Compute · AI Beat

08 Jun 2026 · AI Beat Desk

CUDA Comes to Your Laptop

NVIDIA's RTX Spark puts a Blackwell GPU and a full CUDA stack inside a laptop SoC, enough to run a 120B-parameter model locally with a 1M-token context. At roughly the same time, Perplexity shipped a hybrid inference orchestrator that uses a compact on-device model to decide automatically which tasks stay local and which escalate to the cloud. Together they sketch what a local-AI platform actually looks like in hardware and software.