The Integral Shortcut Through Diffusion Space

Sander Dieleman's post on flow maps frames diffusion model distillation as learning to compute the integral of the velocity field directly, rather than stepping along tangent directions. The reformulation unifies 20+ recent papers under three consistency constraints and explains why single-step sampling is achievable without sacrificing bijectivity.

Read more →

Generation Is Pretraining, in Vision Too

Google DeepMind's Vision Banana paper shows that training a model only to generate images produces transferable visual representations: once lightly instruction-tuned, the model beats specialized discriminative models on segmentation and metric depth estimation. The finding is the visual analog of how LLM pretraining generalizes across language tasks.

Read more →

VOID: Remove the Object, Rewrite the Physics

Netflix and INSAIT Sofia University released VOID, the first open-source video inpainting system that removes objects and regenerates the physical interactions they caused, rather than only filling the hole they left. It is Netflix's first public AI model release, built on a novel quadmask encoding and CogVideoX and released under Apache 2.0.

Read more →