Copilot Signs the Commit Whether You Asked It To or Not

VS Code 1.118, released April 29, silently turned on automatic Copilot co-authorship for git commits by changing the default value of git.addAICoAuthor from "off" to "all". The feature is buggy: it fires even when AI features are disabled. It has already stamped 4M+ GitHub commits with a non-human co-author, surfacing awkward questions about copyright ownership that the US Copyright Office has already answered.

Read more →

Finetuning Unlocks the Books That Were Always There

A paper from Columbia and UW shows that finetuning frontier models on plot-summary expansions — no actual book text in training — triggers verbatim recall of 85–90% of held-out copyrighted novels. The result generalizes across authors and across providers, and directly challenges the argument that safety alignment serves as adequate copyright protection.

Read more →

What You Get When You Only Train on Public Domain Text

Mr. Chatterbox is a 340M-parameter model trained exclusively on 28,000 Victorian-era texts from the British Library — definitively public domain, zero copyright exposure. Simon Willison's writeup documents both what it proves and what it falls short of: the corpus is large enough to train something coherent, but not large enough to be useful by Chinchilla norms.
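
For a sense of scale, here is a rough sketch of what "Chinchilla norms" would ask of a model this size. It assumes the commonly cited rule of thumb of about 20 training tokens per parameter; that ratio and the function name are illustrative assumptions, not figures taken from the writeup.

```python
# Assumption: ~20 training tokens per parameter, the rule of thumb
# commonly drawn from the Chinchilla scaling results. Not from the writeup.
TOKENS_PER_PARAM = 20


def chinchilla_token_target(n_params: int, tokens_per_param: int = TOKENS_PER_PARAM) -> int:
    """Return the approximate compute-optimal training token count."""
    return n_params * tokens_per_param


params = 340_000_000  # Mr. Chatterbox's parameter count, as stated above
target = chinchilla_token_target(params)
print(f"{params / 1e6:.0f}M parameters -> about {target / 1e9:.1f}B training tokens")
```

Under that assumption the target is about 6.8B tokens. The corpus's actual token count is not given here, so the comparison itself is left to the linked writeup.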

Read more →