US Judge Rules AI Training on Books Not a Copyright Violation
A US judge has ruled that using books to train AI software is not a violation of US copyright law, but Anthropic must face trial over pirated copies.
The ruling stems from a lawsuit brought against AI firm Anthropic by three authors, including best-selling mystery thriller writer Andrea Bartz.
The authors accused Anthropic of stealing their work to train its Claude AI model and build a multi-billion dollar business. Judge William Alsup, however, found that Anthropic's use of the authors' books was 'exceedingly transformative' and allowed under US law.
Despite this ruling, Judge Alsup rejected Anthropic's request to dismiss the case entirely. The firm will still have to stand trial over its use of pirated copies to build its library of material. Anthropic, backed by Amazon and Google's parent company Alphabet, could face up to $150,000 in damages per copyrighted work.
According to the judge, the firm holds more than seven million pirated books in a 'central library'. The ruling is among the first to address a critical question for the AI industry: how large language models (LLMs) can legitimately learn from existing material.
'Like any reader aspiring to be a writer, Anthropic's LLMs trained upon works, not to race ahead and replicate or supplant them — but to turn a hard corner and create something different,' Judge Alsup wrote. 'If this training process reasonably required making copies within the LLM or otherwise, those copies were engaged in a transformative use.'
The authors did not claim that the training led to 'infringing knockoffs' with replicas of their works being generated for users of the Claude tool. If they had, Judge Alsup noted, 'this would be a different case.'
Similar legal battles have emerged over the AI industry's use of other media and content, from journalistic articles to music and video. This month, Disney and Universal filed a lawsuit against AI image generator Midjourney, accusing it of piracy. The BBC is also considering legal action over the unauthorized use of its content.
In response to these legal battles, some AI companies have struck deals with creators of the original materials or their publishers to license content for use. Judge Alsup's ruling on Anthropic's 'fair use' defense sets an early marker for how courts may treat similar claims in future AI copyright cases.
However, the judge ruled that Anthropic had violated the authors' rights by saving pirated copies of their books as part of a 'central library of all the books in the world.' In a statement, Anthropic expressed satisfaction with the judge's recognition of its transformative use but disagreed with the decision to hold a trial about how some books were obtained and used. The company remains confident in its case and is evaluating its options.
A lawyer for the authors declined to comment.
Frequently Asked Questions
What did the US judge rule about using books to train AI?
The judge ruled that using books to train AI software is not a violation of US copyright law, as it is considered transformative use.
Who brought the lawsuit against Anthropic?
The lawsuit was brought by three authors, including best-selling mystery thriller writer Andrea Bartz.
What did Judge Alsup mean by 'transformative use'?
Transformative use means that the AI training process created something new and different from the original works, rather than simply replicating them.
What are the potential damages Anthropic could face?
Anthropic could face up to $150,000 in damages per copyrighted work, according to the ruling.
What other legal battles are emerging in the AI industry?
Similar legal battles have emerged over the AI industry's use of other media and content, including journalistic articles, music, and video.