VISIVE.AI

U.S. Court Ruling on AI and Copyright: Transformative Use vs. Piracy

Judge William Alsup's ruling in Bartz v Anthropic PBC sets a clear line between lawful and unlawful use of copyrighted materials in AI training.

Jul 01, 2025
Source: Visive.ai

Judge William Alsup’s decision in Bartz v Anthropic PBC, in the United States District Court for the Northern District of California, now stands as one of the most significant U.S. rulings on whether artificial intelligence developers can lawfully use copyright-protected works to train their models.

In a detailed judgment, the Court dissected the mechanics of AI training, treating each stage of use on its own merits. The dispute centered on Anthropic’s method of sourcing books to train its Claude model, which involved a mix of legitimately purchased physical books and millions of others downloaded from pirate sources such as Books3 and LibGen.

When three authors claimed their works were used without permission, Anthropic moved for summary judgment, arguing that every element of its data pipeline qualified as fair use under Section 107 of the U.S. Copyright Act. The Court delivered a split ruling, drawing sharp lines: training a language model on lawfully obtained books was deemed “spectacularly transformative” and protected, while acquiring and hoarding pirated copies could not be excused under any fair use defense.

The Court embraced the principle that using copyrighted works for training is fundamentally transformative when the material is lawfully sourced, likening the statistical modeling process to a human’s ability to read widely and write anew. Although critics argue that this analogy oversimplifies technical realities, the Court held that, without evidence of infringing output, the mere potential for fragmentary reproduction did not disqualify the training process from fair use protection.

Of note is the Court’s acceptance of the claim that a model might contain “compressed copies” of its training data, a notion many experts reject, given that large language models store statistical relationships between tokens, not miniature libraries of digital books. While the ruling’s fair use outcome did not hinge on this technical characterisation, its persistence in the record risks sowing confusion in future disputes.

The Court was unyielding in its condemnation of pirated training inputs. Having downloaded millions of books from well-known pirate sources and stored them as part of its training corpus, Anthropic could not shield its conduct behind a transformative purpose. Fair use, the Court made plain, does not wipe away the consequences of unlawful acquisition, nor does it entitle an AI developer to sidestep legitimate markets in the name of innovation. This ruling sends a clear message to the industry: no matter how defensible the output is, an AI system’s legitimacy can be fatally undermined if its inputs are tainted at the source.

Anthropic did manage to salvage its defence with respect to scanning its own lawfully purchased books, which the Court accepted as fair use because the scanning served purely internal, practical functions and did not create new market substitutes. This part of the judgment reinforces that format-shifting for internal efficiencies, when carefully circumscribed, can remain lawful under U.S. copyright doctrine.

The ruling also highlights an emerging tension with European law. Under the EU’s Copyright in the Digital Single Market Directive, rightsholders can reserve rights against text and data mining for commercial purposes. If a U.S.-based company circumvents such reservations by scraping European works or sidestepping opt-outs under Article 4, American courts could potentially treat that conduct as functionally equivalent to piracy. The judgment suggests that acquiring material in breach of non-U.S. rights reservations, even if the end use is technically transformative, could taint the entire process under U.S. law because the decisive test turns on how the input was sourced, not just how it is processed.

With this decision, the Court plants a marker for what U.S. courts may and may not permit in an era of expansive AI development. While far from the final word on AI and copyright, Bartz v Anthropic offers the clearest sign yet that companies must be prepared to defend not only what their models can do but also to prove precisely how and from where their training data was acquired. This evidential burden is likely to shape how datasets are licensed, documented, and litigated in the years to come.

Frequently Asked Questions

What is the main takeaway from the Bartz v Anthropic PBC ruling?

The ruling establishes that using lawfully obtained copyrighted materials for AI training is transformative and protected under fair use, while using pirated materials is not.

How does the Court define transformative use in this context?

The Court treats a use as transformative when it puts copyrighted materials to a fundamentally new purpose, such as training AI models, and it likens that process to a human's ability to read widely and write anew.

What is the significance of the 'compressed copies' claim in the ruling?

The claim that models might contain 'compressed copies' of training data has been criticized by experts, but its inclusion in the ruling may cause confusion in future disputes.

How does the ruling impact the use of pirated materials in AI training?

The ruling clearly states that acquiring and using pirated materials for AI training cannot be excused under any fair use defense and is considered unlawful.

What are the implications of this ruling for cross-border data acquisition?

The ruling suggests that acquiring material in breach of non-U.S. rights reservations, even if the end use is transformative, could be treated as piracy under U.S. law.
