Last month, the U.S. Copyright Office released a report from its study of the fair-use implications of training AI systems on copyrighted works without permission. As is widely known, the Register of Copyrights was then abruptly, and somewhat mysteriously, removed from office.

https://www.copyright.gov/ai/

An outsized political drama has erupted around an agency that normally attracts little public attention. Some news outlets even suggested the Office had shifted toward rejecting fair-use defenses for AI training. Personally, I do not find the report’s conclusions shocking; they strike me as a reasonable landing point for an eventual infringement analysis. In the United States, many people seem to believe, perhaps naively, that once a use is deemed fair it remains fair forever. Given the vast sums now invested in AI, a report that questions blanket fair-use assumptions was bound to feel jarring to those who regarded the doctrine as an all-purpose shield.

The report’s through-line is clear: “training a generative AI foundation model on a large and diverse dataset will often be transformative” (p. 45). Because a “transformative” use serves a purpose different from the original’s expressive purpose, it is less likely to substitute for the work in its market and therefore weighs in favor of fair use. Yet the report also cautions that “where a model may generate expressive content or reproduce copyrighted expression, training-purpose use cannot be deemed ‘non-expressive’”. In other words, claims that AI training is nothing more than a technical process, or that it is simply “learning” the way a human does, do not automatically confer fair-use status. The Office’s position is that fair-use analysis of training must consider not just the technical steps but the purpose of the use and the eventual model’s impact on the markets for the underlying works, including its ability to mimic an author’s style.

Although the study is impressively researched, I believe the Office overreaches when it assesses the market effects that AI models might have. This concerns the fourth fair-use factor: “the effect of the use upon the potential market for or value of the copyrighted work.” The Office signals that virtually any “effect” on a potential market should be counted, including the possibility that a model could generate material similar in content or style to works in the training data, thereby depressing sales, diluting the market, or undermining future licensing opportunities, even though copyright law traditionally does not protect “style.” Because both the empirical record and the case law are still thin, I believe the Office has drawn premature conclusions here.

On the first fair-use factor, the report leans heavily on lessons from Andy Warhol Foundation v. Goldsmith, making transformativeness the analytical core. Google Books is certainly important in showing that massive copying can be permissible when tied to a transformative purpose and when the final output matters, but Warhol probed more deeply into the nature of transformativeness itself and how it differs from the derivative-works right. That ruling therefore offers more direct guidance for evaluating AI training and underscores the need for a careful, fact-specific copyright inquiry going forward.

For the record, I drafted these comments only a few days after the report’s release but left them in my drafts folder when the unseemly political drama unfolded. I have now touched them up slightly and made them public as part of cleaning out that folder.

The Hidden Risks of NVIDIA’s Open Model License

Scattered media reports have recently, and mistakenly, described NVIDIA’s open-weights AI model “Nemotron 3” as open source. Because there is concern that such reports encourage users to ignore the usage risks of the NVIDIA Open Model License Agreement (version dated October 24, 2025; hereinafter the NVIDIA License), which is…

The Current State of the Theory that GPL Propagates to AI Models Trained on GPL Code

When GitHub Copilot launched in 2021, the fact that its training data included a vast amount of Open Source code publicly available on GitHub attracted significant attention and sparked lively licensing debates. Beyond questions about conditions such as the attribution that most licenses require, a particularly large volume of commentary suggested…

The Legal Hack: Why U.S. Law Sees Open Source as “Permission,” Not a Contract

In Japan, the common view is to treat an Open Source license as a license agreement, or a contract. This is also the case in the EU. However, in the United States—the origin point for almost every aspect of Open Source—an Open Source license has long been considered not a contract, but a “unilateral permission”…

Evaluating OpenMDW: A Revolution for Open AI, or a License to Openwash?

Although the number of AI models distributed under Open Source licenses is increasing, AI systems in which every related component, including the training data, is open remain at an early stage, even as a few promising systems have emerged. In this context, this past May, the Linux Foundation, in collaboration…

A Curious Phenomenon with Gemma Model Outputs and License Propagation

While examining the licensing details of Google’s Gemma model, I noticed a potentially puzzling phenomenon: you can freely assign a license to the model’s outputs, yet depending on how those outputs are used, the original Terms of Use might suddenly propagate to the resulting work. Outputs vs. Model Derivatives: The Gemma Terms of Use distinguish…

Should ‘Open Source AI’ Mean Exposing All Training Data?

DeepSeek has had a major global impact. This appears to stem not only from the emergence of a new force in China that threatens the dominance of major U.S. AI vendors, but also from the fact that the AI model itself is being distributed under the MIT License, which is an Open Source license. Nevertheless,…

Significant Risks in Using AI Models Governed by the Llama License

Although it has already been explained that the Llama model and the Llama License (Llama Community License Agreement) do not, in any sense, qualify as Open Source, the Llama License contains several additional issues worth noting. While not directly relevant to whether it meets Open Source criteria, these provisions may nonetheless cause the…

The Hidden Traps in Meta’s Llama License

— An Explanation of Llama’s Supposed “Open Source” Status and the Serious Risks of Using Models under the Llama License — It is widely recognized—despite Meta’s CEO persistently promoting the notion that “Llama is Open Source”—that the Llama License is in fact not Open Source. Yet few individuals have clearly articulated the precise reasons why…