The Munich I Regional Court has delivered a sweeping victory for GEMA, Germany’s music rights organization, in its lawsuit against OpenAI, the operator of ChatGPT. The court substantially granted GEMA’s claims for injunctive relief, disclosure of information, and damages. While significant as OpenAI’s first major legal defeat, the ruling is even more momentous for finding copyright infringement in both the generative AI’s training and its output. Although this is only a first-instance decision, its impact should not be taken lightly.

Note: This article is an English translation of a post originally written in Japanese. While it assumes a Japanese reader, I believe it may also be useful for an English-speaking audience.

This lawsuit did not target mass data scraping in the abstract. Instead, its subject matter was narrowly focused on nine well-known German songs, testing whether their lyrics would be reproduced as output from simple prompts. This presents a vastly different landscape from the often-vague US cases that dispute indirect infringement during the AI training process. In my opinion, this strategy is highly effective in a legal jurisdiction that lacks a strong discovery system, as it secures “smoking gun” evidence through the output itself.

Regarding the court’s judgment, first, it affirmed an infringement of the right of reproduction at the model’s training stage. Finding that the lyrics were “reproducibly contained” in the model, the court held that they were “embodied” as “data within the specified parameters.” In essence, this is a judgment that the model’s weights themselves can legally constitute a “reproduction”—an extremely strict finding from the perspective of AI developers. OpenAI apparently deployed its conventional argument that the model contains only “abstract patterns,” but the court treated the verbatim identity of the output as proof of “memorization” within the model. It also rejected OpenAI’s claim of “new generation,” ruling out “coincidence” in light of the length and complexity of the lyrics.

The court didn’t stop at the training phase; it also found a separate infringement of the right of making available to the public in the output. In other words, the very act of OpenAI making the lyrics accessible to users (the public) on demand via ChatGPT constitutes an infringement. My personal takeaway is that this dual finding of infringement–at both the training and output stages–is incredibly severe. Should OpenAI appeal this decision, it must succeed in overturning both the reproduction infringement (in the model) and the making available infringement (in the output). As the court’s logic appears reasonably solid, overturning the entire judgment will be a difficult task.

As this is within the EU, OpenAI naturally argued that its actions were covered by the TDM (Text and Data Mining) exception under the EU DSM Directive. This argument collapsed on the facts, as the court found that GEMA had indeed implemented a valid opt-out. However, in a more groundbreaking move, the court ruled even before considering the opt-out that the act of “memorizing” works and later “outputting” them–as opposed to merely “analyzing” them–fundamentally deviates from the purpose protected by the TDM exception.

However, the Munich court’s decision not to refer this matter to the Court of Justice of the European Union (CJEU), despite it touching upon a fundamental question of interpretation of the EU DSM Directive, may become an issue on appeal. The regional court’s logic appears to be that OpenAI’s actions fall so clearly outside the purpose of the TDM exception that there is no ambiguity requiring a referral. In my view, however, this constitutes an unprecedented interpretation of EU law, and the question should have been referred to the CJEU for a preliminary ruling.

Furthermore, the court rejected OpenAI’s argument that responsibility for the output lies with the user who writes the prompt. The court found OpenAI responsible for selecting the training data, designing the model architecture, and the resulting memorization. The user’s prompt is merely a “trigger”, not the “cause”. This particular point was a relatively predictable outcome.

Looking ahead, OpenAI will undoubtedly appeal. The argument treating the AI model itself as a “legal reproduction” still feels vulnerable. The Getty v. Stability AI case in the UK recently leaned toward viewing the model as a “stochastic synthesis,” so OpenAI can easily point to a discrepancy with emerging international precedent. Nonetheless, the fact that a collecting society advocating comprehensive licensing models has achieved such a clean victory, even at first instance, is a major development. The notion that data on the web is free training material is facing an increasingly skeptical reception, at least within the EU.

Regarding jurisdiction, the court consistently applied German and EU law based on the finding that the infringing act or its results occurred in Germany. While justifying jurisdiction for the “making available” infringement based on the “place of harm” (where the output is received) is plausible, the “reproduction” infringement is more complex. The actual act of reproduction during training by OpenAI likely occurred entirely in the US, which may become a point of contention on appeal. According to the judgment text, the court found that OpenAI “provides” the model on servers within Germany and the EEA, and the Munich court likely used this–the storage of the model (the infringing object) or its results within the EU–as the basis for its jurisdiction. However, the legal evaluation of training-stage reproduction that occurred exclusively in the US remains a potential issue for the future.

However, if this interpretation is generalized, non-EU AI providers will face constant risk when offering services within the EU. This is not as simple as blocking a website; if this logic holds, it could escalate to blocking the models themselves or their outputs from entering Europe. Conversely, this could amplify the sentiment among EU citizens that their data is being unfairly exploited by non-EU AI companies, potentially becoming a future political flashpoint.

To return from that digression, I personally feel this Munich ruling may be slightly too aggressive. The issues of jurisdiction, the interpretation of the TDM exception, and the “memorization as reproduction” theory will almost certainly be re-litigated on appeal or, eventually, before the CJEU. The act of reproduction (training) was technically completed in the US; the implications of “bringing” that model into Europe may ultimately call for political judgment, not just judicial, and could trigger moves beyond the judiciary. If this escalates, we might see the logic of strong regulations like the GDPR or DSA extend into the AI domain.

Looking at the case specifics, GEMA strategically selected nine famous German songs, used simple prompts to generate outputs, and secured an infringement finding based on outputs as short as 15 or 25 words. Selecting these short but highly creative excerpts was clearly a key factor in GEMA’s victory. This technique–using simple prompts to generate “dead copies” of highly original parts of a work and using that as evidence–is a strategy that turns the lack of a strong discovery system into an advantage. It may well inspire similar lawsuits in other jurisdictions. Ultimately, this case was decided because verbatim (or near-verbatim) copies of the lyrics appeared in the output, from which all other infringements were inferred. To use Japanese legal terminology, this would likely be deemed a case that does not fall under “non-enjoyment use” (非享受利用) and is therefore outside the exception for information analysis (Article 30-4 of the Copyright Act). Although legal approaches differ, the focus on the model’s output and its purpose seems to be a common thread emerging globally. In that sense, one might say the “non-enjoyment purpose” concept in Japanese copyright law was, in fact, ahead of its time.


