Three years ago, OSI began the process of creating the Open Source AI Definition (OSAID). I was involved in this long journey from the beginning as a volunteer. At the time, my strongest concern was simply that we must not undermine the values of the original Open Source community, and I did not think OSAID itself would come to carry any major significance of its own. But now I can say this clearly: Open Source AI as defined by OSAID brings “AI sovereignty” to people and nations around the world.

By AI sovereignty here, I mean first and foremost that people around the world must continue to hold agency over their own future. The sovereignty of states and organizations is derivative of that, and where the two come into conflict, the freedom of people should take precedence. And what truly matters in the coming age of AI is not simply that highly capable models exist somewhere, but that countries and organizations around the world can use, study, modify, and share them on their own terms. These are the very four freedoms that have long been central to Open Source, and what matters for AI is that they exist in reality, not merely on paper. It is not enough that a model can be run free of charge, or that its weights happen to be disclosed for a time. Only when the terms do not later change, when access is not closed off at the convenience of some particular actor, and when the system can be sustained independently if necessary, can it truly serve as a foundation for society.

This conviction is not an idealistic one. It is directly tied to both economics and security. Any AI whose continued availability depends on the goodwill or business strategy of a particular company is not sovereign, no matter how useful it may be. In such a situation, price, functionality, and terms of use are all ultimately subject to someone else’s judgment. Open Source AI, by contrast, leaves society with the minimum conditions under which any country or organization, if it commits the necessary investment of capital, time, and effort, can reconstruct a model with equivalent capabilities. Put differently, no matter how thoroughly massive frontier models come to dominate the market, Open Source AI preserves within society a reproducible foundation that remains within reach of all. That is the essence of a minimum level of digital sovereignty, and the core of “AI sovereignty.”

True Open Source AI also carries enormous economic significance. Its value lies not merely in lowering adoption costs. It lowers barriers to research and development, promotes competition, strengthens bargaining power in procurement, and makes it easier to avoid permanent dependence on a particular vendor. A strong industrial foundation does not mean monopolizing the frontier at all times. It means having a minimum footing that can be rebuilt when necessary. The expansion of Open Source AI means that countries and organizations alike can secure that footing, and that in turn supports healthy competition and diversity across the world.

The opposite of this is what is often called openwashing. Calling a system “open” when it cannot in fact be studied, reproduced, modified, or sustained, however open it may appear at first glance, is not merely a misuse of language. It misleads policymakers, distorts corporate procurement, and even sends research investment in the wrong direction. If society comes to believe that such systems are Open Source AI, the self-sustaining foundations that could have been secured will wither, and dependence on a handful of dominant firms will only deepen. Openwashing is dangerous because it pretends to grant freedom while in reality taking sovereignty away.

To be sure, one cannot deny that openness itself carries risks. There is an argument that releasing highly capable models expands the scope for misuse and creates national security concerns, and that point cannot simply be ignored. But if those risks are used as a reason to close off the reproducible foundation itself, the price society pays is even greater. Permanent dependence on particular providers, uncritical trust in systems that cannot be audited, and the continued outsourcing of one’s own procurement and research to the judgment of others are forms of lost sovereignty no less serious than the risk of misuse. The real question is not open versus closed. It is how to build institutions that address misuse while preserving openness.

This is where OSAID matters. OSAID is an attempt to distinguish between AI that merely appears to be open and AI that can actually be reproduced, studied, modified, and shared. It is not enough merely to release model weights. What matters is whether others, if they commit the necessary resources, can understand the system, verify it, and, if needed, rebuild it for themselves. The line OSAID draws is a minimum threshold, below which reconstructing an equivalent model becomes impossible in the first place. Openwashing fails to meet even that minimum line, which is why it poses a question of a different order from debates over whether full disclosure of training data should be required. The former denies the very possibility of AI sovereignty. The latter is only a debate about the best solution once that possibility has been preserved.

Of course, in reality it is not always possible to share all training data as-is. But if the response is to move toward legally prohibiting the sharing of training data across the board, reproducibility will quickly be monopolized by a small number of very large firms. What is needed is not to cut third parties off from the data entirely, but at the very least to give them richer information about it, so that they retain some ability to audit, compare, and attempt approximate reproduction. Without knowing the sources of the data, its composition, collection policies, filtering, and exclusion criteria, meaningful research, verification, and responsible use become impossible. Even where the training data itself cannot readily be shared, detailed information about that data remains a critical foundation for reproducibility and accountability.

Open Source AI is AI that, from the outset and as far as possible, puts in place the conditions for reproducing the model no matter what happens. For that reason, it brings AI sovereignty to people and to nations around the world.

What the world now needs is not merely powerful AI. It is a minimum AI foundation that every country, every organization, and ultimately humanity as a whole can retain and rebuild when necessary. At a time when frontier technology is rapidly concentrating in a small number of organizations, the value of a reproducible foundation that remains on the side of society is only increasing. Open Source AI is the way of thinking that protects that foundation, and OSAID is the minimum line that defines it. This is why the world needs Open Source AI. It is not merely for the freedom of developers. It is so that people and society, along with the states and organizations that support them, can continue to hold a minimum degree of agency over their own future.

References

Open Source AI Definition: https://opensource.org/ai/open-source-ai-definition

About this article

This document is a restructured version of a short memo I wrote for a global political organization. Because it was written for a specific purpose, it may present a somewhat biased perspective, but I believe it effectively illustrates one of the benefits of Open Source AI. The original text is in Japanese, and the English version is its translation.
