How does the fair use doctrine apply to AI training in the US, and how is that different from Europe’s text and data mining (TDM) exception? This is a question we often get, particularly from those who work in multinational organizations.
In the United States, rightsholders have more than 90 cases pending in the courts against AI developers. Most of them concern the use of protected content to train AI systems. Decisions are still at a very early stage, but the key takeaway from the cases as a whole is that they all take different approaches. Each case has presented different facts and different evidence, so the courts are conducting a case-by-case analysis and making determinations based on the evidence before them.
Courts are probing purpose and market harm. They are not granting blanket permission. We expect a very prolonged period of uncertainty. (And my personal opinion is that we will probably never get the certainty that businesses need.)
We are starting to see some settlements, including the largest copyright settlement ever: the Bartz v. Anthropic case. It’s a settlement of $1.5 billion for using pirated works to create a library of content. And we’re also seeing other cases settle, particularly in the music space.
Now in the EU, it’s a different approach. There is no fair use in the EU. Instead, EU law provides exceptions for text and data mining, and there are two different ones that we’re dealing with here:
- If you’re using TDM for research purposes, you are a qualifying institution (such as a research organization or a cultural heritage institution), and you have lawful access to the works, then the exception allows you to do TDM without getting additional authorizations.
- If your TDM is for commercial purposes, you must first check whether the rightsholder has reserved their rights (sometimes referred to as “opting out”). If the rightsholder has reserved their rights, you cannot use those works for TDM purposes in a commercial setting.
The bottom line? Between the uncertainty around fair use in the United States, the different approach in the EU, and the many reservations of rights, what we’re seeing is a move towards licensing works for training and other uses in AI systems.
