The draft EU Artificial Intelligence Act was approved by a European Parliament committee vote on May 11. With the sudden rise and popularity of generative AI tools like ChatGPT, DALL-E, Google Bard, and Stable Diffusion, it is not surprising that EU lawmakers amended the text in recent weeks to address these technologies.
Most importantly, the changes add new transparency and disclosure requirements for large language models, classified in the Act as "general-purpose AI systems," which use large datasets and machine learning to understand and generate content.
Generative AI technologies like ChatGPT and DALL-E were trained on datasets scraped from the publicly accessible internet. As a result, artists and creators have raised concerns that large language models "steal" original work and that the companies behind those models do not compensate them for it.
The rule aims to protect artists and copyright holders, but it could inadvertently harm American AI companies that do business in Europe. American AI companies that want to offer services to EU residents should therefore take note and look into more careful, deliberate ways of collecting data.
Article 28b(4)(c) of the EU AI Act says that providers of generative AI systems must "document and make publicly available a sufficiently detailed summary of the use of training data protected under copyright law, without prejudice to national or Union copyright legislation." The trouble is that tracing the provenance of each piece of training data, or linking a given output to a single source image or text, is effectively impossible. GPT-3, for example, was trained on roughly 45 terabytes of text data; with datasets that large and varied, locating specific data segments is impractical.
In a September 2022 interview with Forbes, David Holz, the founder of MidJourney, said, "There isn't a way to get a hundred million images and know where they're coming from," adding, "There's no list." AI companies in the US and elsewhere could be in trouble if this rule takes effect: non-compliance carries fines of up to 30 million euros or 6 per cent of worldwide annual turnover, whichever is higher. Because of this, AI developers need to develop ways to record their training data, along the lines sketched below.
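The Act does not spell out what such record-keeping should look like. As a rough, hypothetical illustration only (the file layout, field names, and helper function here are assumptions, not anything prescribed by the Act or used by any named company), a provider could maintain a simple provenance manifest that logs the source, licence, and copyright status of each dataset as it is ingested:

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical provenance manifest: one JSON record per ingested dataset.
# Field names are illustrative only, not drawn from the AI Act or any standard.
MANIFEST = Path("training_data_manifest.jsonl")

def record_dataset(name: str, source_url: str, licence: str,
                   copyright_status: str, local_path: Path) -> dict:
    """Append a provenance record for one dataset file to the manifest."""
    digest = hashlib.sha256(local_path.read_bytes()).hexdigest()
    entry = {
        "dataset": name,
        "source_url": source_url,
        "licence": licence,                    # e.g. "CC-BY-4.0", "proprietary"
        "copyright_status": copyright_status,  # e.g. "protected", "public domain"
        "sha256": digest,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
    with MANIFEST.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

# Example usage with placeholder values:
# record_dataset("news-corpus-2023", "https://example.com/corpus",
#                "CC-BY-4.0", "protected", Path("data/news.txt"))
```

Whether a log like this would satisfy a "sufficiently detailed summary" is exactly the open question discussed next; the point is only that some form of systematic provenance tracking would likely be needed.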
At first glance, this part of the Act suggests that original creators and rights holders might be able to secure fair compensation for their work. But because the provision is written in general terms, it is hard to tell how specific companies will have to be in their summaries, and therefore how creators will know whether their work appears in a training dataset. This could invite frivolous lawsuits, especially if companies attempting to comply disclose too much in their "detailed summary of the use of training data." As Riede, Pratt, and Hofer note, "It is hard to know what a 'sufficiently detailed summary of the use of training data' is and how often it needs to be updated." This part of the Act needs more detail to prevent such outcomes.
The differences between EU and US copyright law are also likely to cause confusion and inconsistency among companies trying to comply. The EU has no copyright registry, and anyone who "creates literary, scientific, or artistic work…automatically has copyright protection, which starts the moment they create their work." Companies building large language models should therefore be careful about using content from EU creators. In the US, by contrast, there is a formal copyright registration process, but not every work is eligible. Notably, the US Copyright Office stated on March 16, 2023, that works generated by AI do not qualify for copyright protection.
This distinction illustrates how difficult it is to determine which kinds of content are protected by copyright law.
Several lawsuits have already been filed. Last November, OpenAI, Microsoft, and GitHub were sued for copyright infringement, and copyright claims have also been brought against AI art tools such as Stable Diffusion and MidJourney. If the EU AI Act takes effect, these companies could face more frequent lawsuits and far larger fines.
As the EU AI Act moves toward taking effect, American AI companies must examine carefully what it means for them. It remains unclear whether the Act's transparency standards will spur innovations that make compliance easier. Businesses in the US and the broader Western market would prefer a less rigid approach to regulating generative AI models. The differences between EU and US copyright law already make it difficult for companies to operate across both jurisdictions, and the new amendments to the AI Act compound that difficulty.
Forthcoming EU rules could reshape how AI companies do business, and they will need to prepare for it.