AI robot pointing at a legal paragraph symbol on a 3D background

Creative artificial intelligence vs. copyright

At the turn of the year 2022/2023, the company OpenAI put artificial intelligence (AI) firmly on the agenda with its ChatGPT service. Artificial intelligence brings both challenges and opportunities, and many have probably heard everything from doomsday prophecies to great promises about the technology's potential.

What significance AI will have for humanity in general is probably not something we at the Norwegian Industrial Property Office (NIPO) can answer. What we can speak to, however, is the significance artificial intelligence will have for intellectual property rights.

Generative AI in particular has drawn attention in the AI wave that has recently washed over the world. Generative AI is artificial intelligence that can produce new, original content, such as text, images and music. Such an AI model manages this because it is exposed to large amounts of data related to the task it was designed for, such as text or image production, and it is programmed to find patterns and relationships in that data.

The model can then apply the patterns and relationships it has learned when generating new content. A service such as ChatGPT, for example, has been trained on huge amounts of text, has learned patterns and relationships in human language from all this text, and can thus express itself in a very human-like way.
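
The idea of learning patterns from text and reusing them can be illustrated with a very small sketch. This is a toy word-pair model written for illustration only; it is not how ChatGPT works, and the function names and the two-sentence corpus are made up for the example. It shows the principle: the program records which words tend to follow which, and then produces new word sequences from those recorded patterns rather than copying a whole sentence.

```python
import random
from collections import defaultdict

def learn_patterns(corpus):
    """Record, for every word, which words were seen directly after it."""
    followers = defaultdict(list)
    for text in corpus:
        words = text.lower().split()
        for current, following in zip(words, words[1:]):
            followers[current].append(following)
    return dict(followers)

def generate(followers, start, max_words=8, seed=0):
    """Build new text by repeatedly picking a word that followed the previous one."""
    rng = random.Random(seed)
    words = [start]
    while len(words) < max_words and words[-1] in followers:
        words.append(rng.choice(followers[words[-1]]))
    return " ".join(words)

# Illustrative training material (invented for this sketch).
corpus = [
    "the author owns the work",
    "the work is protected by copyright",
]
model = learn_patterns(corpus)
generated = generate(model, "the")
print(generated)
```

Even in this toy version, the training step has to read the source texts in full, while the output is assembled from learned word pairs. Real generative models are vastly larger and more sophisticated, but the same two phases, training and generation, are what the legal questions below revolve around.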

Almost all material on which such an AI is trained is protected by copyright, and the material the AI in turn produces is material that has traditionally qualified for copyright protection.

Does training generative AI on copyrighted material infringe copyright?

Training of AI models, which we touched on briefly at the outset, involves something called "text and data mining", often abbreviated to "TDM". This is an umbrella term for the automated process in which large sets of text and data are analyzed to discover patterns and relationships. Once the TDM process is done, the AI model can apply what it has learned to generate new material.

When carrying out TDM, some form of copying of the material being mined will often be an inevitable step in the process. In principle, therefore, such mining infringes copyright. However, the point of text and data mining is not to copy this material, but to extract the ideas and facts behind it. Copyright is not intended to protect ideas and facts. It is therefore perhaps somewhat paradoxical that one infringes copyright when carrying out text and data mining.

It didn't take long after generative AI started making headlines for its use of copyrighted material to generate major debate. On the one side are those who believe authors should be compensated for the use of their works in AI training. On the other side are those who believe that one should be able to perform TDM and AI training more freely on copyrighted material, since the purpose is really only to extract pure ideas and facts, which in themselves are not meant to be protected.

Because TDM in principle infringes copyright but does not have copying as its purpose, the EU has found it necessary to introduce special rules that regulate exactly this. These rules will have an impact on the training of AI on material protected by copyright.

New EU/EEA legislation

An EU directive from 2019 on copyright in the Digital Single Market, often called the Digital Single Market Directive, contains rules on TDM in its Articles 3 and 4. Since TDM techniques are used to train generative AI, these rules will be of great importance when it comes to the legality of training AI on copyrighted material. The directive will shortly be implemented in Norwegian legislation, probably during 2024.

The rules in this EU directive introduce a limitation on copyright, so that text and data mining is legal as a starting point. According to Article 3 of the directive, research institutions will be able to carry out text and data mining freely on all copyright-protected material. Article 4 states that other actors, such as commercial enterprises, can carry out text and data mining on all copyright-protected material, as long as the author(s) have not made an explicit reservation against this.

In a number of cases, authors thus have the opportunity to reserve their work against use in TDM, and therefore also against use in the training of generative AI. A challenge here, however, is that there are currently no commonly agreed guidelines for what such a reservation should look like. Since reservations can, for now, appear in many different forms, it will be a challenge for those training AI models on copyrighted material to detect them. There is thus a risk that an AI is mistakenly trained on material it is not permitted to be trained on. Time will tell whether guidelines for reservations are put in place, and what they will look like.
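
The detection problem can be sketched in a few lines of code. The example below is an illustration under stated assumptions, not a description of any agreed mechanism: it assumes, hypothetically, that a reservation might be expressed either as a robots.txt rule aimed at a crawler called "ExampleAIBot" or as one of a few known phrases in a site's terms of use. Neither signal is defined as a valid reservation by the directive or by this article; the crawler name, the phrases and the example sites are all invented. The robots.txt parsing uses Python's standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical crawler name and hypothetical phrases; no agreed standard defines these.
CRAWLER_NAME = "ExampleAIBot"
KNOWN_PHRASES = ("text and data mining is reserved", "no ai training")

def reserved_in_robots_txt(robots_txt, url):
    """True if the given robots.txt content disallows our crawler for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(CRAWLER_NAME, url)

def reserved_in_terms(terms_text):
    """True if the terms of use contain one of the phrases this sketch knows about."""
    text = terms_text.lower()
    return any(phrase in text for phrase in KNOWN_PHRASES)

def has_detectable_reservation(site):
    """Check the only two signals this sketch understands."""
    return (reserved_in_robots_txt(site["robots_txt"], site["url"])
            or reserved_in_terms(site["terms"]))

# Two invented sites: one uses a machine-readable rule, one uses free wording.
sites = [
    {"url": "https://example.com/a",
     "robots_txt": "User-agent: ExampleAIBot\nDisallow: /",
     "terms": ""},
    {"url": "https://example.org/b",
     "robots_txt": "",
     "terms": "The author does not permit machine learning on this work."},
]
results = [has_detectable_reservation(site) for site in sites]
for site, found in zip(sites, results):
    print(site["url"], "reservation detected:", found)
```

The second site clearly intends to reserve its rights, but because it does so in wording the program does not recognize, the reservation goes undetected. This is the kind of mistake that a commonly agreed format for reservations would help prevent.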

The EU countries have also recently reached a political agreement on the draft of a dedicated AI law (a regulation) called the "Artificial Intelligence Act", or "AI Act". This regulation, which will enter into force in 2025 at the earliest, is intended to be a harmonized framework with common rules and safety measures for the development and use of artificial intelligence in a number of areas.

When it comes to copyright, it appears that the AI Act will state that AI providers must respect rights holders' ability to reserve their works against text and data mining under the 2019 Digital Single Market Directive, and thus also against AI training. It has long been assumed that the TDM rules from 2019 would have an impact on AI training, but the AI Act seems to make this explicit. Furthermore, the AI Act requires AI providers to disclose which copyright-protected material has been used in AI training. In this way, authors will be able to know when their work has been used in AI training, and can then make an informed choice about whether to reserve their work against TDM as part of AI training on later occasions.

We believe that these new rules from the EU can help create a clearer framework when it comes to the relationship between AI training and copyright.

Does anyone get copyright for what generative AI creates?

We have now looked at the training of generative AI and whether this infringes copyright. Another side of the same issue is whether the material created by AI is itself protected by copyright.

Generative AI can create works that typically fall within the categories of works protected by copyright, such as text, music and visual art. But will someone get an exclusive right in the form of copyright to these works - and if so, who would get this exclusive right?

Can the AI itself obtain copyright for what it creates?

Since it is, after all, the AI that creates the work, it is perhaps most natural to ask first whether the AI model itself can obtain copyright for the works it creates. Section 2 of the Norwegian Copyright Act (åndsverkloven) requires an "individual creative effort" for someone to obtain copyright in a work. This requirement suggests that copyright does not apply to works created by machines. Although they are referred to as having "intelligence", AIs are really just advanced computer programs. In Norwegian legal theory, there also seems to be agreement that only people can create intellectual works. One can therefore safely conclude that an AI model cannot itself hold copyright.

Can the person who has developed the AI obtain copyright for what it creates?

Another question is whether the person who developed the AI should be granted copyright. But those who have developed the AI model itself have really only programmed a good algorithm. Beyond that, they have no knowledge of the specific works that the AI model will produce. As mentioned, there must have been an "individual creative effort" for someone to obtain copyright. To say that the developers of a service like ChatGPT have made such an effort in each of the millions of texts that the service generates daily would be stretching it too far.

Does the person who uses an AI to create a work get copyright?

A third question that can be asked in connection with copyright in AI-generated material is whether the user of the AI can obtain copyright. In some cases, just a few keystrokes from the user will generate works. This obviously cannot be said to be an "individual creative effort". But what if the user is extremely specific about what they ask of the AI, is fully aware of the end result they want, and actually uses the AI more as an artistic tool than as the thing that produces the finished result? Can this justify a right? Here there may well be room for doubt in some cases.

Humanoid AI robot paints a composition on a canvas in studio

Can content-generating AI infringe copyright?

Now we have seen that generative AI, after being trained on thousands of already existing works, has learned to generate its own, new works. But what happens if generative AI produces something identical or highly similar to something that already exists and is protected by copyright?

As we have seen, what AI produces is not a result of "cut and paste". Generative AI has learned the underlying patterns found in existing works, and uses these to produce something entirely new. For example, ChatGPT has become so good at capturing the structure of human language that it has learned to generate new sentences all by itself.

If such an AI creates something similar to another intellectual work, but it is a type of work where there are few creative options for arriving at the final result, the AI work may perhaps be regarded as an accidental double creation. A double creation is a work which is identical or very similar to another, but which has been created completely independently of the other. If it is in fact such a double creation, it will not constitute an infringement of copyright.

But if an AI reproduces something that is included as part of the training material, and there has also been a lot of room for choice, it will take quite a bit more to prove that there has not been an infringement of copyright here.

The road ahead – the road is made by walking

One of the biggest challenges in the interaction between legislation and technology is that technology develops much faster than new laws can be adopted. But with both the EU rules on text and data mining from 2019 and the forthcoming AI Act, important steps have now been taken to prepare for the challenges at the intersection of generative AI and copyright. However, much is still unclear, and some of the road will probably have to be made as we go.

This was also the case when the internet became widely available a few decades ago. It raised major copyright questions and challenges, just as the AI explosion has done today. These were resolved both through the application of existing regulations and through special adaptations where there was a need.

For the record, the Ministry of Culture is the ministry responsible for copyright in Norway. But as a competence center for intellectual property rights, we at NIPO naturally follow developments in the relationship between copyright and AI.
