IP Helpdesk
News blog | 22 December 2023 | European Innovation Council and SMEs Executive Agency | 4 min read

OpenAI Partners With News Publisher Axel Springer - Meta Faces Claims by Authors For Infringing Copyright

News

OpenAI in a new partnership with news publisher Axel Springer

On 13 December, OpenAI and Axel Springer, the renowned global news publisher, announced a strategic partnership. The agreement aims to provide users with exclusive summaries of Axel Springer content in response to their queries via ChatGPT, setting a precedent for the relationship between artificial intelligence and journalism.

Axel Springer SE, a media and technology company with a presence in more than 40 countries, has evolved from a traditional print media company into a leading digital content provider in Europe. Currently headquartered in Berlin, the company employs more than 18,000 people worldwide. It also owns well-known brands such as Bild, Welt, Insider and Politico, as well as its classified advertising portals StepStone Group and AVIV Group.

The agreement between Axel Springer and OpenAI comes at a time when news publishers are considering legal action against technology companies that they claim are infringing their copyrights by using their content without permission to train large language models. This scenario has led publishers to sue for copyright infringement, seeking compensation for the use of their content to train AI models.

This agreement therefore offers one solution to the issue. In particular, summaries will be available in ChatGPT as soon as the original articles are published. As OpenAI points out, this synchronisation will ensure that the latest news is an integral part of the user experience. The content is expected to be integrated in the first quarter of 2024, heralding a new era of AI-powered news delivery.

As part of the agreement, content from Axel Springer's brands will receive a "favourable position" in ChatGPT's search results. This will make the content more accessible and visible to users, helping to strengthen the digital presence and appeal of the brands involved.

OpenAI will financially compensate Axel Springer for the use of its content to train the large language models that power ChatGPT. This financial commitment covers both Axel Springer's current and archived material. The financial details of the agreement are being kept confidential, but it is understood that it will last for several years and will not be exclusive.

Some information professionals have reservations about adopting generative AI technology, owing to its propensity to generate inaccurate information and the difficulty of distinguishing between human-generated and AI-generated content. Even so, the deal with Axel Springer is the second between OpenAI and a well-known news publisher. In July, OpenAI reached a similar deal with the Associated Press, in which AP licensed part of its news archive to OpenAI in exchange for access to OpenAI's technology and product expertise. It should be noted that the AP agreement, unlike the one with Axel Springer, is not aimed at the mere display of content.

These developments point to closer collaboration between artificial intelligence and the media, based on a strong intellectual property framework that benefits both tech companies and news publishers by promoting an ethical and equitable exchange of information.

 

Meta faces claims for training its AI with copyrighted books

The legal battle between Meta Platforms, the parent company of Facebook and Instagram, and well-known authors such as Sarah Silverman and Michael Chabon continues: two copyright infringement lawsuits have been filed, both alleging that Meta trained its artificial intelligence models on the authors' works without their permission.

To put this in context, Meta launched LLaMA 1, its large language model, in February 2023 and openly shared the list of datasets used for its training, including the prominent "Books3 section of ThePile", a public dataset containing nearly 200,000 books. Then, in the summer, it released Llama 2. Unlike what it had done for its predecessor, Meta chose not to disclose the training data associated with Llama 2, raising questions and speculation in the industry.

On 11 December, a lawsuit was filed, supported by chat logs of a Meta researcher discussing the acquisition of the dataset on a Discord server. The lawsuit highlights Meta's decision to use an online resource to feed its language models, despite previous warnings from its legal team.

Messages between Meta AI researchers, quoted by the authors, suggest that there may be a "strong case for free use" of the dataset, but that Meta's lawyers recommended that it not be used. The authors therefore argue that Meta ignored internal warnings and recommendations not to use Books3 and chose to include the dataset in the training of LLaMA 1.

The lawsuit also highlights the lack of transparency in the training process for Llama 2. Although the company justifies the non-disclosure of the training datasets on competitive grounds, the authors suggest that this explanation could be a pretext to avoid scrutiny by those whose copyrighted works were copied and incorporated into the training process for Llama 2.

In a broader context, the case between Meta and the authors highlights an emerging trend of lawsuits against technology companies for the unauthorised use of copyrighted works in the training of generative AI models, a trend that has been covered repeatedly in our past news posts.

Details

Publication date
22 December 2023
Author
European Innovation Council and SMEs Executive Agency