The New York Times Sues OpenAI and Microsoft for Copyright Infringement in AI Model Development

The New York Times Company (“The Times”) has initiated legal proceedings against Microsoft Corporation and various OpenAI entities, accusing them of illegally using its copyright-protected material. According to the lawsuit, this material was used in the development of AI models like ChatGPT and Bing Chat, which allegedly violates The Times’s copyrights. The case underscores the potential negative effects of AI on traditional media, including potential revenue losses and a decline in reader trust.

The Times, a recipient of 135 Pulitzer Prizes and publisher of over 250 original articles daily, emphasizes the resource-intensive nature of crucial news reporting. It generates significant revenue through content licensing and also grants some free licenses for academic and non-profit use. The lawsuit alleges that The Times’s articles were used to train ChatGPT and that, once trained, the system presents those articles to users, either in whole or in summary form, and imitates their style. The complaint includes examples of alleged infringement in which ChatGPT’s output differs from the original article only by small alterations.

Data Training

The first step in training an AI model is to gather a large dataset. This data can be anything from text, images, and sounds to more complex data like user interactions or sensor readings. The quality and quantity of this data significantly impact the model’s performance. According to the lawsuit, earlier versions of ChatGPT relied on substantial content from The Times (e.g., one training dataset contained 333,160 entries pointing to The Times). Later versions diversified their data sources (mostly social media posts and comments), yet The Times remained a critical source of reliable information. These entries were used for training without The Times’s permission. One of the sources used is a large collection of online material called “Common Crawl,” which, the suit alleges, contains 16 million unique records from sites published by The Times. Some of the articles were copied in full.

According to the lawsuit, Microsoft developed specialized systems to replicate the content of The Times for AI models. To train the GPT models, Microsoft and OpenAI worked together to create a complex, customized supercomputing system that could store and replicate copies of the training dataset, including The Times’s content. The Times’s articles were allegedly copied and ingested multiple times in the course of training the GPT models.

The defendants publicly defend their actions as fair use, arguing that the use of copyrighted material in AI training serves a transformative purpose (the AI-generated output has a different character than the input). However, The Times argues this is not transformative, as it involves creating competing products without compensation.

(Non)Profit

A competing product? OpenAI initially presented itself as an altruistic, non-profit organization, but this changed in 2019 when a for-profit affiliate was established. After that transition, OpenAI stopped releasing its models as open source, starting with GPT-3 in 2020, and has kept the design and training details of subsequent models secret. As of August 2023, OpenAI was on pace to generate more than USD 1 billion in revenue over the next twelve months, and its valuation is now reported to be as high as USD 90 billion. Users might get the same or similar articles from both The Times and ChatGPT, which could lead to market disruption.

Negotiations re Licensing

These opposing positions led to negotiations. The Times, along with numerous other media outlets, began talks with the AI developers on the price and terms for licensing its content. The negotiations focused on a partnership built around the real-time display of The Times’s articles (with attribution) in ChatGPT, through which The Times would gain a new way to connect with existing and new readers, and ChatGPT users would gain access to its reporting. However, the negotiations with The Times have not resulted in an agreement. The Associated Press, by contrast, has reached one.

Claims

The Times contends that the success of OpenAI and Microsoft’s AI models heavily relies on copyright infringement.

“Hallucinations” in ChatGPT’s output are already widely discussed among professionals. The lawsuit addresses this issue as well, alleging that false attributions of such output to The Times cause it commercial harm.

The lawsuit raises various legal claims, including vicarious and contributory copyright infringement and trademark dilution, and requests the destruction of all infringing AI models that are based on The Times’s articles.

AI is neither good nor bad in itself, and there are many nuances in how it is created and used. This court case should clarify a number of them and bring greater legal certainty to the field of new technologies.

OpenAI responds

On 8 January 2024, OpenAI published a blog post setting out its position, claiming that:

It collaborates with news organizations and is creating new opportunities;
Training is fair use, but it provides an opt-out because it is the right thing to do;
“Regurgitation” (reproducing articles almost unchanged) is a rare bug that it is working to eliminate;
The New York Times is not telling the whole story; here OpenAI emphasizes the content of the negotiations and its own good-faith actions, such as taking down ChatGPT functionality to fix bugs that affected The Times’s content.

The defendants’ formal response before the court in New York is still pending.

The entire tech and IP world is watching how these interdependent interests will be resolved.

The information in this document does not constitute legal advice on any particular matter and is provided for general informational purposes only.

By Rastko Petakovic, Senior Partner, Goran Radosevic, Partner, and Nikola Kliska, Senior Associate, Karanovic & Partners
