OpenAI Wants to See All of NYT Reporters’ Notes and Memos

July 18, 2024

The New York Times’s copyright lawsuit against OpenAI is getting ugly. As part of the discovery process, OpenAI wants the Times to turn over its reporters’ “notes, interview memos, records of materials cited, or other ‘files’ for each asserted work.”

The Grey Lady sued OpenAI and Microsoft in 2023 over copyright infringement. ChatGPT, OpenAI’s premier product, is a plagiarism machine. The large language model produces material by gobbling up every piece of written work it can, remixing it, and then cobbling together something that resembles the originals it devoured.

This hasn’t sat well with journalists, and the Times sued late last year. The legal battle has been going on ever since. According to court records, OpenAI has argued that ChatGPT is only reproducing Times articles because the Times is “prompt hacking” ChatGPT. Basically, the tech giant is claiming the Times has tricked the model into plagiarizing.

OpenAI has also publicly said that its LLM is part of the future of news. “Our goals are to support a healthy news ecosystem, be a good partner, and create mutually beneficial opportunities,” it said in a blog post defending itself from the lawsuit.

As spotted by Bloomberg Law, OpenAI is now asking for the Times to produce every scrap of information that went into the production of the articles that the Times is alleging have been stolen. Court records filed on July 1 tell the story. OpenAI’s argument seems to be that copyright claims are only valid when the works are original to the author. “In other words, the Times cannot pursue a claim for infringement over any part of a copyrighted work that is not original to the Times, as would be the case if the Times copied another’s work or elements in the public domain,” OpenAI argued in the court records.

“Accordingly, the Court should order the Times to produce documents sufficient to show what portions of the asserted works are original to the Times and what are not,” it said. “OpenAI seeks precisely these documents through, which requests ‘documents sufficient to show each and every written work that informed the preparation of each of Your Asserted Works, regardless of its length, format, or medium.’”

OpenAI said it would be satisfied with all the notes, interview memos, and records of material cited related to around 10 million stories. It’s an onerous ask, and one probably designed to delay the trial and drain the Times of its resources. The record-keeping practices of journalists are varied. Some keep meticulous and detailed notes; others have mountains of half-filled and coffee-stained notebooks buried somewhere in their closets.

Finding all the stuff, organizing it, and cataloging it would take years. That might be the point. OpenAI is backed by Microsoft, a multibillion-dollar corporation. The Times is wealthy, but it’s not Silicon Valley wealthy.

It is undeniable that LLMs like ChatGPT hoover up the work of journalists and repurpose it. It’s up to individual news agencies to decide how they’ll deal with that. A depressing number of them have bent the knee and struck a deal with OpenAI and others. The Associated Press, Axel Springer, and The Atlantic have all made agreements with OpenAI.

Perhaps each is hoping they’ll be the last creature devoured in the lion’s cage.
