OpenAI, New York Times Debate Discovery Of Reporters’ Notes In AI Copyright Suit

(July 8, 2024, 1:51 PM EDT) -- NEW YORK — In letter briefing, the New York Times Co. (NYT) and OpenAI battle over whether copyright claims linked to the artificial intelligence tool ChatGPT’s alleged reproduction of news articles permit discovery into reporters’ notes and associated materials underlying the stories.

(The New York Times Company v. Microsoft Corporation, et al., No. 23-11195, S.D. N.Y.)

(NYT’s letter available.  Document #46-240710-052B.  OpenAI’s letter available.  Document #46-240710-053B.)

The latest of the briefs was filed July 3.

“Notably, [OpenAI] does not cite a single case in which a news organization has been ordered to produce such materials. [OpenAI] is not entitled to unbounded discovery into nearly 100 years of underlying reporters’ files, on the off chance that such a frolic might conceivably raise a doubt about the validity of The Times’s registered copyrights,” NYT argues.  NYT says its promise to produce the copyrighted works and the evidence of copyright ownership suffices.

NYT filed the suit in December 2023 in the U.S. District Court for the Southern District of New York.  The suit names as defendants Microsoft Corp. and OpenAI Inc., OpenAI LP, OpenAI GP LLC, OpenAI LLC, OpenAI Opco LLC, OpenAI Global LLC, OAI Corp. LLC and OpenAI Holdings LLC (collectively OpenAI).  NYT claims that the defendants unlawfully trained ChatGPT on millions of stories copyrighted by NYT.  ChatGPT is built on a large quantity of sources, but NYT articles were given “particular emphasis” in training the large language model AI, NYT claims.

NYT alleges a count of copyright infringement under Title 17 Section 501 of the U.S. Code, 17 U.S.C. § 501, against all defendants; a count of vicarious copyright infringement against Microsoft, OpenAI Inc., OpenAI GP, OpenAI LP, OAI Corp. LLC, OpenAI Holdings LLC and OpenAI Global LLC; a count of contributory copyright infringement against Microsoft; a count of contributory copyright infringement against all of the defendants; a count of violation of Section 1202 of the Digital Millennium Copyright Act, 17 U.S.C. § 1202, against all defendants; a count of common-law unfair competition by misappropriation against all defendants; and a count of trademark dilution under Title 15 Section 1125(c) of the U.S. Code, 15 U.S.C. § 1125(c), against all defendants.

‘Hack’

In a Feb. 26 motion to dismiss and memorandum in support, OpenAI says ChatGPT “is many things:  a revolutionary technology with the potential to augment human capabilities, fostering our own productivity and efficiency; . . . an accelerator for scientific and medical breakthroughs; . . . a mechanism for making existing technologies accessible to more people; . . . an aid to help the visually impaired navigate the world; . . . a creative tool that can write sonnets, limericks, and haikus; . . . and a computational engine that reasonable estimates posit may add trillions of dollars of growth across the global economy.”

“The truth, which will come out in the course of this case, is that the Times paid someone to hack OpenAI’s products.  It took them tens of thousands of attempts to generate the highly anomalous results that make up Exhibit J to the Complaint.  They were able to do so only by targeting and exploiting a bug (which OpenAI has committed to addressing) by using deceptive prompts that blatantly violate OpenAI’s terms of use.  . . .  And even then, they had to feed the tool portions of the very articles they sought to elicit verbatim passages of, virtually all of which already appear on multiple public websites.  Normal people do not use OpenAI’s products in this way,” OpenAI says.

On March 4, Microsoft filed its own motion seeking partial dismissal, warning against doomsday predictions about technological innovation.  NYT filed its opposition on March 11.

Discovery Issues

In a letter to the court filed May 23, OpenAI asks for an informal discovery conference, complaining that NYT was not willing to commit to a deadline for responding to OpenAI’s request for production.  OpenAI says discovery was expedited on NYT’s own motion.  Also pending are motions to consolidate the case with one brought by other news publishers.

In a letter filed May 28, NYT proposes a July 31 deadline.  OpenAI’s citation to its own production deadlines ignores the fact that many of the 15 requests to which it was required to respond overlapped with previous requests.  OpenAI now asks that NYT respond to 61 requests in less time than OpenAI took to respond to 15 overlapping requests.  As a result, “OpenAI is comparing apples to oranges,” NYT says.

“From the beginning, The Times has pushed to proceed as efficiently as possible, serving document requests on February 23, 2024 — the first day on which discovery could be served.  The Times invited Defendants to do the same, but they did not.  Defendants instead threatened to stay discovery pending their motions to dismiss.  While OpenAI eventually reversed course, its delay in serving document requests is the sole reason why it cannot take advantage of the June 14 substantial completion deadline.  Had OpenAI served discovery sooner, that deadline would apply to The Times’s initial productions as well,” NYT says.

With all that still pending, OpenAI filed a letter on July 1 again requesting an informal discovery conference over NYT’s refusal to produce “critical discovery” about the creation and ownership of the articles at the heart of the case.  “Discovery into those copyrighted works is directly relevant both to the Times’s claim of copyright infringement and to OpenAI’s defenses (such as fair use, which looks at, inter alia, various aspects of the works at issue),” OpenAI says.

NYT cannot claim copyright over anything it copied from another entity or any works in the public domain, OpenAI says.  OpenAI urges the court to order NYT to produce documents establishing which of its works are entitled to copyright protection and which are not.  By claiming that it invests significant time and money in its investigative reporting, NYT has made the issue of how the works were created a part of the case, and its boilerplate objections to the request for discovery are meritless, OpenAI says.

NYT claims that the request is overly broad, but its argument tells the court nothing, OpenAI says.  Such a lack of detail does not warrant refusal to produce documents.  Regardless, it has been made clear to NYT that OpenAI is seeking reporters’ notes, interview memos and related records for each work over which NYT alleges a copyright.  To the extent that such material would be protected, NYT has agreed to inform OpenAI.  But NYT has not made any such assertion so far, OpenAI says.

‘Irrelevant, Improper and Harassing’

In its response, NYT calls OpenAI’s request for confidential reporters’ files “irrelevant, improper and harassing.”

The request for reporters’ notes and related material, allegedly to determine whether copyright applies, is unprecedented and “turns copyright on its head.  OpenAI cites no caselaw permitting such invasive discovery, and for good reason.  It is far outside the scope of what’s allowed under the Federal Rules and serves no purpose other than harassment and retaliation for The Times’s decision to file this lawsuit.”

Individual story-by-story newsgathering has no impact on the copyright protections covering the millions of copyright-registered news articles, NYT says.  Nothing in copyright law would permit OpenAI to investigate the creation of the works.  Even in the unlikely event that access to a reporter’s notes could be used to show that 90% of a copyrighted article was quotation, it would not change the fact that the resulting article would be covered by copyright, and NYT could sue OpenAI for its apparent copying of “entire works verbatim,” NYT says.

NYT says it has no obligation to turn over any communications with the U.S. Copyright Office.  Evidence of copyright suffices unless there is some underlying issue with the issuance of the copyright, it says.

Counsel

NYT is represented by Ian Crosby of Susman Godfrey LLP in Seattle; Davida Brook and Ellie Dupler of the firm’s Los Angeles office; Elisha Barron and Tamar Lusztig of its New York office; and Steven Lieberman, Jennifer B. Maisel and Kristen J. Logan of Rothwell, Figg, Ernst & Manbeck PC in Washington, D.C.

OpenAI is represented by Joseph C. Gratz, Tiffany Cheung, Joyce C. Li and Melody E. Wong of Morrison & Foerster LLP in San Francisco; Allyson R. Bennett, Rose S. Lee and Alexandra M. Ward of Morrison & Foerster in Los Angeles; Andrew Gass and Joseph R. Wetzel of Latham & Watkins LLP in San Francisco; Sarang Vijay Damle of Latham & Watkins in Washington; and Allison L. Stillman of Latham & Watkins in New York.

Microsoft is represented by Annette Hurst of Orrick, Herrington & Sutcliffe LLP in San Francisco and Christopher Cariello and Marc Shapiro of the firm’s New York office.

(Additional documents available:  OpenAI’s letter to the court.  Document #46-240605-029B.  NYT’s letter to the court.  Document #46-240605-030B.  NYT’s opposition to motions.  Document #46-240403-009B.  Microsoft’s memorandum in support of motion to dismiss.  Document #46-240403-010B.  OpenAI’s motion to dismiss.  Document #46-240306-050M.  OpenAI’s memorandum in support of motion to dismiss.  Document #46-240306-049B.  New York Times complaint.  Document #46-240105-033C.)