The Communications and Digital Committee said Friday that the government should introduce new laws to bring a definitive end to tech firms using copyrighted works without permission while developing artificial intelligence. It said the government "cannot sit on its hands" on the issue — a call licensing organizations have since echoed.
"The point of copyright is to reward creators for their efforts, prevent others from using works without permission and incentivize innovation," the committee said. "The current legal framework is failing to ensure these outcomes occur and the government has a duty to act. It cannot sit on its hands for the next decade until sufficient case law has emerged."
The committee — which focuses on the media, digital and creative industries — said large language models are a key source of potential infringement. These models, trained using vast swaths of information, learn the relationship between different sets of data to predict sequences, the report said. Chatbots such as OpenAI's ChatGPT are one key use of the tech.
But the government "could not articulate its current legal understanding" of the potential for copyright infringement where tech companies develop their models using copyrighted works, according to the report.
The committee said tech companies should pay copyright owners for the vast amounts of data that they use to develop these models.
It said the government should offer a way for rights holders to check for copyright breaches, while also calling for investment into new "curated" datasets that will be commercially attractive to tech firms.
Some content aggregators already run businesses offering access to "trillions of words," the committee said.
"Large language models rely on ingesting massive datasets to work properly, but that does not mean they should be able to use any material they can find without permission or paying rights holders for the privilege," committee Chair Baroness Stowell of Beeston said. "This is an issue the government can get a grip of quickly and it should do so."
The Authors' Licensing and Collecting Society, or ALCS, welcomed the Lords' report Friday. The group, which represents writers' copyright interests, said the current situation is "profoundly unfair" to copyright holders.
"ALCS stands for the principle of fair payment for use, and that creators should be offered a choice in terms of what happens to their works," Barbara Hayes, the group's CEO, said. "It is encouraging to see those ideals echoed by this report."
The Design and Artists Copyright Society, or DACS, is an organization representing visual artists' rights. It also welcomed the "timely" report Friday, saying the need to reward creators for their efforts is "paramount."
"The government has a duty to act and cannot simply refer rights holders to lengthy and costly court processes," the organization said.
"This government is committed to supporting the AI and creative industries sectors so that they continue to flourish and are able to compete internationally," a Department for Science, Innovation and Technology spokesperson told Law360 on Tuesday.
"We recognize this is a contested area, which is why we will continue to engage with AI and creative sectors on a shared approach that allows them both to grow. We will set out further proposals on the way forward soon."
On its understanding of the current legal position on copyright infringement within large language models, the government said that "there are live cases that are before the courts," adding that "it would not be appropriate" for it to comment on these.
The U.S. Copyright Office said Jan. 25 that it is mulling licensing options for creators seeking compensation. It said the search for a licensing model is "one of the major issues the office is looking to address," suggesting that a compulsory licensing arrangement — like those used in the music industry — is one option.
In October, a class of authors filed a copyright infringement complaint in a New York federal court against Meta, Microsoft, Bloomberg and artificial intelligence research institute EleutherAI. The claim alleges that the group of companies trained their AI tools on datasets containing around 183,000 pirated e-books.
--Additional reporting by Andrew Karpan and Kelly Lienhard. Editing by Stephen Berg.
Update: This story has been updated with comment from the government.
For a reprint of this article, please contact reprints@law360.com.