Salesforce (NYSE:CRM) is facing a proposed class action lawsuit by two authors who allege that the company used thousands of books without permission to train its AI software.
Authors Molly Tanzer and Jennifer Gilmore said in the complaint that Salesforce infringed their copyrights by using their books without permission to train its xGen series of large language models, or LLMs, Reuters reported.
“The training dataset for these models—described as ‘legally compliant’ by Salesforce—consists of the notorious RedPajama and The Pile datasets that contain copies of these unlawfully-obtained copyrighted books,” said the lawsuit. “Both RedPajama and The Pile contain the Books3 corpus, which contains hundreds of thousands of copyrighted books that were acquired without the authorization or consent of the authors.”
Salesforce did not immediately respond to a request for comment from Seeking Alpha.
“It’s important that companies that use copyrighted material for … AI products are transparent,” said attorney Joseph Saveri, who represents the authors and has brought similar lawsuits on behalf of copyright owners against tech companies. “It’s also only fair that our clients are fairly compensated when this happens.”
The lawsuit noted that Salesforce CEO Marc Benioff has previously criticized AI companies for using “stolen” training data to build their models and said that paying content creators for their work would be “very easy to do,” the report added.
“Benioff is right—technology companies like Benioff’s own Salesforce that use the intellectual property of copyright holders like Plaintiffs and Class members should fairly compensate them,” said the complaint.
Authors and news outlets have filed several lawsuits against tech companies, including OpenAI (OPENAI), Microsoft (MSFT), and Meta Platforms (META), for allegedly misusing their content to train AI models.
Last month, Anthropic settled a civil suit and agreed to pay authors $1.5B in a landmark copyright case involving the training of AI models on pirated copies of books.