Sarah Silverman is one of a group of authors suing OpenAI and Meta. Bruce Glikas/FilmMagic
Meta, led by Mark Zuckerberg, and OpenAI, led by Sam Altman, have a new headache on their hands this week: lawsuits.
A handful of writers, including comedian Sarah Silverman, have filed class-action complaints against the two companies for “remixing the written works of thousands of authors, and many others, without consent, compensation, or credit.”
The plaintiffs, who include authors Paul Tremblay, Mona Awad, Christopher Golden and others, are represented by Joseph Saveri and Matthew Butterick, who say they stand on behalf of the authors to continue “a vital conversation about how artificial intelligence can coexist with human culture and creativity.”
In the lawsuit, seen by Fortune, the exhibits show a prompt given to a large language model (such as OpenAI’s ChatGPT) asking it to explain the plot of Tremblay’s book The Cabin at the End of the World.
The model gives an initial 420-word response covering the “early” parts of the novel. Three more prompts, asking about the next part of the book and its ending, yield answers totaling more than 1,100 words, including the twist revealed at the end of the novel.
Tests on Awad’s Bunny and 13 Ways of Looking at a Fat Girl had similar results, the exhibits claim.
Pointing to the extensive answers the models provided, the plaintiffs claim their work was fed to large language models as training data and reproduced “without consent, without credit, and without compensation.”
Meta and OpenAI did not immediately respond when contacted by Fortune for comment.
Why books?
Silverman’s lawsuit, filed on Friday together with authors Christopher Golden and Richard Kadrey, comes after an initial filing on June 28 from a group of other authors.
Lawyers for the plaintiffs claim that the authors have been in constant contact with them since March 2023 when OpenAI’s ChatGPT was first made available to the public.
The case alleges that books are a particularly valuable training set for LLMs.
The lawyers cite a study from MIT, Cornell University, and Google Research, published in May, which found that the best domains for training data include “high quality” sources such as books as well as data from the web.
Although books were found to contain significant amounts of toxic content, the study also found that as a data set they provided the “longest and most readable” high-quality text.
The suit claims that this valuable data was obtained from “blatantly illegal” sources known as “shadow libraries.” These are online databases that collect books and articles, bypassing hurdles such as paywalls and payments for downloading copies.
The first lawsuit of many?
Experts expect the latest lawsuit to be one of many. Daniel Gervais, a professor of law at Vanderbilt University, said early last week that an avalanche of lawsuits from creators is imminent.
“This [the authors’ lawsuit] …,” Gervais said, addressing the lawsuit’s allegations about AI data collection and training. “The production wave is also coming.”
Discontent in the literary world follows upheaval in the entertainment sector, with Hollywood writers currently on strike over concerns that artificial intelligence will undermine their profession by mimicking existing scripts.
It’s an issue that Sam Altman, at least, understands needs to be addressed.
Speaking at a Senate subcommittee hearing on May 16, Altman said: “We believe that creators deserve control over how their creativity is used, and what happens kind of beyond the point of releasing it into the world.”
“We need to discover new ways with this new technology that creators can win, be successful, and have a vibrant life, and I’m optimistic this will advance that.”
The authors’ lawsuit seeks unspecified monetary damages—one avenue Altman says he’s been discussing with artists and musicians to “find out what people want.” “There are a lot of different opinions, unfortunately,” he added.