Anthropic Accused of Citing AI ‘Hallucination’ in Song Lyrics Lawsuit

CN
Decrypt
Follow
7 hours ago

An AI expert at Amazon-backed firm Anthropic has been accused of citing a fabricated academic article in a court filing meant to defend the company against claims that it trained its AI model on copyrighted song lyrics without permission.


The filing, submitted by Anthropic data scientist Olivia Chen, was part of the company’s legal response to a $75 million lawsuit filed by Universal Music Group, Concord, ABKCO, and other major publishers.


The publishers alleged in the 2023 lawsuit that Anthropic unlawfully used lyrics from hundreds of songs, including those by Beyoncé, The Rolling Stones, and The Beach Boys, to train its Claude language model.


Chen’s declaration included a citation to an article from The American Statistician, intended to support Anthropic’s argument that Claude only reproduces copyrighted lyrics under rare and specific conditions, according to a Reuters report.


During a hearing Tuesday in San Jose, the plaintiffs’ attorney Matt Oppenheim called the citation a “complete fabrication,” but said he didn’t believe Chen intentionally made it up, only that she likely used Claude itself to generate the source.


Anthropic’s attorney, Sy Damle, told the court Chen’s error appeared to be a mis-citation, not a fabrication, while criticizing the plaintiffs for raising the issue late in the proceedings.


Per Reuters, U.S. Magistrate Judge Susan van Keulen said the issue posed “a very serious and grave” concern, noting that “there’s a world of difference between a missed citation and a hallucination generated by AI.”


She declined a request to immediately question Chen, but ordered Anthropic to formally respond to the allegation by Thursday.


Anthropic did not immediately respond to Decrypt’s request for comment.


Anthropic in court


The lawsuit against Anthropic was filed in October 2023, with the plaintiffs accusing Anthropic’s Claude model of being trained on a massive volume of copyrighted lyrics and reproducing them on demand.


They demanded damages, disclosure of the training set, and the destruction of infringing content.


Anthropic responded in January 2024, denying that its systems were designed to output copyrighted lyrics.


It called any such reproduction a “rare bug” and accused the publishers of offering no evidence that typical users encountered infringing content.


In August 2024, the company was hit with another lawsuit, this time from authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, who accused Anthropic of training Claude on pirated versions of their books.


GenAI and copyright


The case is part of a growing backlash against generative AI companies accused of feeding copyrighted material into training datasets without consent.


OpenAI is facing multiple lawsuits from comedian Sarah Silverman, the Authors Guild, and The New York Times, accusing the company of using copyrighted books and articles to train its GPT models without permission or licenses.


Meta is named in similar suits, with plaintiffs alleging that its LLaMA models were trained on unlicensed literary works sourced from pirated datasets.


Meanwhile, in March, OpenAI and Google urged the Trump administration to ease copyright restrictions around AI training, calling them a barrier to innovation in their formal proposals for the upcoming U.S. “AI Action Plan.”


In the UK, a government bill that would enable artificial intelligence firms to use copyright-protected work without permission hit a roadblock this week, after the House of Lords backed an amendment requiring AI firms to reveal what copyrighted material they have used in their models.


免责声明:本文章仅代表作者个人观点,不代表本平台的立场和观点。本文章仅供信息分享,不构成对任何人的任何投资建议。用户与作者之间的任何争议,与本平台无关。如网页中刊载的文章或图片涉及侵权,请提供相关的权利证明和身份证明发送邮件到support@aicoin.com,本平台相关工作人员将会进行核查。

HTX:注册并领取8400元新人礼
Ad
Share To
APP

X

Telegram

Facebook

Reddit

CopyLink