
Unauthorized use of literary work for big-tech AI training


Posted August 22, 2023 08:45

Updated August 22, 2023 08:45


Stephen King, Haruki Murakami, Zadie Smith, and Michael Pollan are among the authors whose works are included in the training data for LLaMA (Large Language Model Meta AI), the basis of generative AI developed by Meta, the parent company of Facebook. The Atlantic reported on Saturday (local time) that these authors' works had been used without permission. Some U.S. authors had earlier filed suit against OpenAI, alleging that their books had been used to train ChatGPT without consent. The latest report confirms that massive volumes of copyrighted content have been used without authorization.

The Atlantic's analysis of Books3, a dataset used to train LLaMA, showed that it includes more than 170,000 books published over the last 20 years, among them 30,000 titles from Penguin Random House, 14,000 from HarperCollins, 7,000 from Macmillan, and 1,800 from Oxford University Press. A third were fiction, and the rest were non-fiction. “The future promised by AI is written in stolen words,” The Atlantic reported.

Books3 was also used to train ChatGPT and BloombergGPT, a generative AI service launched by Bloomberg. The Atlantic explained that access to Books3, which had circulated widely in AI communities, has become limited since class-action lawsuits against OpenAI began in June this year. Big tech companies, however, claim that generative AI does not simply copy the books it is trained on but creates new work.


71wook@donga.com