American Focus
Tech and Science

Meta’s AI memorised books verbatim – that could cost it billions

Last updated: June 10, 2025 2:55 pm

Billions of dollars are at stake as courts in the US and UK decide whether tech companies can legally train their artificial intelligence (AI) models on copyrighted books. Authors and publishers have filed multiple lawsuits over the practice. Now researchers have found that at least one AI model not only used popular books in its training data but also memorized their contents verbatim.

The debate surrounding this issue revolves around whether AI developers have the legal right to use copyrighted works without obtaining permission. Previous research revealed that many large language models (LLMs) powering AI chatbots and other generative AI programs were trained on a dataset known as “Books3,” which includes nearly 200,000 copyrighted books, some of which are pirated copies. Developers argue that the AI models generate new combinations of words based on their training, transforming rather than replicating the copyrighted material.

However, recent research findings have shed light on the extent to which AI models retain the exact text of the books in their training data. While many models do not reproduce the books verbatim, it was discovered that one of Meta’s models has memorized significant portions of certain books. Should the courts rule against the company, researchers estimate that Meta could face damages of at least $1 billion.

Mark Lemley, a professor at Stanford University, said the findings cut both ways: AI models are not merely “plagiarism machines,” but they also do more than just learn general relationships between words. The legal implications of training AI on copyrighted material remain complex, with ongoing cases such as Kadrey v Meta Platforms testing the boundaries of fair use.

In a recent study, Lemley and his team tested AI memorization by splitting book excerpts into prefix and suffix sections to see if the models could complete the text verbatim. Excerpts from 36 copyrighted books, including popular titles like “A Game of Thrones” and “Lean In,” were used in the experiment. Results showed that Meta’s Llama 3.1 70B model had memorized significant portions of books like “Harry Potter,” “The Great Gatsby,” and “1984.”
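The prefix-and-suffix test described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' actual code: the function names, the five-word prefix length, the whitespace-based word splitting, and the toy "model" that simply looks text up in a string are all assumptions made here so the example runs on its own.

```python
from typing import Callable, List

# A "completion function" takes a prompt and a number of words to generate.
# In a real test this would wrap a call to a language model (assumption).
CompleteFn = Callable[[str, int], str]


def is_memorized(excerpt: str, complete: CompleteFn, prefix_len: int = 5) -> bool:
    """Split an excerpt into prefix and suffix, then check whether the
    model's continuation of the prefix reproduces the suffix word for word."""
    words = excerpt.split()
    prefix, suffix = words[:prefix_len], words[prefix_len:]
    if not prefix or not suffix:
        return False  # excerpt too short to test
    continuation = complete(" ".join(prefix), len(suffix)).split()
    return continuation[: len(suffix)] == suffix


def memorization_rate(excerpts: List[str], complete: CompleteFn, prefix_len: int = 5) -> float:
    """Fraction of excerpts whose suffix is reproduced verbatim."""
    if not excerpts:
        return 0.0
    hits = sum(is_memorized(e, complete, prefix_len) for e in excerpts)
    return hits / len(excerpts)


def make_toy_model(training_text: str) -> CompleteFn:
    """Hypothetical stand-in for a model that has memorized its training
    text: it finds the prompt in that text and returns what follows."""
    corpus = training_text.split()

    def complete(prompt: str, n_words: int) -> str:
        p = prompt.split()
        for i in range(len(corpus) - len(p) + 1):
            if corpus[i : i + len(p)] == p:
                start = i + len(p)
                return " ".join(corpus[start : start + n_words])
        return ""  # prompt never seen: no continuation

    return complete


seen = "the quick brown fox jumps over the lazy dog near the quiet river bank"
unseen = "a small grey cat sleeps under the old wooden bridge every afternoon"
model = make_toy_model(seen)

print(is_memorized(seen, model))    # True: the suffix comes back verbatim
print(is_memorized(unseen, model))  # False: the model never saw this text
print(memorization_rate([seen, unseen], model))  # 0.5
```

An exact word-for-word match is the simplest possible criterion. A real study would more likely work with model tokens and probabilities rather than whitespace-split words, so treat the comparison here as a simplification.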

The researchers estimated that even if only 3% of the Books3 dataset were found to be infringing, damages could approach $1 billion, underscoring the financial risk for AI developers. While the testing method offers insight into AI memorization, legal experts such as Randy McCarthy caution that it does not resolve the broader question of whether companies have the right to train their AI models on copyrighted works under the US fair use doctrine.
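The arithmetic behind a figure of that size can be reconstructed roughly as follows. This is a sketch under stated assumptions, not the researchers' own calculation: it rounds Books3 to 200,000 titles (the article says "nearly 200,000") and applies the $150,000 per-work ceiling that US law allows for willful infringement, which the article does not cite.

```python
# Rough reconstruction of the damages estimate. All inputs are assumptions
# for illustration; the article does not give the researchers' exact inputs.
BOOKS3_TITLES = 200_000           # article: "nearly 200,000 copyrighted books"
INFRINGING_SHARE = 0.03           # the 3% scenario mentioned in the article
MAX_STATUTORY_PER_WORK = 150_000  # assumed: US statutory ceiling per work
                                  # for willful infringement


def estimate_statutory_damages(num_works: int, infringing_share: float, per_work: int) -> int:
    """Number of infringing works times the per-work award, in dollars."""
    infringing_works = round(num_works * infringing_share)
    return infringing_works * per_work


total = estimate_statutory_damages(BOOKS3_TITLES, INFRINGING_SHARE, MAX_STATUTORY_PER_WORK)
print(f"${total:,}")  # $900,000,000
```

Under these assumptions, 6,000 infringing books at the maximum award come to $900 million, which is consistent with a figure approaching $1 billion. A court could award far less per work, so this is an upper-end scenario rather than a prediction.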

In the UK, where copyright laws are stricter, the issue of AI memorization could have significant implications. Robert Lands, a lawyer at Howard Kennedy, noted that UK copyright law follows the “fair dealing” concept, providing limited exceptions to copyright infringement. Models memorizing pirated books may not qualify for this exception, raising further legal challenges in the AI landscape.

As the legal battles continue, the intersection of AI and copyright law remains a complex and evolving area that will shape the future of AI development and intellectual property rights.
