Meta’s AI memorised books verbatim – that could cost it billions

Last updated: June 10, 2025 2:55 pm

Billions of dollars are at stake as courts in the US and UK decide whether tech companies can legally train their artificial intelligence (AI) models on copyrighted books. Authors and publishers have filed multiple lawsuits over the practice. Now researchers have found that one AI model not only used popular books in its training data but also memorized their contents verbatim.

The debate surrounding this issue revolves around whether AI developers have the legal right to use copyrighted works without obtaining permission. Previous research revealed that many large language models (LLMs) powering AI chatbots and other generative AI programs were trained on a dataset known as “Books3,” which includes nearly 200,000 copyrighted books, some of which are pirated copies. Developers argue that the AI models generate new combinations of words based on their training, transforming rather than replicating the copyrighted material.

New research, however, shows how much of the exact text of books in their training data AI models retain. Many models do not reproduce the books verbatim, but one of Meta’s models has memorized significant portions of certain books. If the courts rule against the company, the researchers estimate that Meta could face damages of at least $1 billion.

Mark Lemley, a professor at Stanford University, said the findings show that AI models are neither simply “plagiarism machines” nor systems that learn only general relationships between words. The legal implications of training AI on copyrighted material remain complex, with ongoing cases such as Kadrey v Meta Platforms testing the boundaries of fair use.

In a recent study, Lemley and his team tested AI memorization by splitting book excerpts into a prefix and a suffix, then prompting each model with the prefix to see whether it would reproduce the suffix verbatim. Excerpts from 36 copyrighted books, including popular titles like “A Game of Thrones” and “Lean In,” were used in the experiment. Results showed that Meta’s Llama 3.1 70B model had memorized significant portions of books like “Harry Potter,” “The Great Gatsby,” and “1984.”
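
The prefix-and-suffix idea can be illustrated with a short Python sketch. This is a simplified illustration, not the researchers' actual procedure: it splits on whitespace rather than model tokens, requires an exact match, and uses a hypothetical `Completer` interface and a toy stand-in model (`make_toy_model`) instead of a real language model. The example sentences are invented.

```python
from typing import Callable, List, Sequence

# Hypothetical interface (an assumption for this sketch): given the prompt
# words and a count, return that many continuation words.
Completer = Callable[[List[str], int], List[str]]


def is_memorized(excerpt: str, complete: Completer, prefix_len: int) -> bool:
    """Return True if the model reproduces the excerpt's suffix word for word."""
    words = excerpt.split()
    prefix, suffix = words[:prefix_len], words[prefix_len:]
    if prefix_len <= 0 or not suffix:
        raise ValueError("prefix_len must be positive and shorter than the excerpt")
    return complete(prefix, len(suffix)) == suffix


def memorized_fraction(excerpts: Sequence[str], complete: Completer,
                       prefix_len: int) -> float:
    """Share of excerpts whose suffix the model completes verbatim."""
    hits = sum(is_memorized(e, complete, prefix_len) for e in excerpts)
    return hits / len(excerpts)


def make_toy_model(training_text: str) -> Completer:
    """Toy stand-in for a model: it can only recite text it has 'seen'."""
    known = training_text.split()

    def complete(prefix: List[str], n_words: int) -> List[str]:
        # Look for the prompt inside the memorized text and recite what follows.
        for start in range(len(known) - len(prefix) + 1):
            if known[start:start + len(prefix)] == prefix:
                end = start + len(prefix)
                return known[end:end + n_words]
        # Prompt not found: return placeholder words that will not match.
        return ["<unknown>"] * n_words

    return complete


seen = "the quick brown fox jumps over the lazy dog near the river bank"
unseen = "a completely different sentence that the toy model never saw before"
toy_model = make_toy_model(seen)

print(is_memorized(seen, toy_model, prefix_len=4))
print(is_memorized(unseen, toy_model, prefix_len=4))
print(memorized_fraction([seen, unseen], toy_model, prefix_len=4))
```

A real test would replace `toy_model` with calls to an actual language model and would need to decide how strictly to count near-matches, which this sketch does not attempt.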

The researchers estimated that if even 3% of the books in the Books3 dataset were found to be infringed, damages could approach $1 billion, highlighting the potential financial risk for AI developers. While this testing method offers insight into AI memorization, legal experts like Randy McCarthy caution that it does not resolve the broader question of whether companies have the right to train their AI models on copyrighted works under the US fair use doctrine.
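
To see how a figure of that size can arise, the arithmetic can be sketched in Python. The article does not say how the researchers reached their estimate, so this is only an illustration under stated assumptions: Books3 is rounded to 200,000 books (the article says "nearly 200,000"), and the per-work amounts are the US statutory damages range of $750 to $30,000 per work, rising to $150,000 for willful infringement (17 U.S.C. § 504(c)). The function names are made up for this example.

```python
# US statutory damages per infringed work (17 U.S.C. § 504(c)), in dollars.
STATUTORY_MIN = 750
STATUTORY_MAX = 30_000
WILLFUL_MAX = 150_000

# Assumption: the article says "nearly 200,000" books; rounded here.
BOOKS3_SIZE = 200_000


def infringed_works(total_books: int, percent: int) -> int:
    """Number of works if `percent` per cent of the dataset is infringed."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_books * percent // 100


def damages(works: int, per_work: int) -> int:
    """Total damages at a flat per-work amount."""
    return works * per_work


works = infringed_works(BOOKS3_SIZE, 3)
for label, rate in [("minimum", STATUTORY_MIN),
                    ("ordinary maximum", STATUTORY_MAX),
                    ("willful maximum", WILLFUL_MAX)]:
    print(f"{label}: {works:,} works x ${rate:,} = ${damages(works, rate):,}")
```

Under these assumptions, 3% of the dataset is 6,000 works, and at the willful maximum that comes to $900 million, which is consistent with a figure "nearing $1 billion". Actual awards would depend on what a court found and on the per-work amount it chose.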

In the UK, where copyright laws are stricter, the issue of AI memorization could have significant implications. Robert Lands, a lawyer at Howard Kennedy, noted that UK copyright law follows the “fair dealing” concept, which provides only limited exceptions to copyright infringement. A model that has memorized pirated books may not qualify for these exceptions, creating a further legal hurdle for AI developers.

As the legal battles continue, the intersection of AI and copyright law remains a complex and evolving area that will shape the future of AI development and intellectual property rights.
