Friday, 19 Sep 2025
Tech and Science

Meta’s AI memorised books verbatim – that could cost it billions

Last updated: June 10, 2025 2:55 pm

Billions of dollars are at stake as courts in the US and UK weigh whether tech companies can legally train their artificial intelligence (AI) models on copyrighted books. Authors and publishers have filed multiple lawsuits over the practice. Now researchers have found that at least one AI model not only used popular books in its training data but also memorized their contents verbatim.

The central question is whether AI developers have the legal right to use copyrighted works without permission. Previous research revealed that many of the large language models (LLMs) behind AI chatbots and other generative AI programs were trained on a dataset known as “Books3,” which includes nearly 200,000 copyrighted books, some of them pirated copies. Developers argue that their models generate new combinations of words based on that training, transforming the copyrighted material rather than replicating it.

The new research measures how much of the exact text of the books in their training data AI models retain. Many models do not reproduce the books verbatim, but one of Meta’s models has memorized significant portions of certain books. If the courts rule against the company, the researchers estimate that Meta could face damages of at least $1 billion.

Mark Lemley, a professor at Stanford University, said the findings cut both ways: AI models are not merely “plagiarism machines,” but they also do more than learn general relationships between words. The legal implications of training AI on copyrighted material remain complex, with ongoing cases such as Kadrey v Meta Platforms testing the boundaries of fair use.

In a recent study, Lemley and his team tested memorization by splitting book excerpts into a prefix and a suffix, giving a model the prefix, and checking whether it completed the suffix verbatim. The experiment used excerpts from 36 copyrighted books, including popular titles such as “A Game of Thrones” and “Lean In.” The results showed that Meta’s Llama 3.1 70B model had memorized significant portions of books including “Harry Potter,” “The Great Gatsby,” and “1984.”
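The prefix and suffix test described above can be illustrated with a minimal Python sketch. It is not the researchers' code: the whitespace tokenization, the function names, and the toy "model" (a lookup over a made-up sentence) are all assumptions for illustration, and a real test would call an actual language model in place of `generate`.

```python
from typing import Callable, List

# Hypothetical interface: a model takes prefix tokens and a count n,
# and returns its n-token continuation.
Generate = Callable[[List[str], int], List[str]]


def completes_verbatim(generate: Generate, excerpt: str, prefix_len: int) -> bool:
    """Split an excerpt into prefix and suffix; check for an exact suffix match."""
    tokens = excerpt.split()  # assumption: simple whitespace tokens
    if not 0 < prefix_len < len(tokens):
        raise ValueError("prefix_len must leave a non-empty prefix and suffix")
    prefix, suffix = tokens[:prefix_len], tokens[prefix_len:]
    return generate(prefix, len(suffix)) == suffix


def memorization_rate(generate: Generate, excerpts: List[str], prefix_len: int) -> float:
    """Fraction of excerpts whose suffix the model reproduces verbatim."""
    hits = sum(completes_verbatim(generate, e, prefix_len) for e in excerpts)
    return hits / len(excerpts)


def make_toy_model(memorized_text: str) -> Generate:
    """Stand-in for a real model: it can only continue text it has 'memorized'."""
    memorized = memorized_text.split()

    def generate(prefix: List[str], n: int) -> List[str]:
        # Look for the prefix inside the memorized text.
        for start in range(len(memorized) - len(prefix) + 1):
            end = start + len(prefix)
            if memorized[start:end] == prefix:
                return memorized[end:end + n]
        return ["<unk>"] * n  # prefix not recognized: no verbatim recall

    return generate


toy = make_toy_model("the quick brown fox jumps over the lazy dog near the quiet river bank")
seen = "quick brown fox jumps over the lazy"
unseen = "a small grey cat sleeps on the warm mat"
print(completes_verbatim(toy, seen, prefix_len=3))
print(completes_verbatim(toy, unseen, prefix_len=3))
```

Exact matching is the strictest possible criterion. A study of a real model would also have to decide how long the prefix and suffix are and how to handle near misses, choices this sketch leaves open.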

The researchers estimated that if even 3% of the Books3 dataset were found to be infringing, damages could approach $1 billion, underlining the financial risk for AI developers. The testing method offers insight into AI memorization, but legal experts such as Randy McCarthy caution that it does not resolve the broader question of whether companies have the right to train their AI models on copyrighted works under the US fair use doctrine.
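The arithmetic behind a figure of that size can be sketched in a few lines of Python. The article does not say how the estimate was derived, so the per-work amount here is an assumption: $150,000 is the maximum US statutory damages award per work for willful infringement. The book count rounds the "nearly 200,000" Books3 titles up to 200,000.

```python
def estimate_statutory_damages(total_books: int, infringing_percent: int,
                               damages_per_work: int) -> int:
    """Upper-bound estimate: infringing works times a per-work award, in dollars."""
    infringing_books = total_books * infringing_percent // 100
    return infringing_books * damages_per_work


BOOKS3_SIZE = 200_000         # assumption: "nearly 200,000" rounded up
INFRINGING_PERCENT = 3        # the 3% scenario from the article
MAX_WILLFUL_AWARD = 150_000   # assumption: US statutory maximum per work (willful)

total = estimate_statutory_damages(BOOKS3_SIZE, INFRINGING_PERCENT, MAX_WILLFUL_AWARD)
print(f"${total:,}")  # 6,000 books x $150,000 = $900,000,000
```

Under these assumptions the total comes to $900 million, which is in line with a figure approaching $1 billion. A court could award far less per work, so this is a ceiling for the scenario rather than a prediction.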

In the UK, where copyright law is stricter, AI memorization could carry even greater weight. Robert Lands, a lawyer at Howard Kennedy, noted that UK copyright law follows the concept of “fair dealing,” which provides only limited exceptions to copyright infringement. Models that have memorized pirated books may not qualify for these exceptions, raising further legal challenges for AI developers.

As the legal battles continue, the intersection of AI and copyright law remains a complex and evolving area that will shape the future of AI development and intellectual property rights.
