Tech and Science

Meta unleashes Llama API running 18x faster than OpenAI: Cerebras partnership delivers 2,600 tokens per second

Last updated: April 29, 2025 2:16 pm

Meta today announced a partnership with Cerebras Systems to power its new Llama API, giving developers access to inference speeds up to 18 times faster than traditional GPU-based solutions.

Meta made the announcement at its inaugural LlamaCon developer conference in Menlo Park. The move positions the company to compete directly with OpenAI, Anthropic, and Google in the rapidly expanding AI inference market, where developers pay for the tokens that power their applications and where speed and efficiency are key competitive factors.

Julie Shin Choi, Chief Marketing Officer at Cerebras, expressed excitement about the partnership, stating, “Meta has chosen Cerebras to collaborate in delivering the ultra-fast inference required to serve developers through their new Llama API. This marks our first CSP hyperscaler partnership, and we are thrilled to provide ultra-fast inference to all developers.”

This partnership marks Meta’s entry into the business of selling AI computation, turning its widely used open-source Llama models into a commercial service. Although the Llama models have been downloaded more than one billion times, Meta had not previously offered first-party cloud infrastructure for developers to build applications on them.

James Wang, a senior executive at Cerebras, highlighted the significance of Meta’s move into the AI inference business, noting the exponential growth in demand for tokens by developers building AI applications. With the introduction of the Llama API, Meta is pioneering a new revenue stream from its AI investments while maintaining a commitment to open models.

The speed advantage comes from Cerebras’ specialized AI chips and is central to Meta’s offering. The Cerebras system delivers over 2,600 tokens per second for Llama 4 Scout, well above the GPU-based services it competes with. That speed makes practical a class of applications that were previously too slow, including real-time agents, low-latency voice systems, interactive code generation, and instant multi-step reasoning.
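
To make the throughput figure concrete, the short Python sketch below converts tokens per second into generation time. It is a rough illustration rather than a benchmark. The 2,600 tokens-per-second rate is the article's figure; the baseline rate is an assumption derived by dividing that figure by the 18x speedup claim. The function name and the token counts are hypothetical, and the calculation ignores network latency and time to first token.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decoding rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Figure quoted in the article for Llama 4 Scout on Cerebras.
CEREBRAS_TPS = 2600.0
# Assumption: the "18x" claim applied as a flat ratio to get a baseline rate.
SPEEDUP = 18.0
BASELINE_TPS = CEREBRAS_TPS / SPEEDUP

if __name__ == "__main__":
    # Hypothetical response sizes, from a short reply to a long agent trace.
    for n in (100, 1000, 5000):
        fast = generation_seconds(n, CEREBRAS_TPS)
        slow = generation_seconds(n, BASELINE_TPS)
        print(f"{n:>5} tokens: {fast:.2f} s at 2,600 tok/s, {slow:.2f} s at baseline")
```

Under these assumptions, a ten-step agent that generates 500 tokens per step would spend about 2 seconds generating text at the quoted rate, versus about 35 seconds at the assumed baseline. That gap is why multi-step reasoning and voice applications benefit most.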

The Llama API represents a pivotal shift in Meta’s AI strategy, from primarily a model provider to a full-service AI infrastructure company. The API offers tools for fine-tuning and evaluation, starting with the Llama 3.3 8B model, so developers can generate data, train on it, and test the quality of their custom models.

Cerebras will power Meta’s new service through its network of data centers located throughout North America. Meta has also partnered with Groq, giving developers a second high-performance inference option beyond traditional GPU-based solutions.

Meta’s entry into the inference API market with superior performance metrics has the potential to disrupt the established order dominated by industry leaders like OpenAI and Google. By combining the popularity of its open-source models with faster inference capabilities, Meta is positioning itself as a formidable competitor in the commercial AI space.

The Llama API is currently available as a limited preview, with plans for a broader rollout in the coming weeks and months. Developers interested in accessing the ultra-fast Llama 4 inference can request early access by selecting Cerebras from the model options within the Llama API.

Meta’s decision to utilize specialized silicon signifies a shift towards prioritizing speed and efficiency in AI applications. In the evolving landscape of AI technology, the ability to process information quickly is becoming increasingly crucial, and Meta’s partnership with Cerebras is a significant step towards meeting this demand.
