© 2024 americanfocus.online – All Rights Reserved.

Open source LLMs hit Europe’s digital sovereignty roadmap

Last updated: February 16, 2025 7:20 am

Large language models (LLMs) have taken center stage on Europe’s digital sovereignty agenda with the launch of a new program called OpenEuroLLM. The initiative aims to develop a series of open source LLMs covering all European Union languages: the 24 official EU languages plus the languages of countries seeking entry into the EU market, such as Albania. The project, co-led by Jan Hajič and Peter Sarlin, brings together 20 organizations and is part of Europe’s broader push for digital sovereignty.

The OpenEuroLLM project has a budget of €37.4 million, with funding coming from the EU’s Digital Europe Programme. Partners include EuroHPC supercomputer centers in several European countries. Despite the ambitious goal of developing multilingual LLMs, some have raised concerns about the project’s feasibility due to the involvement of multiple organizations with different priorities.

Jan Hajič, who is also coordinating the High Performance Language Technologies (HPLT) project, sees OpenEuroLLM as a continuation of HPLT with a focus on generative LLMs. The project aims to release the first versions by mid-2026 and final iterations by 2028. Rather than starting from scratch on data and tools, it builds on HPLT’s work and on the expertise of its partners.

Participating organizations include academic and research institutions from Czechia, the Netherlands, Germany, Sweden, Finland, and Norway, as well as corporate entities like Silo AI, Aleph Alpha, Ellamind, Prompsit Language Engineering, and LightOn. Notably absent from the list is Mistral, a French AI company known for its open source approach. While efforts were made to involve Mistral in the project, discussions did not progress.

The project’s ultimate goal is to create foundation models for transparent AI in Europe that preserve linguistic and cultural diversity. This includes developing a core multilingual LLM for general-purpose tasks and smaller, more efficient versions for edge applications. Detailed plans for the project’s deliverables are still in development, with a focus on balancing size and quality.

OpenEuroLLM is striving for a large language model that is proficient in all of its target languages, with comparable quality in each. Achieving this may be challenging, especially for languages with limited digital resources. To address this, the project is working on establishing benchmarks that are genuinely representative of each language and its cultural nuances.

One of the key components of the project is the data it utilizes. The HPLT project has released version 2.0 of its dataset, which includes 4.5 petabytes of web crawls and over 20 billion documents. Additionally, data from Common Crawl, an open repository of web-crawled data, will be incorporated into the mix to further enhance the model’s training.

In the realm of open source AI, there has been a debate about what constitutes true openness. While the Open Source Initiative has defined open source AI, there are differing opinions on whether training data should be included in the definition. The OpenEuroLLM project aims to be as open as possible, but certain limitations may require it to keep some training data confidential, although that data will be accessible for auditing purposes as per EU regulations.

Despite its commitment to openness, the OpenEuroLLM project has faced criticism for similarities to the EuroLLM project, which launched earlier in Europe with EU funding. The two projects share common goals of creating open source language models for European languages, but due to funding restrictions, collaborations between the two may be limited.

In terms of funding, the OpenEuroLLM project is confident that it will have sufficient resources to support its goals. Its partnership with EuroHPC centers, which have invested billions in AI and compute infrastructure, is expected to supply the resources the project needs. The focus is on building foundational models rather than consumer or enterprise-grade products, which helps streamline the budget allocation.

Overall, the OpenEuroLLM project is dedicated to creating a high-quality, open source language model that can serve as foundational AI infrastructure for companies in Europe. With a strong focus on data quality, cultural representation, and collaboration within the EU, the project aims to make significant strides in the field of AI language models.

The upcoming Europa models from OpenEuroLLM are intended to support all European languages, building on current models that already cover a handful of them. This aligns with the vision of not starting from scratch, as emphasized by Hajič, who points to the expertise and technology already in place.

Critics have pointed out the complexity of OpenEuroLLM, but Hajič views this as a positive aspect. He believes that collaborative projects, leveraging both academic expertise and industry focus, can bring about innovative solutions. The goal is not to compete with Big Tech or billion-dollar AI startups, but to achieve digital sovereignty through (mostly) open foundation LLMs developed by and for Europe.

Hajič emphasizes the importance of having a European-based model, even if it may not be the top performer globally. The focus is on creating a model whose components are all developed within Europe, an outcome he considers positive regardless of rankings. This commitment to digital sovereignty is what sets OpenEuroLLM apart.
