Introducing Dia: Nari Labs’ New AI Model for Generating Podcast-Style Clips
The world of synthetic speech tools is rapidly expanding, with numerous players entering the market to meet growing demand. One such entrant is Nari Labs, a Korea-based startup co-founded by Toby Kim. Despite lacking extensive AI expertise, Kim and his fellow co-founder have developed an AI model named Dia that is now openly available for use.
Inspired by Google’s NotebookLM, Kim and his team set out to create a model that offered more control over generated voices and greater flexibility in script customization. After just three months of learning about speech AI, they utilized Google’s TPU Research Cloud program to train Dia, which boasts an impressive 1.6 billion parameters.
Parameters are the internal variables a model uses to make predictions; a higher count generally signals a more capable model, though it does not by itself guarantee quality. Available on platforms like Hugging Face and GitHub, Dia can run on most modern PCs with at least 10GB of VRAM. It allows users to generate dialogue from scripts, customize speakers' tones, and even insert nonverbal cues like coughs and laughs.
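To make the script-driven workflow concrete, here is a minimal sketch of how such a dialogue script might be assembled. It assumes the bracketed speaker-tag and parenthesized nonverbal-cue conventions shown in Dia's public repository (e.g. [S1]/[S2] and (laughs)); the helper functions and names below are hypothetical illustrations, not Nari Labs' API, and the actual model call is omitted.

```python
import re

# Hypothetical helper for assembling a Dia-style dialogue script.
# Assumed conventions (from Dia's repository, not this article):
# bracketed speaker tags like [S1]/[S2] and parenthesized
# nonverbal cues such as (laughs) or (coughs).
def build_script(turns):
    """turns: list of (speaker_number, text) pairs -> one script string."""
    return " ".join(f"[S{n}] {text}" for n, text in turns)

def speakers_in(script):
    """Return distinct speaker numbers in order of first appearance."""
    seen = []
    for tag in re.findall(r"\[S(\d+)\]", script):
        if tag not in seen:
            seen.append(tag)
    return seen

if __name__ == "__main__":
    script = build_script([
        (1, "Welcome back to the show."),
        (2, "Thanks for having me. (laughs)"),
        (1, "Let's dive right in."),
    ])
    print(script)
    print(speakers_in(script))  # two distinct speakers
```

A string in this shape would then be handed to the model for audio generation; the point of the sketch is simply that speaker identity and nonverbal cues live inline in the text, which is what gives users the script-level control described above.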
In a brief test conducted by JS, Dia performed admirably, generating realistic two-way conversations on various topics. The quality of the voices produced by Dia rivals that of other tools on the market, and its voice cloning feature is notably user-friendly.
However, like many voice generators, Dia lacks robust safeguards against misuse. Nari Labs warns against using the model for impersonation or deceptive purposes but disclaims responsibility for any misuse. Additionally, the source of the data used to train Dia remains undisclosed, raising questions about potential copyright infringement.
Despite these concerns, Nari Labs has ambitious plans for Dia, aiming to develop a synthetic voice platform with a social aspect and expand language support beyond English. Kim envisions a future where Dia and its successors revolutionize the way we interact with AI-generated voices.