New NVIDIA AI Model Fugatto Creates Audio from Text Prompts

NVIDIA has unveiled an experimental AI model, Fugatto, that can generate audio from text prompts and modify existing sound files. Officially named the Foundational Generative Audio Transformer Opus 1, the model is designed as a versatile tool for sound creation, which NVIDIA describes as “a Swiss Army knife for sound.” Built by an international team of AI researchers, Fugatto works across multiple languages and accents, a capability strengthened by the diversity of its developers.

According to Rafael Valle, NVIDIA’s manager of applied audio research, the goal was to develop a model that approaches sound generation with human-like understanding. Fugatto enables applications ranging from rapid music prototyping to creating personalized language learning tools and dynamic audio assets for video games. For example, music producers can use the model to experiment with different voices, instruments, and styles, while game developers might customize in-game soundscapes to reflect player decisions.

Beyond these use cases, the researchers found that Fugatto could handle tasks outside its training scope. With minimal fine-tuning, the model can combine separate training instructions, such as generating emotionally expressive speech in specific accents or blending natural sounds like birdsong with the dynamic intensity of a thunderstorm. It can also produce audio that evolves over time, such as rainstorms traversing landscapes.

Despite the model’s advanced capabilities, NVIDIA has yet to announce plans for public access to Fugatto. This development follows similar initiatives from tech giants like Meta, which introduced an open-source AI for sound creation, and Google, whose MusicLM tool generates music from text prompts via its AI Test Kitchen.

NVIDIA’s Fugatto represents a leap forward in generative AI, with its ability to craft complex, dynamic audio. While its practical applications remain to be tested at scale, the technology holds promise for creative industries, blending technical sophistication with artistic possibilities.
