ElevenLabs moves beyond speech with gen AI Sound Effects

After launching tools for text-to-speech and speech-to-speech synthesis, AI voice startup ElevenLabs is moving to its next target. The two-year-old startup, founded by former Google and Palantir employees, today announced the launch of a new text-to-sound AI offering called Sound Effects.

Available starting today on the ElevenLabs website, Sound Effects uses the startup’s in-house foundation model and allows creators to generate different kinds of audio samples by simply typing a description of the sound they have in mind.

The company first teased the tool in February with a post featuring Sora-generated clips that had been enhanced with AI sound effects.

ElevenLabs partnered with Shutterstock to bring this product to life and expects to see adoption from creators across domains who want to enhance their content with immersive soundscapes.

What to expect from ElevenLabs Sound Effects?

Currently, when creators want to add ambient noises to their content, such as social videos, games, movies and TV shows, they must either manually record them or buy or license audio files from different repositories on the internet.

The approach works, but you may not always find the audio you’re looking for from these sources, or have the budget to pay to record a new sound.

ElevenLabs’ new Sound Effects tool changes that, giving creators and production teams a way to get exactly what they want by simply typing it in plain, conversational English.

When a user enters a text prompt detailing the sound effect they’re looking for, the model powering Sound Effects processes it and generates six unique audio samples to choose from.

The user can then listen to each of these and select what works best for their project by downloading it or storing it directly on ElevenLabs’ platform.

VentureBeat got early access to the offering and found it was able to generate clear outputs in about 30-40 seconds. However, in our tests, Sound Effects generated just four options, not six.

This included a range of audio samples, from standard ambient noises such as thunderstorms, doorbells and cash jingling to more complex ones like monkeys chattering, cars racing, people eating at a diner or a train coming to a halt.

Mati Staniszewski, CEO of ElevenLabs, told VentureBeat the tool can go beyond sounds that last a few seconds to produce longer audio samples such as instrumental music and character voices.

“It can generate instrumental music tracks up to 22 seconds with prompts like guitar loop, jazz saxophone solo, and music techno loop,” Staniszewski explained. “The model can also create a variety of character voices using prompts like ‘woman singing dancing in the sand, we watched the daylight end’ or ‘an ogre saying keep away puny human’. You can even chain together sounds with prompts like ‘A happy old woman says I’m so proud of you and then laughs.’”

While the company has not shared specifics of the model powering these capabilities, it did note that the model is based on its in-house research and has been fine-tuned on Shutterstock’s audio library of licensed tracks.

“The combined power of our rich and immersive library of tracks and this cutting-edge audio technology has enabled the creation of a true market first. We’re thrilled by the positive feedback from the early access community and look forward to seeing the wide range of projects they will create,” Aimee Egan, Chief Enterprise Officer at Shutterstock, said in a statement.

Goal to power creators worldwide

Since its inception two years ago, ElevenLabs has focused on developing and launching powerful AI audio capabilities.

The company first launched models for text-to-speech in different languages and then followed up with a voice cloning product and AI Dubbing, a speech-to-speech conversion tool that allows users to translate audio and video into 29 different languages while preserving the original speaker’s voice and emotions.

With the launch of Sound Effects today, it is extending this work, equipping creators with more tools to produce high-quality content.

Staniszewski hopes creators across domains will be able to use Sound Effects, including film and television studios, video game developers, marketers and social media content creators.

However, he did not share the names of the enterprises that have been alpha-testing the product so far.

Back in January, the company said it counts 41% of the Fortune 500 among its customers, including big names such as The Washington Post, Storytel and TheSoul Publishing.

As the next step, Staniszewski added, the company will also launch a music generation model as well as a voiceover studio offering, which is currently in alpha. The timeline for both remains unclear at this stage.

Other companies in the AI speech, sound and music generation space include Google, Meta, Suno, Pika, MURF.AI, Play.ht and WellSaid Labs. According to Market US, the global market for such tools stood at $1.2 billion in 2022 and is estimated to touch nearly $5 billion in 2032, with a CAGR of slightly above 15.40%.
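
For readers who want to sanity-check the market figures, the compound-growth arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes a 10-year window (2022 to 2032) and rounds the quoted rate to 15.4%, and the function and variable names are hypothetical, not taken from the Market US report.

```python
def project_value(start_value: float, annual_rate: float, years: int) -> float:
    """Compound a starting value forward at a fixed annual growth rate."""
    return start_value * (1 + annual_rate) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that turns start_value into end_value over `years`."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in billions of US dollars.
START_2022 = 1.2    # market size in 2022
QUOTED_CAGR = 0.154  # "slightly above 15.40%", rounded here (assumption)
YEARS = 10           # 2022 -> 2032 (assumption about the forecast window)

projected_2032 = project_value(START_2022, QUOTED_CAGR, YEARS)
rate_for_5bn = implied_cagr(START_2022, 5.0, YEARS)

print(f"Projected 2032 market size: ${projected_2032:.2f}B")
print(f"CAGR implied by a $5B endpoint: {rate_for_5bn:.2%}")
```

Under these assumptions the projection lands at about $5 billion, and working backward from a $5 billion endpoint gives a rate a little over 15%, so the quoted numbers are consistent with one another.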
