
Meta's new Voicebox AI is a text-to-speech tool that learns like ChatGPT

Meta claims Voicebox is the first AI that can generalize text-to-speech tasks it wasn’t trained to accomplish and describes it as a “breakthrough.”

Meta AI recently unveiled a “breakthrough” text-to-speech (TTS) generator it claims produces results up to 20 times faster than state-of-the-art artificial intelligence models with comparable performance. 

The new system, dubbed Voicebox, eschews traditional TTS architecture in favor of a model more akin to OpenAI’s ChatGPT or Google’s Bard.

Among the main differences between Voicebox and similar TTS models, such as ElevenLabs’ Prime Voice AI, is that Meta’s offering can generalize through in-context learning.

Much like ChatGPT or other transformer models, Voicebox uses large-scale training datasets. Previous efforts to use massive troves of audio data have resulted in severely degraded audio outputs. For this reason, most TTS systems use small, highly curated, labeled datasets.

Meta overcomes this limitation through a novel training scheme that ditches labels and curation for an architecture capable of “in-filling” audio information.
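For readers curious what “in-filling” means in practice, the following is a toy sketch of how a masked-span training example could be built: a contiguous stretch of audio frames is hidden, and a model would be asked to reconstruct it from the surrounding context. This is a generic illustration and not Meta’s code. The function name `make_infill_example`, the use of plain numbers as stand-ins for audio frames, and the single fixed-length span are all assumptions made for the example.

```python
import random


def make_infill_example(frames, span_len, seed=0):
    """Hide one contiguous span of frames (hypothetical helper, not Meta's API).

    Returns (context, mask, target):
      context - the frames with the hidden span replaced by None
      mask    - True where a frame was hidden
      target  - the hidden frames a model would be trained to predict
    """
    if not 0 < span_len <= len(frames):
        raise ValueError("span_len must be between 1 and len(frames)")

    rng = random.Random(seed)  # seeded so the example is repeatable
    start = rng.randrange(len(frames) - span_len + 1)

    mask = [start <= i < start + span_len for i in range(len(frames))]
    context = [None if hidden else frame for frame, hidden in zip(frames, mask)]
    target = [frame for frame, hidden in zip(frames, mask) if hidden]
    return context, mask, target


# Ten stand-in "frames"; a real system would use audio features instead.
frames = [round(0.1 * i, 1) for i in range(1, 11)]
context, mask, target = make_infill_example(frames, span_len=3)
```

Because the target is simply the hidden part of the recording itself, an example like this needs no hand-written label, which is the sense in which such a scheme can do without curated, labeled data.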

As Meta AI put it in a June 16 blog post, Voicebox is the “first model that can generalize to speech-generation tasks it was not specifically trained to accomplish with state-of-the-art performance.”

This makes it possible for Voicebox to translate text to speech, remove unwanted noise by synthesizing replacement speech, and even apply a speaker’s voice to different language outputs.

According to an accompanying research paper published by Meta, its pre-trained Voicebox system can accomplish all of this using only the desired output text and a three-second audio clip.

The arrival of robust speech generation comes at a particularly sensitive time, as social media companies continue to struggle with moderation and, in the U.S., a looming presidential election threatens to once again test the limits of online misinformation detection.

Former U.S. president Donald Trump, for example, currently faces allegations that he mishandled confidential government materials after leaving office. Among the purported evidence cited in the case against him are audio recordings wherein he allegedly admitted to potential wrongdoing.

While there’s currently no indication that the former president intends to deny the content described in the audio files, his case illustrates that data integrity resides at the core of the U.S. legal system and, by extension, its democracy.

Voicebox isn’t the first tool of its kind, but it appears to be among the most robust. As such, Meta has developed a tool for determining whether speech was generated by Voicebox, which the company claims can “trivially detect” the difference between real and fake audio. Per the blog post:

“As with other powerful new AI innovations, we recognize that this technology brings the potential for misuse and unintended harm. In our paper, we detail how we built a highly effective classifier that can distinguish between authentic speech and audio generated with Voicebox to mitigate these possible future risks.”

In the cryptocurrency world, AI has become as integral to day-to-day operations for most businesses as the internet or electricity. The largest exchanges rely on AI chatbots for customer interactions and sentiment analysis, and trading bots have become commonplace.


The advent of robust text-to-speech systems such as Voicebox, combined with automated trading, could help bridge a gap for would-be cryptocurrency traders who rely on TTS systems that currently may struggle with crypto jargon or multilingual support.



from https://ift.tt/aXlwKd4
https://ift.tt/NCRqh3r
