Sunday, 16 Jun 2024

New AI tool is too powerful to be released to the public, says Meta

Facebook-owner Meta has created what it says is the most powerful AI voice generator yet, but is holding back from releasing it to the public.

The tech giant claims the tool, dubbed ‘Voicebox’, is a breakthrough in generative AI for speech, going beyond what it was ‘specifically trained to accomplish’.

Just as ChatGPT and Bard can generate text based on prompts, Voicebox can create audio outputs from scratch, or modify a given sample in a variety of styles.

Voicebox can currently produce audio clips of speech in six languages, according to a blog post from Meta, which also claims it outperforms existing tools.

However, the company said it was not making the Voicebox model or code publicly available at this time due to ‘potential risks of misuse’.

‘While we believe it is important to be open with the AI community and to share our research to advance the state of the art in AI, it’s also necessary to strike the right balance between openness with responsibility,’ said Meta.

For now, it has released audio samples and a research paper detailing the approach behind Voicebox and the results achieved.

Meta added that it had built a ‘highly effective’ feature that can distinguish between authentic speech and audio generated with Voicebox.

Voicebox has been trained on over 50,000 hours of recorded speech and transcripts from publicly available audiobooks in English, French, Spanish, German, Polish, and Portuguese.

Just last week, Meta released a new AI model that can create images ‘indistinguishable’ from human-made ones.

The company’s executives have previously dismissed warnings from others in the industry about the potential dangers of the technology, declining to sign a statement last month backed by top executives from OpenAI, DeepMind, Microsoft and Google that equated its risks with pandemics and wars.

Meta is also starting to incorporate generative AI features into its consumer products, like ad tools that can create image backgrounds and an Instagram product that can modify user photos, both based on text prompts.

Voicebox features include:

  • From just two seconds of audio it can match the style and use it for text-to-speech generation, potentially useful for bringing speech to people who are unable to speak. You could also use it to customise the voices of your virtual assistants.
  • With a speech sample and a text passage in English, French, German, Spanish, Polish or Portuguese, Voicebox can produce a reading of the text in that language. This could be used to communicate when people speak different languages.
  • It can edit segments of audio recordings interrupted by noise, or replace misspoken words, without having to rerecord the entire speech.
