Voicebox: AI-Driven Speech Generation
The groundbreaking AI model set to revolutionize the audio sphere, delivering high-quality speech, unrivaled audio editing, and a seamless multilingual experience.
Today:
Voicebox: The Most Versatile AI for Speech Generation
Meet Voicebox, the new AI toy for all things speech generation. It creates top-notch audio clips and can even give a quick nip and tuck to pre-recorded audio. Got a pesky car horn in the background or a yapping dog? No problem, it's got your back. Plus, it's multilingual - it can churn out speech in six languages. And here's where it gets exciting: in the future, it could be the golden voice behind your virtual assistants or non-player characters in the metaverse.
Here's a quick rundown of Voicebox's party tricks:
In-Context Text-To-Speech Synthesis: Give it a two-second audio sample and watch it whip up a storm with text-to-speech generation in the same style.
Speech Editing And Noise Reduction: It can help you give the boot to unwanted noises or misspoken words from a speech, without re-recording the whole thing. It's basically like having a magic eraser for audio editing.
Cross-Lingual Style Transfer: Give it a speech sample and a passage of text in English, French, German, Spanish, Polish, or Portuguese, and it'll read out the text in any of those languages, even if the sample speech and the text are in different languages.
Diverse Speech Sampling: Thanks to a rich diet of diverse data, Voicebox can generate speech that mimics how people talk in the real world, in six different languages.
Voicebox is a giant leap in generative AI research. Can't wait to see where this audio exploration leads us next! Curious for more? Look up Voicebox, and let it blow your mind!
Stable Diffusion gives QR codes artistic makeover
QR codes aren't just black and white squares anymore. A program called Stable Diffusion is sprucing them up - turning them into fancy artwork that'd give even Picasso a run for his money.
These new codes, ranging from styles like old-school Chinese ink brush to flashy anime, made their debut on Reddit. The user "nhciao" created these eye-catchers using a model called ControlNet. Think of ControlNet as a backstage manager - it lets an extra input image, here the QR code pattern, steer what Stable Diffusion draws alongside the text prompt.
AMD looks to Hugging Face and Meta spinoff PyTorch to take on Nvidia
Advanced Micro Devices (AMD) is leaning on the open-source AI communities - specifically, Hugging Face, the so-called "GitHub for AI", and PyTorch, the machine-learning framework that started at Meta and now sits under the Linux Foundation.
AMD kick-started this strategy at their recent developers conference, unveiling new data-center chips primed to handle AI's hefty workloads. Along with their shiny new hardware, they also debuted a duo of partnerships with Hugging Face and PyTorch, as they look to enhance their ROCm AI software stack.
Now, what's this ROCm, you ask? Think of it as AMD's open-source retort to Nvidia's tightly-guarded CUDA platform. It's sort of like a well-armed David stepping up to Goliath's highly profitable, proprietary software.
Under the hood of this partnership, AMD will tune its hardware for Hugging Face models. In plain speak, both companies will throw their engineering prowess into the pot to ensure the AI models from Hugging Face hum along smoothly on AMD hardware - no funny business.
Then there's PyTorch, a machine-learning framework originally cooked up by Meta. AMD says PyTorch will fully integrate the ROCm software stack and support AMD Instinct accelerators from day zero. AMD's aiming to court those customers flirting with a switch from Nvidia's ecosystem.
Lenovo Grows AI Infrastructure Revenue to Over US$2 Billion and Brings AI to the Data
Lenovo's stuffing big bucks into artificial intelligence (AI). They've just crossed the $2 billion mark in AI infrastructure revenue and plan to throw another $1 billion at the cause over the next three years.
With this investment, they're set to whip up a buffet of smart devices and solutions that make deploying AI easier. Basically, they're taking the AI out of the lab and putting it where the data lives, whether that's in your pocket or up in the cloud.
Lenovo's new investment includes $100 million for the Lenovo AI Innovators program, which has already dished up over 150 AI solutions. They're providing practical tools for businesses, like AI that can generate digital content, understand images and voice, and even act as a virtual assistant.
As part of this investment, Lenovo's expanding its partnership with DeepBrain AI to create AI virtual assistants for the retail and hospitality sectors. They're also collaborating with Vistry, using computer vision to help quick-service restaurants predict supply needs and reduce waste.
For the manufacturing industry, Lenovo's linking arms with Guise to reduce unplanned downtime and improve processes with predictive maintenance and anomaly detection. In the UAE, they're rolling out an AI-enabled data center in partnership with Al Hathboor Bikal.ai, which will provide AI capabilities for a range of sectors.
Google is warning its employees about AI chatbot use
Even Google is telling its own employees to button up when chatting with artificial intelligence (AI) chatbots, including its own Google Bard. It seems the rule "Loose lips sink ships" still holds water in the digital age. Alphabet, Google's parent company, basically said, "Watch what you say. Those AI bots have a memory like an elephant."
Turns out, the stuff you chat about with AI chatbots isn't just floating off into cyberspace. That's because the bots are built on large language models (just a fancy term for AI systems that understand and generate human-like text). They're learning all the time from what we tell them. And guess what? The companies behind these bots hang on to that info.
As for Google Bard, the AI chatbot, Google gathers your chats, location, feedback, and how you use it. They say it's to improve and develop their products and services. But they also hold on to a selection of those chats, reviewed by trained reviewers, for up to three years. Google advises against including info that could identify you or others. It's like going on a reality show and expecting privacy.
Microsoft’s stock hits record after executives predict $10 billion in annual A.I. revenue
Microsoft's stock went to the moon on Thursday, breaking its previous record to close at a whopping $348.10. Why? In simple words: artificial intelligence (AI).
The folks at JPMorgan Chase reckoned that Microsoft's future was pretty rosy in the world of AI, and the market agreed. This all comes on the heels of the Federal Reserve deciding to keep interest rates steady, which put everyone in a good mood and sent a nice ripple effect through the whole market.
AI is the new black, and everyone wants a piece of it. Last year, OpenAI dropped their ChatGPT chatbot. Many tech giants have scrambled to integrate this AI into their products, talking big about cost savings in these shaky economic times.
Microsoft's deep in this game, too. They not only invested heavily in OpenAI, but also provide the tech that makes the whole thing tick. Microsoft even gets the first pick of the cherry when it comes to using OpenAI's AI models.
Microsoft's been busy as a beaver, integrating these AI tools into everything from Bing to the Windows operating system. It seems everyone's curious about what kind of magic this will do for Microsoft's bottom line.