NATURAL 20

OpenAI Unveils New Voice Feature

PLUS: Meta's AI Segments Anything, Midjourney's Major 6.1 Upgrade and more.

Wes Roth
August 02, 2024

In partnership with GrowthSchool

Learn AI Strategies worth a Million Dollars in this 3-hour AI Workshop. Join now for $0

Everyone tells you to learn AI but no one tells you where.

We have partnered with GrowthSchool to bring this ChatGPT & AI Workshop to our readers. It is usually $199, but it is free for you as a loyal reader 🎁

This workshop has been taken by 1 Million people across the globe, who have been able to:

Build businesses that make $10,000 using just AI tools
Make quicker & smarter decisions using AI-led data insights
Write emails, content & more in seconds using AI
Solve complex problems, research 10x faster & save 16 hours every week

You’ll wish you knew about this FREE AI Training sooner (Btw, it’s rated at 9.8/10 ⭐)

Save your seat for $0 now! (Valid for 100 people only)

Today:

OpenAI Unveils New Voice Feature
Google's New Gemma AI Model
OpenAI Expands GPT-4o Output
Meta's AI Segments Anything
Midjourney's Major 6.1 Upgrade
Runway's Gen-3 Alpha Turbo Launch
Amazon Unveils High-Performance AI Chip

OpenAI starts rolling out its Her-like voice mode for ChatGPT

OpenAI is rolling out an advanced voice mode for ChatGPT to select ChatGPT Plus users. This new mode, demonstrated at their GPT-4o event, allows for interactive storytelling. The release was delayed to enhance safety features, including blocking certain content and copyrighted audio.

The new mode will use four preset voices created with voice actors, avoiding impersonations. After criticism that one of its voices resembled Scarlett Johansson's, OpenAI says the mode will not mimic real people. OpenAI aims to make this feature available to all ChatGPT Plus users in the fall.

THE VERGE

Smaller, Safer, More Transparent: Advancing Responsible AI with Gemma

Google has expanded its Gemma 2 family, previously released in 27B and 9B parameter sizes, with new releases emphasizing safety and accessibility. Gemma 2 2B, a smaller version, offers exceptional performance for its size and can run on a variety of devices. ShieldGemma, a set of safety classifiers, filters harmful content in AI inputs and outputs, enhancing user safety.

Gemma Scope provides transparency into the model's decision-making processes using sparse autoencoders, aiding researchers in understanding AI behavior. These tools support developers in creating safe, responsible AI applications, underscoring Google’s commitment to ethical AI development.

GOOGLE

OpenAI launches experimental GPT-4o Long Output model with 16X token capacity

OpenAI has introduced the GPT-4o Long Output model, increasing output capacity to 64,000 tokens, a 16-fold increase from the original 4,000 tokens. This model, designed for detailed and extensive outputs like code editing and writing improvement, maintains the 128,000 token context window.

Users can now receive longer responses while managing input-output trade-offs. Priced at $6 per million input tokens and $18 per million output tokens, this model aims to be accessible to developers. Currently in alpha testing with a few partners, wider availability will depend on initial feedback.
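To make the pricing concrete, here is a minimal sketch of the per-request arithmetic using the figures quoted above. The function name and the limit checks are hypothetical and written for illustration only. It also assumes that input and output share the 128,000-token context window, which is our reading of the "input-output trade-offs" and not something confirmed here.

```python
# Figures quoted in the summary above (USD per million tokens).
INPUT_PRICE_PER_MILLION = 6
OUTPUT_PRICE_PER_MILLION = 18
MAX_OUTPUT_TOKENS = 64_000
CONTEXT_WINDOW = 128_000  # assumption: input and output share this window


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the 64,000-token cap")
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        raise ValueError("input plus output exceeds the context window")
    total = (input_tokens * INPUT_PRICE_PER_MILLION
             + output_tokens * OUTPUT_PRICE_PER_MILLION)
    return total / 1_000_000


# A request that fills the window: 64,000 tokens in, 64,000 tokens out.
print(f"${request_cost(64_000, 64_000):.3f}")
```

Under these assumptions, a maximum-length response costs three times as much per token as the prompt that produced it, so long outputs dominate the bill.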

VENTUREBEAT

Our New AI Model Can Segment Anything – Even Video

Meta has introduced the Segment Anything Model 2 (SAM 2), a unified model capable of identifying and tracking objects in images and videos in real-time. SAM 2 advances the capabilities of its predecessor by segmenting objects and following them across video frames, which facilitates video editing and mixed reality experiences.

This model can be applied in various fields such as science, medicine, and autonomous vehicles. By sharing SAM 2 research openly, Meta aims to inspire further innovation within the AI community.

Midjourney drops surprise v6.1 update — now humans look more real than ever

Midjourney has released version 6.1 of its AI image generation platform, enhancing the realism of human skin and improving text legibility. Despite its incremental version number, the update brings significant improvements to how arms, legs, hands, and bodies are depicted, along with upgraded texture mapping.

The new model is 25% faster and introduces better image and texture quality through a new upscaling mode. Enhanced text rendering now accurately displays quoted words. Overall, version 6.1 offers more nuanced, detailed, and realistic outputs, with changes that make it feel like a major upgrade.

TOM'S GUIDE

Runway announces even faster, cheaper AI video model Gen-3 Alpha Turbo

Runway has launched the Gen-3 Alpha Turbo, a faster version of its AI video model, announced on the social platform X. This new model generates videos seven times faster than the previous version, creating a 10-second video in 11 seconds. It aims to provide high-quality generative AI video at a lower cost, potentially increasing usage and revenue through subscription plans.

Despite facing criticism for its data scraping methods and ongoing lawsuits over copyright violations, Runway continues to innovate, maintaining its competitive edge in the AI video generation market.

VENTUREBEAT

Amazon Unveiled the Latest AI Chip, Performance up by 50%

Amazon has introduced a new AI chip that enhances performance by 50%, aiming to compete with NVIDIA. Developed at Amazon’s chip lab in Austin, Texas, these processors are part of Amazon’s strategy to reduce reliance on expensive NVIDIA chips and lower costs for AWS AI cloud services.

Although Amazon is a newcomer in AI chips, its non-AI chip, Graviton, has seen success. The new AI chips, Trainium and Inferentia, promise 40%-50% higher performance at about half the cost of NVIDIA’s comparable models. AWS, which accounts for 20% of Amazon’s revenue, deployed thousands of these chips during Prime Day to handle increased platform activity.

TRENDFORCE

🧠RESEARCH

The Llama 3 Herd of Models

The Llama 3 paper introduces a new set of AI foundation models supporting multiple languages, coding, reasoning, and tool use. The largest model, with 405 billion parameters, rivals GPT-4's performance. Llama 3 includes safety features and can handle image, video, and speech tasks. Pre-trained versions are publicly available.

Meltemi: The first open Large Language Model for Greek

The Meltemi 7B paper introduces the first open large language model for Greek, with 7 billion parameters trained on a 40 billion token Greek corpus. Meltemi 7B and its instruction-tuned chat variant, Meltemi 7B Instruct, emphasize safety and alignment. Both models are accessible on Hugging Face under the Apache 2.0 license.

SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages

The SeaLLMs 3 paper introduces multilingual large language models designed for Southeast Asian languages, including Indonesian, Vietnamese, and Thai. Addressing the lack of language technology support in this region, SeaLLMs 3 reduces training costs while excelling in various tasks. The model prioritizes safety, reliability, and cultural considerations, demonstrating inclusive AI's potential for underserved communities.

SHIC: Shape-Image Correspondences with no Keypoint Supervision

The SHIC paper presents a method for mapping object surfaces to 3D templates without manual supervision, outperforming traditional methods. By leveraging foundational vision models like DINO and Stable Diffusion, SHIC predicts image correspondences through template renders, emulating manual annotation. This approach enhances canonical maps' quality for various objects, improving realism and accuracy.

Knesset-DictaBERT: A Hebrew Language Model for Parliamentary Proceedings

The Knesset-DictaBERT paper introduces a Hebrew language model fine-tuned on Israeli parliamentary proceedings. Based on DictaBERT, this model excels in understanding parliamentary language, showing significant improvements in perplexity and accuracy. The detailed evaluation highlights its superior performance compared to the baseline DictaBERT model.

🛠️TOP TOOLS

OpenAI Advanced Voice - Advanced Voice Mode offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions.

Midjourney 6.1 - Uses text to explore new mediums of thought and expand the imaginative powers of the human species

Gen-3 Alpha by Runway - A new frontier for fast, high-fidelity, controllable video generation

Meta AI Studio - Lets anyone create an AI character based on their interests, and lets creators build an AI extension of themselves.

Excel Dashboard AI - Convert Excel to Dashboard and Report with AI in 2 Clicks

📲SOCIAL MEDIA

Users in this alpha will receive an email with instructions and a message in their mobile app. We'll continue to add more people on a rolling basis and plan for everyone on Plus to have access in the fall. As previously mentioned, video and screen sharing capabilities will launch… x.com/i/web/status/1…
— OpenAI (@OpenAI)
6:30 PM • Jul 30, 2024

🗞️MORE NEWS

Supercharging Leonardo.Ai’s Innovation with Canva

Leonardo.Ai, a generative AI platform, joins Canva to enhance creativity tools. This collaboration aims to accelerate innovation, expand research, and integrate Leonardo.Ai features into Canva. Despite the merger, Leonardo.Ai will remain an independent platform, continuing to offer its tools and support to its community. LEONARDO AI

SoundHound AI Rolls Out Voice Assistant with Generative AI to Alfa Romeo and Citroën Vehicles Across Europe

SoundHound AI has launched its Chat AI voice assistant, integrated with ChatGPT, in Alfa Romeo and Citroën vehicles across Europe. This follows earlier launches in Peugeot, Opel, and Vauxhall. The assistant enhances in-car experiences with conversational AI, offering features like navigation, trip planning, and real-time information, significantly increasing user engagement and satisfaction. SOUNDHOUND

Vimeo Announces AI-Powered Video Translation with Authentic Voice Cloning

Vimeo has launched an AI-powered video translation solution that translates video, audio, and captions into multiple languages while replicating the original speaker's voice. This innovation significantly reduces the time and cost of traditional translation methods, helping businesses reach global audiences, onboard international employees, and conduct multilingual training efficiently. The solution supports over 50 languages and offers seamless localization for various business needs. YAHOO!

Microsoft says OpenAI is now a competitor in AI and search

Microsoft has added OpenAI to its list of competitors in its annual report, despite their long-term partnership. This follows OpenAI's announcement of a new search engine prototype, SearchGPT. Microsoft, having invested $13 billion in OpenAI, continues to collaborate with them while also positioning its own AI offerings like Copilot in Bing and Windows. CNBC

Musk denies report xAI is considering acquiring Character.AI

Elon Musk has denied reports that his AI startup, xAI, is considering acquiring Character.AI. This comes after speculation that xAI was looking to expand its capabilities by testing its Grok AI models with Character.AI. Musk clarified the situation on the social media platform X. REUTERS

TikTok is one of Microsoft’s biggest AI cloud computing customers

TikTok has been paying Microsoft nearly $20 million monthly for access to OpenAI’s models, contributing significantly to Microsoft’s cloud division revenue. However, TikTok’s parent company, ByteDance, is developing its own large language model, which might reduce its reliance on Microsoft. That effort has raised concerns, leading OpenAI to suspend ByteDance’s account for potential violations of its developer license. Despite these issues, Microsoft continues to see strong revenue growth from its Azure cloud services. THE VERGE

Apple's AI Features Rollout Will Miss Upcoming iPhone Software Overhaul

Apple's new AI features, part of its Apple Intelligence initiative, will be delayed and will not be included in the initial iOS 18 and iPadOS 18 releases scheduled for September. Instead, these AI enhancements will be rolled out through software updates by October to allow more time for bug fixes. This strategic delay aims to ensure a smoother user experience and maintain product quality. BLOOMBERG

Learn AI with us.

Let’s Build the Future Together.

Hello fellow AI-obsessed traveler,

Over the past 2 years, as we’ve grown to over 250,000 subscribers between the YouTube Channel and this newsletter, we've received an overwhelming number of requests for one specific thing.

While the newsletter helps keep you up to speed with AI news, many of you have asked for the next step: to learn how to actually apply AI in your work.

Today we’re finally announcing the solution with NATURAL 20, the community for like-minded AI learners. As a loyal newsletter reader you are getting access at the lowest price it will ever be:

JOIN NATURAL 20 AI UNIVERSITY TODAY

What you get:

* Tutorials by experts across various AI fields.

* Daily tutorials by Wes Roth about the latest use cases.

* Building Autonomous AI Agents to Automate Your Life and Business (NEW!)

* A network of the top 1% of early AI adopters.

* Access to community-only resources and software.

* And many more features rolling out soon.
