NATURAL 20
Posts
Sam Altman Hints 'Strawberry' Model

Sam Altman Hints 'Strawberry' Model

PLUS: ByteDance Launches Jimeng AI, Leaked Nvidia Documents Reveal AI Practices and more.

Wes Roth
August 09, 2024

SUBSCRIBE | JOIN AI FORUM | LEARN AI

Today:

Sam Altman Hints 'Strawberry' Model
OpenAI Introduced Structured Outputs API
OpenAI Invests $60M in Opal
OpenAI Unveils GPT-4o
ByteDance Launches Jimeng AI
Palantir, Microsoft Collaborate on AI Services
Reddit Tests AI Search Summaries
Leaked Nvidia Documents Reveal AI Practices

One BILLION Humanoid Robots and the OpenAI "Meltdown"

Figure AI has unveiled their advanced humanoid robot, merging large language models with robotics for improved visual reasoning and interaction. This innovation, supported by OpenAI’s GPT-4, enhances robot capabilities, including speech-to-speech communication.

Meanwhile, OpenAI faces leadership changes, with key figures starting new ventures in AI and education.

WATCH THE VIDEO ON YOUTUBE

Sam Altman stokes rumors of new OpenAI foundation model ‘Strawberry’

OpenAI CEO Sam Altman's social media post about strawberries fueled speculation about a new foundation model named "Strawberry." This model, hinted to be more advanced than GPT-4, might focus on autonomous internet navigation and deep research. Early tests of a related anonymous chatbot show superior reasoning capabilities, heightening industry excitement.

Introducing Structured Outputs in the API

OpenAI introduces Structured Outputs in the API, ensuring model responses conform to developer-supplied JSON Schemas. This feature improves reliability for generating structured data, enhancing applications like data entry and dynamic UI generation. Structured Outputs are available on models supporting function calling, offering significant cost savings and higher accuracy.

OpenAI Makes a $60 Million Hardware Startup Bet

OpenAI is leading a $60 million Series B funding round for Opal, a hardware startup known for its professional-grade webcams. Despite Opal's primary focus on webcams, this investment aligns with OpenAI CEO Sam Altman's interest in developing consumer AI devices. Notable backers include YouTuber Casey Neistat and TikTok's Charli and Dixie D’Amelio.

OpenAI quietly releases GPT-4o update amid leadership turmoil

OpenAI released an upgraded GPT-4o model, reducing costs by half and enhancing performance. This update follows significant leadership changes, including co-founder John Schulman's departure and president Greg Brockman's sabbatical. The release aims to reassert OpenAI's dominance amidst fierce competition from Meta, Google, and Anthropic. Speculation grows about "Project Strawberry," a rumored next-gen AI model.

ByteDance joins OpenAI's Sora rivals with AI video app launch

ByteDance has launched Jimeng AI, an app that creates videos from text prompts, available on the Apple App Store and Android. This follows the trend of Chinese tech firms developing similar tools after OpenAI introduced Sora. Jimeng AI, part of ByteDance's Faceu Technology, offers monthly and annual subscriptions, allowing users to create numerous images and videos.

Other Chinese companies like Kuaishou and Zhipu AI have also released text-to-video models, expanding access to AI-generated video content. Jimeng AI marks ByteDance's latest effort to innovate in the rapidly evolving AI video market.

REUTERS

Palantir and Microsoft Partner to Deliver Enhanced Analytics and AI Services to Classified Networks for Critical National Security Operations

Palantir Technologies and Microsoft have announced a partnership to enhance analytics and AI services for U.S. Defense and Intelligence operations. This collaboration will integrate Microsoft’s advanced language models via Azure OpenAI Service within Palantir’s AI platforms in secure government cloud environments.

The joint solution aims to support various national security tasks, including logistics and action planning, by combining Palantir’s data integration capabilities with Microsoft’s AI models. Palantir will also adopt Azure’s OpenAI Service for classified use.

MICROSOFT

Reddit to test AI-powered search result pages

Reddit will soon test AI-generated summaries at the top of search results to help users discover new content and communities. The feature, powered by both first-party and third-party technologies, is expected to be implemented later this year. This move follows Reddit’s partnership with OpenAI and Google to integrate AI into its platform.

Reddit's AI-powered language translation feature has already expanded to German, Spanish, and Portuguese, contributing to significant user growth. For the second quarter, Reddit reported 342.3 million weekly active users and a revenue increase to $281.2 million.

TECHCRUNCH

Leaked Documents Show Nvidia Scraping ‘A Human Lifetime’ of Videos Per Day to Train AI

Leaked documents reveal that Nvidia has been scraping vast amounts of videos from platforms like YouTube and Netflix to train its AI models. Internal communications indicate that despite concerns about legality and ethics, Nvidia defended its practices as compliant with copyright laws.

The data, used for training the company’s new AI projects, including its Omniverse 3D world generator and self-driving car systems, was gathered from various sources. Nvidia’s project, codenamed Cosmos, aims to develop advanced AI technologies but has not yet been publicly released.

404 MEDIA

🧠RESEARCH

Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks

Optimus-1, a new AI agent, excels in long-horizon tasks by using a Hybrid Multimodal Memory module. This module transforms knowledge into a directed graph and compiles historical data into a multimodal experience pool. Extensive experiments show Optimus-1 significantly outperforms existing agents and achieves near human-level performance in Minecraft tasks.

EXAONE 3.0 7.8B Instruction Tuned Language Model

EXAONE 3.0, developed by LG AI Research, is a 7.8B instruction-tuned language model. It excels in Korean and performs well on general tasks and complex reasoning. Extensive evaluations show its competitive performance against similar open models. EXAONE 3.0 aims to advance Expert AI and is available on Hugging Face.

MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models

MMIU is a new benchmark designed to evaluate Large Vision-Language Models (LVLMs) on multi-image tasks. Covering 7 relationship types, 52 tasks, 77K images, and 11K questions, MMIU reveals significant challenges in multi-image understanding. Top models, like GPT-4o, scored only 55.7% accuracy. MMIU aims to drive improvements in LVLMs for better multimodal comprehension.

LLaVA-OneVision: Easy Visual Task Transfer

LLaVA-OneVision is a new large multimodal model (LMM) that excels in single-image, multi-image, and video tasks. It demonstrates strong transfer learning across these modalities, showcasing enhanced video understanding and cross-scenario capabilities. LLaVA-OneVision sets new performance standards for open LMMs by leveraging insights from the LLaVA-NeXT blog series.

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

MedTrinity-25M is a vast multimodal medical dataset featuring over 25 million images across 10 modalities and detailed annotations for 65+ diseases. It includes global textual and local region-specific annotations like bounding boxes and segmentation masks. Pretrained on this dataset, models achieve top performance on medical AI tasks, enhancing future medical AI development.

🛠️TOP TOOLS

Semantic Scholar - A free, AI-powered research tool for scientific literature

Assembly AI by Zapier - Transcribe audio and extract data

Assista - Enhances business productivity through voice or text control of multiple productivity tools

Hippo Scribe - AI-Assisted Clinical Documentation

Adobe Podcast - Create high-quality podcasts and voiceovers that sound professional with Adobe Podcast.

📲SOCIAL MEDIA

i love summer in the garden
— Sam Altman (@sama)
3:29 PM • Aug 7, 2024

🗞️MORE NEWS

UK starts probe into Amazon's AI partnership with Anthropic

The UK's competition watchdog is investigating Amazon's partnership with AI startup Anthropic, following a similar probe into Alphabet's collaboration with the same startup. The investigation aims to assess potential competition concerns. Amazon and Anthropic assert their partnership does not breach competition laws and plan to cooperate fully with regulators. REUTERS

Build, tweak, repeat

Mistral AI announced advancements in developing generative AI applications, including simplified model customization on La Plateforme and the alpha release of Agents for custom workflows. Developers can fine-tune models using various methods and integrate AI with specific domain knowledge. Additionally, the stable version of the Mistral AI client SDK, mistralai 1.0, is now available. MISTRAL

OpenAI has built a text watermarking method to detect ChatGPT-written content — company has mulled its release over the past year

OpenAI has developed a tool to detect ChatGPT-generated content using a hidden pattern in the text, but is hesitating to release it due to concerns over accuracy, potential bias against non-native English writers, and customer backlash. While effective, the tool can be easily bypassed, leading OpenAI to explore other AI detection methods. TOM'S HARDWARE

Google Unveils New Gemini-AI Powered TV Streamer

Google has unveiled the Gemini-AI powered TV Streamer, a next-generation 4K device that combines entertainment and smart home functionality. Replacing the Chromecast, it offers faster performance and acts as a hub for Google Home and Matter devices. The streamer provides personalized content suggestions, access to over 700,000 movies and shows, and ambient mode features powered by AI. THE FAST MODE

Audible is testing an AI-powered search feature

Audible, an Amazon-owned audiobook company, is testing an AI-powered search feature called Maven, starting with select U.S. customers. Maven uses natural language processing to recommend audiobooks based on specific user requests. Available on iOS and Android, Maven currently covers a subset of Audible’s catalog. Audible is also experimenting with AI-curated collections and review summaries. TECHCRUNCH

Call for Applications: Llama 3.1 Impact Grants

Meta is launching the Llama 3.1 Impact Grants, offering up to $2 million to support projects using Llama 3.1 AI to tackle social challenges. This follows last year's successful inaugural grant program. The grants focus on economic development, science, public service, and more. Applications are open until November 22, 2024, with regional events providing further support. META

Elon Musk sues OpenAI again, alleging ‘deceit of Shakespearean proportions’

Elon Musk has filed a new lawsuit against OpenAI and its CEO, Sam Altman, claiming they manipulated him into co-founding the company and later betrayed its original mission of developing AI for the betterment of humanity. The lawsuit, filed in northern California, accuses Altman and others of turning OpenAI into a for-profit enterprise, breaking their founding agreement, and engaging in fraud. This follows Musk's withdrawal of a similar lawsuit earlier this year. OpenAI denies the allegations, portraying Musk as a disgruntled former partner. THE GUARDIAN

OpenAI co-founder Schulman leaves for Anthropic, Brockman takes extended leave

John Schulman, co-founder of OpenAI, has left the company for Anthropic. OpenAI's president and co-founder, Greg Brockman, is taking an extended leave. Schulman, who played a key role in creating ChatGPT and led AI alignment research, aims to deepen his focus on AI alignment at Anthropic. OpenAI confirmed the departures and expressed gratitude for Schulman's contributions. Schulman emphasized his decision is personal and not due to lack of support from OpenAI. With Schulman's exit, only three of OpenAI's original 11 founders remain. TECHCRUNCH

Intel reportedly gave up a chance to buy a stake in OpenAI in 2017

Intel had an opportunity to invest in OpenAI in 2017 but passed on it, believing AI models wouldn't become widespread soon. Former CEO Bob Swan rejected a $1 billion deal for a 15% stake. Instead, Intel prioritized other AI acquisitions like Nervana Systems and Habana Labs. Meanwhile, Microsoft invested heavily in OpenAI, making it a significant AI player. TOM'S HARDWARE

Meta courts celebs like Awkwafina to voice AI assistants ahead of Meta Connect

Meta is reportedly in talks with celebrities like Judi Dench, Keegan-Michael Key, and Awkwafina to voice its AI assistants, aiming to debut these features at the Meta Connect conference in September. The company is negotiating with top Hollywood talent agencies, offering potentially millions of dollars for temporary contracts. The voices would be integrated across Meta’s platforms, including Facebook, Instagram, and Meta Ray-Ban smart glasses.

THE VERGE

What'd you think of today's edition?

Learn AI with us.

Let’s Build the Future Together.

Hello fellow AI-obsessed traveler,

Over the past 2 years, as we’ve grown to over 250,000 subscribers between the YouTube Channel and this newsletter, we've received an overwhelming number of requests for one specific thing.

While the newsletter helps keep you up to speed with AI news, many of you have asked for the next step: to learn how to actually apply AI in your work.

Today we’re finally announcing the solution with NATURAL 20, the community for like-minded AI learners. As a loyal newsletter reader you are getting access at the lowest price it will ever be:

JOIN NATURAL 20 AI UNIVERSITY TODAY

What you get:

* Tutorials by experts across various AI fields.

* Daily tutorials by Wes Roth about the latest use cases.

* Building Autonomous AI Agents to Automate Your Life and Business (NEW!)

* A network of the top 1% of early AI adopters.

* Access to community-only resources and software.

* And many more features rolling out soon.

Reply

or to participate.