NATURAL 20

OpenAI Launches Sora AI

PLUS: Jack Ma Talks AI Future and more.

Wes Roth
December 10, 2024

In partnership with

Writer RAG tool: build production-ready RAG apps in minutes

Writer RAG Tool: build production-ready RAG apps in minutes with simple API calls.
Knowledge Graph integration for intelligent data retrieval and AI-powered interactions.
Streamlined full-stack platform eliminates complex setups for scalable, accurate AI workflows.

Learn more about our production ready RAG tooling here.

Today:

OpenAI Launches Sora AI
Google Unveils Willow Quantum Processor
Nvidia Faces Chinese Antitrust Investigation
Jack Ma Talks AI Future

OpenAI Launches Sora AI

OpenAI has launched Sora, an AI model that creates realistic videos from text. The latest version, Sora Turbo, is faster and allows users to generate videos up to 1080p resolution. It features a new interface, including tools for precise input and community-driven content feeds. Available to ChatGPT Plus users, it offers 50 monthly videos at 480p or fewer at 720p.

A Pro plan provides higher usage limits and quality. OpenAI emphasizes safety by blocking harmful content and using metadata and watermarks for transparency. Sora's release aims to foster creativity while addressing potential misuse.

Google Unveils Willow Quantum Processor

Google's Quantum AI team has introduced Willow, a quantum processor whose error rates fall exponentially as more qubits are added. This achievement in quantum error correction marks a significant milestone in making quantum computing more reliable. Willow uses the surface code, in which physical qubits are grouped into a lattice that encodes a single, better-protected logical qubit.

The key discovery is that as the lattice size increases, the error rate of the encoded qubit decreases exponentially, achieving error suppression as theorized for nearly 30 years. This breakthrough is essential for scaling up quantum computers for practical applications in fields like chemistry, drug discovery, and cryptography.
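
To make that scaling concrete, here is a minimal Python sketch. It is not Google's code, and the numbers are illustrative assumptions rather than measured values: the function name `logical_error_rate`, the starting error rate of 3e-3 at distance 3, and the suppression factor of 2.0 are all hypothetical. It assumes the standard surface-code picture in which each step up in code distance (3, 5, 7, ...) divides the logical error rate by a constant factor, and in which a distance-d patch uses 2d² - 1 physical qubits.

```python
def logical_error_rate(distance, base_error=3e-3, suppression=2.0):
    """Illustrative logical error rate per cycle for an odd code distance >= 3.

    Assumption: every step from distance d to d + 2 divides the error by
    `suppression`. Both default values are made up for illustration.
    """
    if distance < 3 or distance % 2 == 0:
        raise ValueError("distance must be an odd integer >= 3")
    steps = (distance - 3) // 2  # number of d -> d + 2 steps above distance 3
    return base_error / suppression ** steps


for d in (3, 5, 7, 9):
    # d*d data qubits plus d*d - 1 measurement qubits in one surface-code patch
    physical_qubits = 2 * d * d - 1
    print(f"d={d}: {physical_qubits} qubits, error ~ {logical_error_rate(d):.2e}")
```

The point of the sketch is the shape of the curve: the qubit count grows only quadratically with distance, while the error rate shrinks geometrically, which is why operating below the error-correction threshold matters so much for scaling.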

Nvidia Faces Chinese Antitrust Investigation

China has launched an antitrust investigation into Nvidia, focusing on allegations that the chipmaker violated conditions set during its 2020 acquisition of Israeli networking company Mellanox for $6.9 billion. Specifically, Chinese regulators claim Nvidia failed to provide new Mellanox product information to local competitors within the required 90-day window, potentially stifling competition.

This investigation comes as Nvidia’s value has soared, driven by the growing demand for AI chips. In parallel, the U.S. Justice Department is also scrutinizing Nvidia for monopolistic practices, amid escalating tensions between the U.S. and China over advanced technology and trade restrictions.

Jack Ma Talks AI Future

Jack Ma, the founder of Alibaba Group, made a rare public appearance at the 20th anniversary of Ant Group Co., where he emphasized the company’s push for AI-driven innovation. This marks a significant moment for Ma, as it comes after a government crackdown in 2020 that led to the scrapping of Ant Group's planned initial public offering (IPO), which would have been the largest in history.

During the event, Ma rallied employees and urged them to focus on AI development, signaling his ongoing commitment to shaping the future of technology despite the regulatory hurdles Ant Group and Alibaba have faced in recent years.

🧠RESEARCH

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

InternVL 2.5 is an advanced open-source multimodal model that improves performance through model scaling, higher-quality data, and test-time scaling. It outperforms previous versions and rivals commercial models like GPT-4o and Claude-3.5-Sonnet. Notably, it surpasses 70% on the MMMU benchmark, setting a new bar for open-source multimodal AI systems.

LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment

LiFT introduces a method for improving text-to-video (T2V) models by using human feedback. It creates a dataset with 10k annotations to train a reward model, which guides the alignment of generated videos with human preferences. Applied to CogVideoX-2B, the approach improves video quality enough to outperform the larger CogVideoX-5B on all metrics.

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

MAmmoTH-VL introduces a scalable method for creating large multimodal instruction-tuning datasets that include detailed rationales, designed to improve reasoning in open-source models. Trained on 12M instruction-response pairs, the resulting model shows significant improvements on benchmarks like MathVerse, MMMU-Pro, and MuirBench, with gains of up to 13.3%.

GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration

GenMAC introduces a multi-agent framework for compositional text-to-video generation, addressing challenges like object interactions and temporal dynamics. It uses a collaborative, iterative process with specialized agents for verification, suggestion, correction, and output structuring. This approach improves video quality and achieves state-of-the-art results in complex text-to-video tasks.

Mind the Time: Temporally-Controlled Multi-Event Video Generation

MinT introduces a method for generating multi-event videos with precise temporal control, addressing limitations of current models that struggle with event sequencing. By binding each event to a specific time and using time-aware positional encoding (ReRoPE), MinT produces coherent, smoothly connected videos, outperforming existing models in generating temporally grounded sequences.

🛠️TOP TOOLS

IMGCreator.ai - ImgCreator.ai is an AI-driven platform designed to transform text descriptions into visually captivating images, making it an ideal tool for creating illustrations, anime, concept art, and photorealistic visuals.

Namelix - Namelix is an AI-powered tool designed to simplify the process of generating creative, brandable business names.

ElevenLabs - ElevenLabs is a cutting-edge AI platform that offers text-to-speech and voice generation services, allowing users to create lifelike audio content in multiple languages and accents.

Vidyo.ai - Vidyo.ai is an AI-powered video repurposing platform that transforms long-form content into short, engaging clips optimized for social media.

Skipit.ai - Skipit.ai is an AI-powered content summarization tool designed to streamline information consumption across various digital formats.

📲SOCIAL MEDIA

Sora is out!
If you've had a hard time signing up, it's because I've been overloading the servers with a million requests a minute... sorry.
Here are 10 of the best clips I've been able to make so far:
(some of these I've used the recut/remix feature)
— Wes Roth (@WesRothMoney)
8:57 PM • Dec 9, 2024

🗞️MORE NEWS

Alexis Conneau, a former OpenAI researcher, helped create ChatGPT’s voice mode before leaving to co-found WaveForms. His new startup aims to enhance AI's emotional intelligence, focusing on making AI interactions feel more human. Conneau’s previous work on ChatGPT drew attention, especially after one of its voices was criticized for resembling Scarlett Johansson’s without her permission.
Reddit is testing a new AI-powered feature, Reddit Answers, which helps users find relevant information from the platform’s posts. The tool curates summaries of conversations and links to related content, competing with AI-driven search offerings from OpenAI and Perplexity. Initially available in the U.S. and in English, it will expand globally.
Amazon has launched a new AI research lab in San Francisco, the Amazon AGI SF Lab, focused on developing AI agents that can perform real-world tasks and complex workflows. Led by Adept co-founder David Luan, the lab will build on Amazon’s broader AI initiatives, with a focus on agents that learn from human feedback and self-correct.
Itch.io, an indie game marketplace, went offline due to a false phishing report triggered by BrandShield, an AI-powered brand protection service used by Funko. The site’s domain was mistakenly disabled, causing a disruption that lasted several hours. BrandShield’s CEO clarified that they only requested takedown of a specific URL, not the entire domain. Itch.io is now back online.
Scholarly publishers are licensing their content to technology companies for training AI models, generating millions in revenue. Deals involving publishers like Wiley and technology firms like Microsoft highlight this growing trend. Publishers are weighing the impact on revenue and copyright concerns, while ensuring royalties for authors and strict usage terms.
