OpenAI Launches Sora AI

PLUS: Jack Ma Talks AI Future and more.

Today:

  • OpenAI Launches Sora AI

  • Google Unveils Willow Quantum Processor

  • Nvidia Faces Chinese Antitrust Investigation

  • Jack Ma Talks AI Future

OpenAI has launched Sora, an AI model that creates realistic videos from text. The latest version, Sora Turbo, is faster and allows users to generate videos up to 1080p resolution. It features a new interface, including tools for precise input and community-driven content feeds. Available to ChatGPT Plus users, it offers 50 monthly videos at 480p or fewer at 720p. 

A Pro plan provides higher usage limits and quality. OpenAI emphasizes safety by blocking harmful content and using metadata and watermarks for transparency. Sora's release aims to foster creativity while addressing potential misuse.

Google's Quantum AI team has introduced Willow, a quantum processor whose error rates fall exponentially as more qubits are added. This achievement in quantum error correction marks a significant milestone in making quantum computing more reliable. Willow relies on the surface code, an error-correction scheme in which many physical qubits are arranged in a lattice that together encodes a single, more robust logical qubit.

The key discovery is that as the lattice size increases, the error rate of the encoded qubit decreases exponentially, achieving error suppression as theorized for nearly 30 years. This breakthrough is essential for scaling up quantum computers for practical applications in fields like chemistry, drug discovery, and cryptography.
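The scaling described above can be sketched in a few lines of Python. This is a minimal illustration, not Google's analysis code: it assumes the standard surface-code relation in which the logical error rate shrinks by a constant factor (here called `suppression_factor`) each time the code distance grows by 2. The function names, the prefactor, and the example factor of 2.0 are illustrative assumptions, not figures reported in this newsletter.

```python
def logical_error_rate(distance: int, suppression_factor: float,
                       prefactor: float = 0.04) -> float:
    """Logical error rate per cycle for an odd code distance.

    Assumed model: rate = prefactor / suppression_factor ** ((distance + 1) / 2).
    Both prefactor and suppression_factor are illustrative inputs.
    """
    if distance < 3 or distance % 2 == 0:
        raise ValueError("distance must be an odd integer >= 3")
    if suppression_factor <= 0:
        raise ValueError("suppression_factor must be positive")
    return prefactor / suppression_factor ** ((distance + 1) / 2)


def physical_qubits(distance: int) -> int:
    """Qubit count of a distance-d surface code patch:
    d*d data qubits plus d*d - 1 measurement qubits."""
    return 2 * distance * distance - 1


if __name__ == "__main__":
    # Each step up in lattice size divides the error rate by the same factor.
    for d in (3, 5, 7):
        rate = logical_error_rate(d, suppression_factor=2.0)
        print(f"distance {d}: {physical_qubits(d)} qubits, error rate {rate:.4f}")
```

The point of the sketch is the trade-off: the qubit count grows only quadratically with lattice size while the error rate falls exponentially, provided the suppression factor stays above 1.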

China has launched an antitrust investigation into Nvidia, focusing on allegations that the chipmaker violated conditions set during its 2020 acquisition of Israeli network company Mellanox for $6.9 billion. Specifically, Chinese regulators claim Nvidia failed to provide new Mellanox product information to local competitors within the required 90-day window, potentially stifling competition. 

The investigation comes as Nvidia’s market value has soared on growing demand for AI chips. In parallel, the U.S. Justice Department is scrutinizing Nvidia over possible monopolistic practices, amid escalating tensions between the U.S. and China over advanced technology and trade restrictions.

Jack Ma, the founder of Alibaba Group, made a rare public appearance at the 20th anniversary of Ant Group Co., where he emphasized the company’s push for AI-driven innovation. This marks a significant moment for Ma, as it comes after a government crackdown in 2020 that led to the scrapping of Ant Group's planned initial public offering (IPO), which would have been the largest in history. 

During the event, Ma rallied employees and urged them to focus on AI development, signaling his ongoing commitment to shaping the future of technology despite the regulatory hurdles Ant Group and Alibaba have faced in recent years.

🧠RESEARCH

InternVL 2.5 is an open-source multimodal model that improves on its predecessor through better training and testing strategies and higher-quality data. It outperforms previous versions and rivals commercial models like GPT-4o and Claude-3.5-Sonnet. Notably, it is the first open-source multimodal model to surpass 70% on the MMMU benchmark.

LiFT introduces a method for improving text-to-video (T2V) models by using human feedback. It builds a dataset of about 10k human annotations and uses it to train a reward model, which then guides the alignment of generated videos with human preferences. Applied to CogVideoX-2B, the approach yields a fine-tuned model that outperforms the larger CogVideoX-5B on all metrics.
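One common way a reward model can steer fine-tuning is reward weighting, where samples the reward model scores highly contribute more to the training loss. The sketch below is a generic illustration of that idea under stated assumptions, not the LiFT authors' implementation: `reward_weighted_loss` is a hypothetical helper, and the per-sample losses and rewards are made-up numbers.

```python
from typing import Sequence


def reward_weighted_loss(per_sample_losses: Sequence[float],
                         rewards: Sequence[float]) -> float:
    """Average per-sample losses, weighting each by its normalized reward.

    Assumption: rewards are non-negative scores from a reward model, so
    higher-scored samples pull harder on the model during fine-tuning.
    """
    if not per_sample_losses or len(per_sample_losses) != len(rewards):
        raise ValueError("need equal-length, non-empty inputs")
    if any(r < 0 for r in rewards):
        raise ValueError("rewards must be non-negative")
    total_reward = sum(rewards)
    if total_reward <= 0:
        raise ValueError("at least one reward must be positive")
    weights = [r / total_reward for r in rewards]
    return sum(w * loss for w, loss in zip(weights, per_sample_losses))


if __name__ == "__main__":
    losses = [1.0, 3.0]   # hypothetical per-video training losses
    rewards = [3.0, 1.0]  # hypothetical reward-model scores
    print(reward_weighted_loss(losses, rewards))
```

With equal rewards this reduces to a plain average; unequal rewards shift the loss toward the samples that human raters, via the reward model, preferred.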

MAmmoTH-VL introduces a scalable method for creating large multimodal instruction-tuning datasets that include detailed rationales, designed to improve reasoning in open-source models. Trained on 12M instruction-response pairs, the model shows significant gains on benchmarks like MathVerse, MMMU-Pro, and MuirBench, improving on previous results by up to 13.3%.


GenMAC introduces a multi-agent framework for compositional text-to-video generation, addressing challenges like object interactions and temporal dynamics. It uses a collaborative, iterative process with specialized agents for verification, suggestion, correction, and output structuring. This approach improves video quality and achieves state-of-the-art results in complex text-to-video tasks.

MinT introduces a method for generating multi-event videos with precise temporal control, addressing limitations of current models that struggle with event sequencing. By binding each event to a specific time and using time-aware positional encoding (ReRoPE), MinT produces coherent, smoothly connected videos, outperforming existing models in generating temporally grounded sequences.

🛠️TOP TOOLS

IMGCreator.ai - ImgCreator.ai is an AI-driven platform designed to transform text descriptions into visually captivating images, making it an ideal tool for creating illustrations, anime, concept art, and photorealistic visuals.

Namelix - Namelix is an AI-powered tool designed to simplify the process of generating creative, brandable business names.

ElevenLabs - ElevenLabs is a cutting-edge AI platform that offers text-to-speech and voice generation services, allowing users to create lifelike audio content in multiple languages and accents.

Vidyo.ai - Vidyo.ai is an AI-powered video repurposing platform that transforms long-form content into short, engaging clips optimized for social media.

Skipit.ai - Skipit.ai is an AI-powered content summarization tool designed to streamline information consumption across various digital formats.

🗞️MORE NEWS

  • Alexis Conneau, a former OpenAI researcher, helped create ChatGPT’s voice mode before leaving to co-found WaveForms. His new startup aims to enhance AI's emotional intelligence, focusing on making AI interactions feel more human. Conneau’s previous work on ChatGPT gained attention, especially after one of its voices drew criticism for resembling Scarlett Johansson’s.

  • Reddit is testing a new AI-powered feature, Reddit Answers, which helps users find relevant information from the platform’s posts. The tool curates summaries of conversations and links to related content, competing with AI-driven search products like ChatGPT search and Perplexity. It is initially available in the U.S. and in English, with a broader rollout planned.

  • Amazon has launched a new AI research lab in San Francisco, the Amazon AGI SF Lab, focused on developing AI agents that can perform real-world tasks and complex workflows. Led by Adept co-founder David Luan, the lab will build on Amazon’s broader AI initiatives, with a focus on agents that learn from human feedback and self-correct.

  • Itch.io, an indie game marketplace, went offline after a false phishing report generated by BrandShield, AI-powered brand-protection software used by Funko. The site’s domain was mistakenly disabled, causing a disruption that lasted several hours. BrandShield’s CEO said the company had only requested the takedown of a specific URL, not the entire domain. Itch.io is now back online.

  • Scholarly publishers are licensing their content to technology companies for training AI models, generating millions in revenue. Deals involving publishers such as Wiley and technology firms such as Microsoft highlight this growing trend. Publishers are weighing the impact on revenue and copyright while seeking royalties for authors and strict usage terms.
