Tencent's Hunyuan Video Transforms Creation

PLUS: Amazon and Anthropic’s AI Supercomputer, Hume AI Launches Voice Customization and more.

Today:

  • Tencent's Hunyuan Video Transforms Creation

  • Google Unveiled Veo and Imagen 3

  • Luma Image Generation Models

  • Amazon and Anthropic’s AI Supercomputer

  • Hume AI Launches Voice Customization

Tencent has introduced Hunyuan Video, an open-source text-to-video generation model with 13 billion parameters. According to Tencent's own evaluations, it outperforms models such as Runway Gen-3 and Luma in video quality. Key features include a Multimodal Large Language Model for better text-image alignment and a 3D Variational Autoencoder for efficient processing.

This release advances video generation technology, bridging the gap between open-source and proprietary solutions. By providing access to its code and model weights, Hunyuan Video empowers the community to explore and innovate in the field of AI-driven video creation.

Google Cloud has introduced two powerful generative AI models, Veo and Imagen 3, on its Vertex AI platform. Veo enables businesses to create high-quality videos from text or image prompts, while Imagen 3 generates photorealistic images with exceptional detail from text prompts. These tools streamline creative processes, reducing production time and costs. 

With built-in safety features like digital watermarking and safety filters, Google ensures responsible use of AI. Leading companies like Mondelez and WPP are already leveraging these models to enhance content creation. Veo and Imagen 3 aim to revolutionize marketing, advertising, and creative industries by enabling rapid, high-quality media generation.

Luma has introduced Photon, a family of image generation models that combine creativity, intelligence, and efficiency. According to Luma, the new architecture delivers ultra-high-quality images at significantly lower cost, up to 10 times more efficient than existing models. The models excel at understanding natural language instructions and support multi-turn workflows for ideation and editing.

Built for designers, filmmakers, and visual thinkers, Luma Photon supports consistent characters from just one image as well as iterative prompting. Available through the Luma API and the Dream Machine service, it offers creative freedom and affordable, high-quality image generation.

Amazon is building one of the world's largest AI supercomputers in collaboration with Anthropic, aiming to advance artificial intelligence. The supercomputer, called Project Rainier, will be five times more powerful than the cluster used to train Anthropic’s current models and will use Amazon’s Trainium2 chips.

This move positions Amazon to compete more strongly with companies like Microsoft and Google in generative AI. Additionally, Amazon announced new tools to help businesses manage and improve AI, including a system to verify chatbot accuracy and a method for optimizing smaller AI models. These innovations highlight Amazon’s growing influence in the AI sector.

Hume AI has introduced Voice Control, a tool that allows users to create custom AI voices without any coding skills. It offers sliders to adjust vocal traits like confidence, enthusiasm, and smoothness, making it easy for developers to design voices that suit specific needs. This tool builds on Hume’s earlier Empathic Voice Interface (EVI 2), which focused on emotional responsiveness and naturalness in voice AI. 

Voice Control aims to replace preset voices and address voice cloning risks by enabling highly personalized voice options. It's available in beta and integrates with Hume’s platform for various applications like customer service and virtual assistants.

🧠RESEARCH

X-Prompt is a new vision-language model that improves image generation by using in-context learning, similar to how language models work with text. It can handle both familiar and new tasks efficiently by processing context examples and generalizing to unseen tasks. Experiments show it performs well in diverse image generation tasks.

Open-Sora Plan is an open-source project focused on creating a large model for generating high-resolution, long-duration videos based on user inputs. It combines several advanced components and strategies for efficient training and data curation. The project achieves strong results in video generation and offers its code and model weights to the research community.

GATE OpenING introduces a benchmark for evaluating interleaved image-text generation, featuring 5,400 annotated examples across 56 real-world tasks. It assesses multimodal understanding and generation in scenarios like travel and design. Its accompanying judge model, IntJudge, reaches 82.42% agreement with human judgments, and the evaluation highlights gaps in current methods that can guide future improvements. OpenING is publicly accessible.

VISTA introduces a framework to improve video models' understanding of long-duration and high-resolution videos by synthesizing enhanced datasets. Using spatial and temporal augmentation, it generates extended videos with paired questions and answers. Finetuning on the VISTA-400K dataset boosts model performance, achieving notable gains on new benchmarks like HRVideoBench.

SOLAMI is a framework designed to enable 3D autonomous characters to interact socially with humans. It integrates vision, language, and action (VLA) to create realistic, responsive characters that understand and react to multimodal inputs. Using synthetic data and a VR interface, SOLAMI achieves more natural and accurate interactions with low latency.

🛠️TOP TOOLS

Modal - Serverless cloud for AI, ML, and data applications – built for developers

Colossyan - The AI video platform for workplace learning

Hume + Anthropic Computer Use - Create apps to control a computer with just your voice

Silatus - Write articles, speeches, job descriptions, press releases, memos, and more. 

Snappy Retro - Create your retro board in seconds, share the encrypted unique URL, and collaborate in real-time.

🗞️MORE NEWS

  • Amazon announced Nova, a family of generative AI models, at its re:Invent 2024 conference. It includes text-focused models (Micro, Lite, Pro, Premier), image and video generation models (Nova Canvas, Nova Reel), and future multimodal models for various media.

  • AKOOL and LiveX AI have partnered to enhance customer engagement with advanced conversational AI and dynamic avatar technology. This collaboration delivers human-like virtual agents capable of real-time problem-solving and empathetic interactions, improving satisfaction and loyalty.

  • NeuroAI for AI safety draws on the human brain’s mechanisms to create safer AI. By understanding neural systems, we can develop AI aligned with human values, enhancing transparency, robustness, and ethical behavior while accelerating neuroscience and neurotechnology.

  • Sakana AI introduces CycleQD, an evolutionary AI framework that uses model merging and mutation to evolve a diverse population of small, specialized models. This approach outperforms traditional methods, providing sustainable, high-performance AI across tasks.

  • MIT researchers developed a photonic chip that uses light to perform deep neural network computations. This device enhances speed and energy efficiency, achieving ultra-low latency and high accuracy, with potential for real-time AI applications in fields like telecommunications and research.

  • Skyflow has launched a new security and privacy solution for Agentic AI, enabling businesses to build trustworthy AI agents. It ensures data protection, compliance with global regulations, and safeguards sensitive data throughout the AI lifecycle, from training to execution.

Learn AI with us.

Let’s Build the Future Together.

Hello fellow AI-obsessed traveler,

Over the past 2 years, as we’ve grown to over 250,000 subscribers between the YouTube Channel and this newsletter, we've received an overwhelming number of requests for one specific thing.

While the newsletter helps keep you up to speed with AI news, many of you have asked for the next step: to learn how to actually apply AI in your work.

Today we’re finally announcing the solution with NATURAL 20, the community for like-minded AI learners. As a loyal newsletter reader you are getting access at the lowest price it will ever be:

 JOIN NATURAL 20 AI UNIVERSITY TODAY

What you get:

* Tutorials by experts across various AI fields.

* Daily tutorials by Wes Roth about the latest use cases.

* Building Autonomous AI Agents to Automate Your Life and Business (NEW!)

* A network of the top 1% of early AI adopters.

* Access to community-only resources and software.

* And many more features rolling out soon.
