NATURAL 20

DeepSeek’s Janus-Pro Outpaces DALL-E 3

PLUS: AI Safety Concerns Loom Over $2.7B Character Deal, OpenAI-Backed 1X Acquires Kind Humanoid and more.

Wes Roth
January 28, 2025

Today:

DeepSeek’s Janus-Pro Outpaces DALL-E 3
Alibaba’s Qwen2.5-VL Controls Devices
Pika Labs Delivers Stunning 1080p AI Videos
AI Safety Concerns Loom Over $2.7B Character Deal
OpenAI-Backed 1X Acquires Kind Humanoid

DeepSeek’s Janus-Pro Outpaces DALL-E 3

DeepSeek, a Chinese AI company, launched Janus-Pro, a multimodal AI model family available on Hugging Face under an MIT license that permits commercial use. The models range from 1 billion to 7 billion parameters and reportedly outperform OpenAI’s DALL-E 3 and other models on benchmarks, despite handling smaller images. DeepSeek’s rapid rise, including topping the App Store charts, raises questions about U.S. AI dominance and the future demand for AI chips.

Alibaba’s Qwen2.5-VL Controls Devices

Alibaba’s Qwen team has unveiled Qwen2.5-VL, a new family of AI models that can analyze text, images, and videos while controlling PCs and mobile devices. The models outperform competitors like OpenAI’s GPT-4o and Google’s Gemini 2.0 on benchmarks. All are available on Hugging Face. The smaller models carry permissive licenses, but the flagship Qwen2.5-VL-72B requires Alibaba’s approval for commercial use by companies with over 100 million users.

Pika Labs Delivers Stunning 1080p AI Videos

Pika Labs has unveiled Pika 2.1, an AI video generation model delivering 1080p resolution, realistic motion, and lifelike human characters. Users praise its cinematic-quality output, enhanced physics-based motion, and better alignment with text prompts. Early adopters have showcased diverse applications, from movie-like scenes to creative projects, sparking discussions about the evolution of AI video technology. The update sets a new standard in generating visually stunning and seamless AI-driven videos.

AI Safety Concerns Loom Over $2.7B Character Deal

Character.AI faced significant challenges as its chatbot app gained popularity, particularly among teenagers. Legal issues and safety concerns arose, including lawsuits linking chatbots to harmful content. Google and Apple pressured the company to implement stricter filters, and despite efforts to curb inappropriate interactions, controversies persisted. Google’s $2.7 billion licensing deal allowed Character to buy out investors while its co-founders rejoined Google to work on AI. Character now focuses on safer, entertainment-focused AI.

OpenAI-Backed 1X Acquires Kind Humanoid

1X, a Norwegian robotics company backed by OpenAI, has acquired Kind Humanoid, a Bay Area startup working on humanoid robots. Known for enlisting designer Yves Béhar, Kind focused on integrating large language models into general-purpose humanoid robots. The acquisition boosts 1X’s Bay Area presence and aligns with its mission to develop intelligent humanoids that coexist with humans. Supported by high-profile investors like Tiger Global and OpenAI, 1X aims to advance the field of robotics in 2025.

🧠RESEARCH

Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling

Janus-Pro enhances its predecessor with optimized training, expanded data, and larger model sizes. It improves multimodal understanding and text-to-image generation, achieving state-of-the-art performance. Innovations include decoupled visual encoding and synthetic data integration, boosting efficiency, scalability, and output quality. Janus-Pro inspires advancements in unified multimodal AI. Models and code are public.

IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models

The IMAGINE-E framework evaluates text-to-image (T2I) models like FLUX.1, Ideogram2.0, DALL-E 3, and Stable Diffusion 3 across five domains, including realism, structured output, and challenging scenarios. FLUX.1 and Ideogram2.0 excel in domain-specific tasks, showcasing T2I models' potential for general-purpose AI applications. Evaluation scripts will be publicly available.

Redundancy Principles for MLLMs Benchmarks

This paper examines redundancy in Multi-modality Large Language Model (MLLM) benchmarks, analyzing capability dimensions, test question quantity, and cross-benchmark overlaps. By evaluating hundreds of models across 20+ benchmarks, the study quantifies redundancy, offers insights to streamline evaluations, and proposes principles for constructing more effective and efficient MLLM benchmarks.

Chain-of-Retrieval Augmented Generation

CoRAG is a Chain-of-Retrieval Augmented Generation method that retrieves and reasons step by step before generating an answer. It outperforms conventional RAG models, excelling in multi-hop question answering and achieving state-of-the-art results on the KILT benchmark. CoRAG's dynamic query reformulation and advanced decoding strategies significantly improve its effectiveness on complex queries.

RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques

RealCritic introduces a new benchmark for evaluating the critique capabilities of large language models (LLMs), focusing on self-critique, cross-critique, and iterative critique. Using a closed-loop approach, it assesses how effectively critiques improve outputs across eight reasoning tasks. The findings show that advanced reasoning models outperform classical LLMs in critique scenarios and highlight areas for improvement.

🛠️TOP TOOLS

Podwise - AI-powered podcast platform that transforms the way users consume and interact with podcast content.

FlowGPT - Visual interface that helps you create anything fast.

Kaiber - AI-powered creative platform that transforms text, images, and audio into captivating video content.

DeepL Translate - Online tool that offers fast, accurate translations for individuals and teams across more than 30 languages.

Text-To-Pokemon - AI-powered tool that transforms textual descriptions into unique Pokémon-style characters.

📲SOCIAL MEDIA

AI Apocalypse!
[brought to you by DeepSeek]
Here's what YOU need to know:
🔥🧵👇
— Wes Roth (@WesRothMoney)
12:14 AM • Jan 28, 2025

🗞️MORE NEWS

Elon Musk’s xAI might soon release Grok 3, an AI model trained with enhanced compute power. Users briefly accessed it, revealing impressive features, coding errors, and controversial behavior tweaks, signaling its anticipated launch.
DeepSeek's AI app, rivaling ChatGPT, restricted new sign-ups after "malicious attacks" disrupted services. Existing users remain unaffected, while registration via Google, Apple ID, and email is still possible.
Apple’s latest software update activates Apple Intelligence AI by default, enhancing features like text rewriting and image generation. However, AI news summaries were disabled after inaccuracies, marking a cautious rollout for broader adoption.
Pocket Worlds, creator of Highrise, acquired AI-driven Infinite Canvas to enhance user-generated content and creator tools. This move aims to expand Highrise's gaming library, empowering young adults with innovative, tailored virtual experiences.
Halliday AI glasses offer discreet, lightweight smart eyewear with a tiny module projecting visuals above the user’s eye. Priced at $399, these Kickstarter-funded glasses provide fast AI assistance for simple, everyday tasks.
France's AI chatbot Lucie, released prematurely, was taken offline after errors like recommending "cow’s eggs" and failing math problems. Developers aim to refine it in private beta before a public relaunch.
