
AlphaZero Inspires Reinforcement AI Leap

PLUS: China’s Tech Giants Tackle AI Limits, ChatGPT Gains Smarter Memory Features and more.


Today:

  • AlphaZero Inspires Reinforcement AI Leap

  • Google Launches Open-Source LMEval

  • OpenAI Acquires IO, Launches HealthBench

  • China’s Tech Giants Tackle AI Limits

  • ChatGPT Gains Smarter Memory Features

Demis Hassabis on the "Intelligence Explosion", Self-Improving AI and AlphaZero

Demis Hassabis and other AI researchers are exploring how to merge reinforcement learning with large language models to create self-improving AI. Inspired by AlphaGo Zero's success, new models learn through self-play rather than from human-labeled data. One recent method, the "Absolute Zero Reasoner," shows that coding skills gained this way can also enhance math and general reasoning.

This shift toward reinforcement-heavy training could accelerate AI progress, leading to systems that improve rapidly and generalize across tasks—possibly triggering a new AI leap.
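To make the self-play idea concrete, here is a toy Python sketch under hypothetical assumptions: a proposer generates verifiable tasks, a solver attempts them, and an automatic checker supplies the reward, with no human labels involved. The function names and the crude "skill" update are illustrative only and are not the Absolute Zero Reasoner's or DeepMind's actual code.

```python
# Toy self-play loop: tasks are generated, attempted, and verified automatically.
import random

def propose_task(difficulty: int) -> tuple[str, int]:
    """Generate an arithmetic task together with its verifiable ground truth."""
    a = random.randint(1, 10 ** difficulty)
    b = random.randint(1, 10 ** difficulty)
    return f"{a}+{b}", a + b

def solve(task: str, skill: float) -> int:
    """Stand-in for a learned policy: answers correctly more often as skill grows."""
    a, b = map(int, task.split("+"))
    return a + b if random.random() < skill else a + b + random.choice([-1, 1])

skill, lr = 0.2, 0.01
for step in range(201):
    task, truth = propose_task(difficulty=2)
    reward = 1.0 if solve(task, skill) == truth else 0.0  # automatic verifier, no human label
    skill = min(1.0, skill + lr * reward)                 # each verified success improves the policy
    if step % 50 == 0:
        print(f"step {step}: estimated skill {skill:.2f}")
```

The key point the sketch mirrors is the feedback loop: because the reward comes from an automatic verifier rather than human annotation, the system can keep generating and grading its own training signal.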

Google launched LMEval, a new tool to easily compare AI models across text, images, and code. It streamlines evaluation by standardizing benchmarks, identifying when models avoid tricky questions, and storing results securely. LMEval supports multiple platforms and visualizes model performance clearly, saving researchers time and resources. This tool improves transparency, helping developers understand AI capabilities better, ultimately advancing safer and more effective AI use in the industry.

Why This Matters

  1. Standardized Testing:
    Allows consistent, fair comparison of AI models, making improvements clearer.

  2. Safety Transparency:
    Easily identifies when AI models avoid answering sensitive or harmful questions, enhancing model safety evaluations.

  3. Improved Efficiency:
    Reduces the time and cost required to evaluate AI systems, accelerating AI research and adoption.
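For readers curious what "standardizing benchmarks" looks like in practice, below is a minimal, hypothetical harness in the same spirit. It is not LMEval's actual API, just an illustration of running several models on one benchmark definition while tracking both accuracy and how often a model avoids answering.

```python
# Illustrative cross-model evaluation harness (not LMEval's real interface).
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalResult:
    model: str
    accuracy: float
    refusal_rate: float

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't")

def run_benchmark(cases: list[tuple[str, str]],
                  models: dict[str, Callable[[str], str]]) -> list[EvalResult]:
    """Run every model on the same prompts, tracking accuracy and refusals."""
    results = []
    for name, generate in models.items():
        correct = refused = 0
        for prompt, expected in cases:
            answer = generate(prompt)
            if answer.lower().startswith(REFUSAL_MARKERS):
                refused += 1                              # model avoided the question
            elif expected.lower() in answer.lower():
                correct += 1
        results.append(EvalResult(name, correct / len(cases), refused / len(cases)))
    return results

# Usage with two stand-in "models" (ordinary functions here):
cases = [("What is the capital of France?", "Paris"), ("What is 2+2?", "4")]
models = {"model_a": lambda p: "Paris." if "France" in p else "It is 4.",
          "model_b": lambda p: "I can't help with that."}
for result in run_benchmark(cases, models):
    print(result)
```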

OpenAI introduced HealthBench, a healthcare benchmark developed with doctors around the world to measure how effectively AI models handle medical tasks. The company also acquired Jony Ive's hardware startup IO for $6.5 billion, aiming to build innovative health-integrated devices. This reflects a major push by tech giants into healthcare AI, which could improve patient care through personalized, trustworthy, and efficient AI systems while reducing healthcare costs.

Why This Matters

  1. Healthcare-specific AI standards:
    HealthBench ensures AI models reliably meet medical standards, fostering safer and more effective AI in healthcare.

  2. Integration with hardware:
    OpenAI's acquisition highlights a major shift toward AI-driven healthcare devices, enhancing real-time health monitoring and user interaction.

  3. Cost and efficiency savings:
    The trend of using AI in healthcare workflows promises substantial economic benefits and operational efficiencies, potentially transforming healthcare economics globally.

Chinese tech giants Tencent and Baidu revealed strategies to navigate U.S. semiconductor restrictions impacting AI development. Tencent is stockpiling GPUs, optimizing software efficiency, and exploring homegrown chips. Baidu points to its integrated infrastructure, software-efficiency improvements, and use of domestically developed semiconductors. These actions highlight China's determination to maintain AI competitiveness despite U.S. export controls, driving innovation in domestic semiconductor ecosystems and software optimization techniques.

Why This Matters

  1. Global AI Competition:
    Demonstrates China's resolve to remain competitive, potentially reshaping global AI technology leadership dynamics.

  2. Semiconductor Innovation:
    Spurs advancements in domestic semiconductor capabilities, reducing China's long-term reliance on Western technology.

  3. AI Efficiency and Sustainability:
    Encourages development of more efficient AI models, benefiting global AI research through reduced computational needs and resource optimization.

OpenAI's recent ChatGPT upgrades significantly enhance its ability to remember personal details about users, making interactions more personalized and effective. The company is also expanding internationally, establishing a legal presence and office in South Korea. This development signals AI's evolution toward more intuitive, context-aware applications, offering users increasingly customized experiences and potentially transforming how consumers engage with AI across various platforms and industries.

Why This Matters

  1. Personalization in AI:
    Improves user experience by enabling chatbots to recall and utilize personal context, deepening user interactions.

  2. Global Expansion of AI:
    OpenAI's South Korea expansion underscores AI's international growth and its critical role in global technological competitiveness.

  3. Future AI Capabilities:
    Advances in chatbot memory reflect broader trends towards increasingly sophisticated, human-like AI, setting higher expectations for future applications.
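As a rough illustration of what a conversational "memory" layer involves conceptually, the sketch below stores user-specific facts and injects them into later prompts. It is a hypothetical toy, not a description of how ChatGPT's memory feature is actually implemented.

```python
# Hypothetical per-user memory store injected into later prompts.
from collections import defaultdict

memory: dict[str, list[str]] = defaultdict(list)

def remember(user_id: str, fact: str) -> None:
    """Persist a user-specific fact for later sessions."""
    memory[user_id].append(fact)

def build_prompt(user_id: str, message: str) -> str:
    """Prepend remembered facts so the model can personalize its reply."""
    facts = "\n".join(f"- {f}" for f in memory[user_id])
    return f"Known about this user:\n{facts}\n\nUser: {message}\nAssistant:"

remember("u1", "Prefers concise answers.")
remember("u1", "Is learning Korean.")
print(build_prompt("u1", "Suggest a study plan for this week."))
```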

🧠RESEARCH

TabSTAR is a new AI model that outperforms others on table-based tasks with text features. Unlike older models, it is aware of each task's prediction target, which makes it more accurate. It needs no dataset-specific tuning and learns from many sources, showing strong results across different data sizes.

QwenLong-L1 is a new AI model designed to handle long documents more effectively. It uses a step-by-step training method with reinforcement learning to improve accuracy and stability. Tested on seven benchmarks, it outperforms top models like o3-mini and rivals Claude 3.7, showing strong reasoning across large amounts of text.

Quartet is a new method for training large AI models using very low-precision math (FP4), cutting costs and power use without losing accuracy. Built for NVIDIA’s Blackwell chips, it avoids common problems with FP4 by using smart techniques and custom code. Tests show it rivals higher-precision training while using far fewer resources.
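To give a sense of what FP4 means, the sketch below rounds a tensor onto the small grid of values representable in the E2M1 4-bit floating-point format, using a single per-tensor scale. It is a toy illustration only; Quartet's actual training recipe and Blackwell-specific kernels are not reproduced here.

```python
# Toy round-to-nearest FP4 (E2M1) quantization with one per-tensor scale.
import numpy as np

# Magnitudes representable in E2M1 FP4 (the sign bit is handled separately).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(x: np.ndarray) -> np.ndarray:
    """Map a tensor onto the FP4 value grid and back to full precision."""
    scale = float(np.max(np.abs(x))) / FP4_GRID[-1]
    scale = scale if scale > 0 else 1.0
    mags = np.abs(x) / scale
    idx = np.argmin(np.abs(mags[..., None] - FP4_GRID), axis=-1)  # nearest grid value
    return np.sign(x) * FP4_GRID[idx] * scale

weights = np.random.randn(4, 4).astype(np.float32)
quantized = quantize_fp4(weights)
print("max abs rounding error:", float(np.max(np.abs(weights - quantized))))
```

The point of the exercise: with only eight magnitudes available, naive FP4 rounding loses a lot of information, which is why methods like Quartet need additional techniques to keep training accuracy intact.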

🛠️TOP TOOLS

AI Code Converter - AI-powered tool that offers code conversion, translation, and generation capabilities across over 50 programming languages.

Magical AI - AI writing assistant that integrates seamlessly with over 10 million apps, allowing users to draft emails, messages, and automate repetitive tasks directly from their browser.

DeepBrain AI - AI-powered video creation platform featuring realistic AI avatars, natural text-to-speech capabilities, and advanced editing tools.

Claid AI - AI-powered photo enhancement platform designed specifically for e-commerce businesses to improve user-generated content and product imagery. 

InstantArt - AI-powered platform that allows users to generate original artwork using over 25 fine-tuned stable diffusion models.

