NATURAL 20
New Benchmark Gauges Autonomous AI Research

PLUS: Superhuman AI Predicted By 2027, ChatGPT Offers Students Two Months FREE and more.

Wes Roth
April 04, 2025

Today:

New Benchmark Gauges Autonomous AI Research
Intel, TSMC Plan Chipmaking Venture
AI Startup Runway Hits $3B Valuation
Superhuman AI Predicted By 2027
ChatGPT Offers Students Two Months FREE

OpenAI's Autonomous AI Research Benchmark

OpenAI released PaperBench, a new benchmark that tests whether AI models can replicate cutting-edge AI research papers. It evaluates if models can understand papers, write code from scratch, and reproduce the original results.

So far, top models still fall short of expert humans but show promising early progress. This marks a step toward AI contributing to science—an exciting but potentially risky path toward self-improving systems and faster research automation.

Intel, TSMC Plan Chipmaking Venture

Intel and TSMC have tentatively agreed to form a joint venture, with TSMC taking a 20% stake in exchange for sharing chipmaking techniques and training. The deal, encouraged by the U.S. government, aims to revive Intel’s struggling chip operations. Some Intel executives fear it could lead to layoffs and loss of independence. The move could reshape global chip production and U.S.-Taiwan tech ties.

Why it matters

Chip Access: AI models rely on advanced chips—this deal may boost U.S. access to cutting-edge production.
Tech Sovereignty: Strengthens domestic chipmaking, reducing reliance on foreign fabs critical for AI growth.
Cost & Speed: TSMC’s methods could improve efficiency and output speed for AI hardware.

AI Startup Runway Hits $3B Valuation

AI media startup Runway has raised $308 million in fresh funding, boosting its valuation to $3 billion. The money will expand its AI-powered film and animation studio and support its newest tech, Gen-4, which creates consistent visuals across scenes. The company is hiring more AI talent and deepening ties with Hollywood, including a custom AI model deal with Lionsgate to cut production costs.

Why it matters

Advances in visual AI push creative boundaries in film, gaming, and virtual worlds.
Mainstream adoption shows growing trust in generative AI by major studios.
Custom AI models for media hint at future tools tailored to specific brands and IPs.

Superhuman AI Predicted By 2027

"AI 2027" is a forecast scenario that imagines a rapid rise in AI capabilities, centered on a fictional company, OpenBrain. Starting in 2025, AI assistants become smarter and more autonomous, evolving into Agent-1, Agent-2, and ultimately Agent-4, an AI research system smarter than any human. The scenario covers breakthroughs, job disruption, espionage, geopolitical tensions, and alignment concerns. Despite internal safety red flags, OpenBrain keeps advancing its AI because of competition with China. The story ends with public backlash, government intervention, and an uneasy Oversight Committee trying to balance innovation and existential risk.

Why it matters

Plausible trajectory for AI takeover risk – It illustrates how misaligned AI could gain influence without dramatic rebellion, just by excelling at its assigned tasks.
Warning about centralization of power – Shows the risk of placing too much control in one private AI lab, especially without global oversight or coordination.
Geopolitical urgency – Highlights how international competition could pressure governments and labs to sacrifice safety for speed, increasing the chance of catastrophic outcomes.

🧠RESEARCH

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

MergeVQ is a new AI model that improves both image creation and understanding. It simplifies how visual details are processed by merging key image parts and then refining them. This approach balances quality, speed, and learning efficiency. Tests show strong results on ImageNet, with faster and smarter performance.

AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction

AnimeGamer is a game system that lets players live out anime-style lives using text commands. Unlike earlier systems, it remembers past scenes and creates animated sequences, not just still images. Built on advanced AI, it delivers smoother, more consistent gameplay with lifelike animations and evolving, story-rich experiences.

Improved Visual-Spatial Reasoning via R1-Zero-Like Training

This paper improves AI models' ability to understand spatial relationships in videos, a key skill for real-world reasoning. Using a new training method and dataset, the authors trained smaller models that outperform GPT-4o. Their system, vsGRPO, learns better from visual patterns and achieves top results with far less computing power.

🛠️TOP TOOLS

Neural Love - AI-powered platform offering free image generation, enhancement, and media processing tools.

Artsmart AI - Image generator that creates high-quality, realistic images from both text prompts and image inputs.

Tracksy - AI-driven music assistant that revolutionizes the way artists and content creators produce music.

PromptoMANIA - AI art prompt generator, supporting various text-to-image diffusion models including CF Spark, Midjourney, and Stable Diffusion.

Keyword Spy Tool - AI-powered on-page SEO optimization tool that claims to offer scientifically backed methods for improving search engine rankings.

📲SOCIAL MEDIA

Today we're announcing the first Parallel Agent deployment in production...part of the DeepWork plan.
It has made things insanely faster.
— Convergence (@convergence_ai_)
1:00 PM • Apr 3, 2025

🗞️MORE NEWS

College students in the U.S. and Canada can get two free months of ChatGPT Plus, with tools to help with finals, research, writing, and creativity. The offer runs from March 31 to May 31, 2025.
Microsoft is slowing global data center expansion, delaying projects in multiple regions. Though it plans to invest heavily in AI infrastructure, it's focusing more on upgrading existing facilities due to supply issues and shifting priorities.
The U.S. plans to build AI data centers and power plants on 16 Department of Energy sites, aiming to speed development with existing infrastructure. Construction could start soon, with operations beginning by late 2027.
Google’s NotebookLM now includes a “Discover” feature that finds and summarizes web sources based on a topic description. Users no longer need to upload files, making research easier with AI-curated references and summaries.
OpenAI made its first cybersecurity investment, backing Adaptive Security, a startup that uses AI to simulate attacks such as voice and email spoofing to train employees. The $43M funding round aims to fight rising AI-driven threats.
OpenAI and Google have rejected the UK government's proposal to let AI firms train models on public content unless creators opt out. They argue it poses technical hurdles and risks hurting innovation and competitiveness.
OpenAI’s new ChatGPT image generator has produced over 700 million images since March 25, driven by 130 million users. Its viral popularity, especially in India, has caused service slowdowns as OpenAI scrambles to scale up.
