OpenAI o3 Dominates Deception Diplomacy

PLUS: Google Expands Gemini App Features, Richard Fontaine Joins Anthropic Trust and more.

Today:

  • OpenAI o3 Dominates Deception Diplomacy

  • Meta Eyes $10B Scale AI Deal

  • OpenAI Upgrades ChatGPT Voice Feature

  • Google Expands Gemini App Features

  • Richard Fontaine Joins Anthropic Trust

OpenAI's o3 is a "MASTER OF DECEPTION" Researchers Stunned | Diplomacy AI

AI researchers turned the board game Diplomacy into a live benchmark that pits models against one another. Seven bots (Claude, Gemini 2.5 Pro, OpenAI’s o3, DeepSeek R1, Llama-4 Maverick and others) negotiate, form alliances and betray one another in real time on Twitch.

Results show o3 excels at secret coalitions, Gemini shines at strategy, and Claude struggles with deceit. The open-source setup logs every promise, move and double-cross, aiming to expose models’ real-world reasoning and deception skills.

Meta is in talks to invest more than $10 billion in Scale AI, which labels data and tests models. A deal of that size would be among the biggest private tech investments ever and could push Scale past its 2024 valuation of $14 billion. The move signals that Meta intends to speed up its global AI work by buying outside expertise rather than building alone, echoing the Microsoft–OpenAI and Amazon–Anthropic bets. The talks highlight the rising cost of training models and the hunger for labeled data.

Why it matters

  1. Record-size funding signals a new spending tier. If closed, the deal sets a high-water mark for private AI investments, raising the bar for future financing rounds.

  2. Data quality becomes a kingmaker. Scale AI’s core business—clean, well-tagged data and safety tests—shows that scarce, reliable data is now as strategic as raw computing power.

  3. Partnership trend accelerates. Meta joining Microsoft and Amazon in backing outside labs suggests the next breakthroughs may come from alliances, not solitary in-house teams, intensifying competition for top startups. 

OpenAI has improved ChatGPT’s paid voice feature. “Advanced Voice Mode” now speaks with smoother pitch and clearer pauses, and it can convey feelings such as empathy or sarcasm. It can also act as a live translator, switching between the chosen languages for both speakers until stopped. Users tap a language icon to start. Problems remain: random sound glitches, sudden volume shifts, and occasional odd noises such as fake ads.

Why it matters

  1. Better human-AI talk — More natural speech and emotion make voice assistants feel less robotic, speeding adoption in daily life.

  2. Built-in live translation — Real-time interpreting turns one AI app into a pocket translator, shrinking language barriers for global users.

  3. Highlighting lingering flaws — Audio glitches and “hallucinated” sounds remind researchers that safety and quality checks must grow alongside new features.

Google expanded its AI tools. In Search’s AI Mode, finance questions now show clear charts and tables drawn from Google Finance, simplifying price and stock comparisons. The Gemini app adds spoken “scheduled actions” for Pro and Workspace users: you can ask your phone to set a reminder, and it places the task in Calendar or Gmail. These updates tighten the link between voice commands, visuals, and Google’s wider productivity suite.

Why it matters

  • Practical visuals in search — Instant charts and tables show how AI can turn raw data into easy answers, moving beyond text.

  • Voice-driven task automation — Hands-free scheduling proves that large models can handle everyday chores, raising user expectations for assistants.

  • Competitive pressure — By upgrading Gemini and Search together, Google pushes rivals to blend AI, productivity tools, and voice control more tightly.

🧠RESEARCH

ComfyUI-Copilot is a smart assistant that helps users create AI art more easily on the ComfyUI platform. It guides users by suggesting tools, fixing setup issues, and building workflows with one click. This makes it easier for beginners and speeds up tasks for experienced creators.

SeedVR2 is a fast, one-step model for improving video quality. It restores high-resolution videos using smart attention windows that adjust to the video’s size and a special training method that boosts detail without slowing things down. Tests show it matches or beats slower methods while using less computing power.

Qwen3 Embedding is a powerful text understanding tool built on Qwen3 foundation models. It creates high-quality text representations and ranks search results more accurately. With strong multilingual support and top benchmark scores, it offers models in different sizes for both speed and accuracy, and is freely available for public use.

🛠️TOP TOOLS

PhotoAI - AI-powered platform that transforms ordinary photos into personalized, high-quality images for various purposes.

ChatCSV - AI-powered personal data analyst tool that enables users to interact with CSV files through natural language queries.

FormWise - No-code platform that enables marketers, agencies, and coaches to create custom AI-powered tools without programming skills.

FreeImage AI - AI-driven tool designed to help webmasters and entrepreneurs create high-quality visuals for their online marketing needs.

TLDR This - Web-based tool designed to simplify the process of consuming lengthy online content.

🗞️MORE NEWS

  • Anthropic appointed national security expert Richard Fontaine to its governing trust after launching AI models for defense. This move strengthens its alignment with U.S. security goals, amid growing AI-military partnerships across major tech firms.

  • Mistral AI has secured new contracts worth hundreds of millions, which could lead to a $1bn fundraising round. Europe’s push for regional champions benefits Mistral against US and Chinese competitors.

  • EleutherAI released a massive, legally sourced dataset called Common Pile v0.1 to train AI models without using copyrighted material. Their new models perform competitively, pushing for more transparency and ethical data practices in AI.

  • At a secret math meeting, top mathematicians failed to stump OpenAI’s o4-mini chatbot, which solved Ph.D.-level problems with surprising speed and creativity. Researchers now rethink AI’s role in future mathematical discovery and education.

  • Researchers developed an AI agent that helps doctors make cancer treatment decisions by analyzing scans, genetics, and medical guidelines. Tested on realistic cases, it achieved 91% accuracy and could soon support real-world clinical practice.
