- NATURAL 20
GPT-5, Gemini Beat Humans in Prediction Markets
PLUS: How Much Energy Does AI Use? Google Shares the Numbers, ByteDance + DeepSeek Power Tesla’s In-Car Assistant in China and more.

Today:
GPT-5, Gemini Beat Humans in Prediction Markets
Google Search AI Mode Goes Global with New Agent Features
Entergy Wins Approval to Power Meta’s Giant Data Center Project
How Much Energy Does AI Use? Google Shares the Numbers
ByteDance + DeepSeek Power Tesla’s In-Car Assistant in China
AI Models about to BREAK the markets
Prophet Arena, a live test of forecasting skill, shows that new language models (GPT-5, o3, and Gemini 2.5 Pro) beat human-run prediction markets on real events. Judged by the Brier score (a simple “how close were your odds?” measure, where lower is better) and by simulated $1 bets, the models post higher accuracy and profit.
Their edge hints at AI-driven “arbitrage” (earning risk-free gains) across politics, sports, and finance as reinforcement-learning loops rapidly sharpen these predictors.
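To make the two yardsticks concrete, here is a minimal sketch of a Brier score and a simulated $1 bet. The newsletter does not give the arena's exact scoring or betting rules, so the function names, the betting rule, and the sample numbers below are illustrative assumptions, not the benchmark's implementation.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared gap between forecast probabilities and 0/1 outcomes.

    0.0 is a perfect forecast; always answering 0.5 scores 0.25.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def bet_profit(model_prob: float, market_price: float, outcome: int,
               stake: float = 1.0) -> float:
    """Profit from staking `stake` dollars on the side the model thinks is cheap.

    Assumed rule (hypothetical): a YES share costs `market_price`, a NO share
    costs `1 - market_price`, and the winning share pays out 1 dollar.
    """
    if not 0.0 < market_price < 1.0:
        raise ValueError("market_price must be strictly between 0 and 1")
    if model_prob > market_price:
        # Model thinks YES is underpriced: buy YES shares.
        shares = stake / market_price
        return shares * outcome - stake
    if model_prob < market_price:
        # Model thinks NO is underpriced: buy NO shares.
        shares = stake / (1.0 - market_price)
        return shares * (1 - outcome) - stake
    return 0.0  # No perceived edge, so no bet.


# Made-up example data: three events, model odds vs. market odds.
model_probs = [0.9, 0.2, 0.7]
market_prices = [0.7, 0.4, 0.5]
outcomes = [1, 0, 1]

model_brier = brier_score(model_probs, outcomes)
market_brier = brier_score(market_prices, outcomes)
total_profit = sum(
    bet_profit(p, m, o) for p, m, o in zip(model_probs, market_prices, outcomes)
)
print(f"model Brier:  {model_brier:.3f}")
print(f"market Brier: {market_brier:.3f}")
print(f"profit on three $1 bets: {total_profit:.2f}")
```

In this toy data the model's odds sit closer to what actually happened than the market's do, so it earns both a lower Brier score and a positive return. Note that each individual bet can still lose its full stake.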
Google Search AI Mode Goes Global with New Agent Features
Google is upgrading Search’s AI Mode to act more like an assistant. It can now book restaurant tables by checking multiple sites, tailor answers to your tastes, and let you share result links so friends can join the chat. The new agentic features, powered by live web data and partner links, launch first for AI Ultra subscribers in the U.S., while AI Mode itself rolls out in English to 180 more countries, with wider language support promised soon.
Why this matters
From answers to actions – Search that actually does tasks (e.g., booking) shows AI’s shift from giving information to getting things done for people.
Personal help with privacy controls – It remembers your likes to give better suggestions while letting you turn data sharing on or off, offering a template for responsible personalization.
Global scale drives competition – Expanding to 180 countries forces other tech players to match helpful, action-oriented AI, speeding up innovation worldwide.
Entergy Wins Approval to Power Meta’s Giant Data Center Project
Meta is building Hyperion, a 4-million-square-foot data center (computer warehouse) in rural Louisiana to run its strongest AI models. The site will draw up to 5 gigawatts (five billion watts) of power, so state regulators let utility Entergy build three new natural-gas plants for it. These plants will feed Hyperion while Meta pushes to scale AI research. The project marks one of the largest corporate energy deals yet approved globally.
Why this matters
Compute needs ≈ power-plant scale – A single AI hub requiring 5 GW shows how advanced models now demand power on par with the output of several nuclear reactors.
Tech–energy convergence – Meta funding dedicated gas plants blurs the line between cloud computing and utilities, reshaping debates on grid strain and greener power for AI.
Arms race for superintelligence – Hyper-sized, purpose-built data centers give firms like Meta a head-start in training ever-larger models, raising the competitive bar for rivals worldwide.
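For a sense of scale, the 5 GW figure can be turned into annual energy with simple arithmetic. This is a rough sketch: the 1 GW-per-reactor size and the utilization values are assumptions for illustration, not numbers from Meta or Entergy.

```python
PEAK_POWER_GW = 5.0          # from the article: Hyperion may draw up to 5 GW
HOURS_PER_YEAR = 24 * 365    # 8,760 hours, ignoring leap years
ASSUMED_REACTOR_GW = 1.0     # assumption: a large reactor supplies roughly 1 GW


def annual_energy_twh(power_gw: float, utilization: float = 1.0) -> float:
    """Energy used in a year, in terawatt-hours, at a given average utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return power_gw * HOURS_PER_YEAR * utilization / 1000.0  # GWh -> TWh


reactor_equivalents = PEAK_POWER_GW / ASSUMED_REACTOR_GW

print(f"reactor equivalents at peak: {reactor_equivalents:.0f}")
print(f"energy at full load:  {annual_energy_twh(PEAK_POWER_GW):.1f} TWh/year")
print(f"energy at 60% load:   {annual_energy_twh(PEAK_POWER_GW, 0.6):.1f} TWh/year")
```

Running flat out all year, a 5 GW site would use about 43.8 TWh. Real utilization would be lower, which is why the function takes it as a parameter.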
How Much Energy Does AI Use? Google Shares the Numbers
Google says a typical Gemini prompt has a tiny footprint: 0.24 watt-hours of energy, 0.03 g of CO₂, and about five drops of water. Its detailed accounting includes idle servers, cooling, and other hidden costs, yet still yields a footprint lower than earlier guesses. Thanks to custom TPUs, smarter algorithms, and ultra-efficient data centers, energy per prompt fell 33-fold in the past year while answers improved. Google is sharing its full method in the hope of setting a standard and speeding up greener AI.
Why this matters
Trustworthy numbers beat guesses – Publishing clear, full-system data lets researchers and policymakers size AI’s climate impact with real facts, not rough estimates.
Blueprint for leaner models – Google shows that hardware–software co-design, smart routing (Mixture-of-Experts), and quantization can cut energy 33× without hurting quality, guiding others to copy the playbook.
Push for industry transparency – By publishing its measurement method, Google raises the bar for rivals to disclose energy and water use, paving the way for common standards and informed regulation.
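The per-prompt figures are easier to judge once scaled up. Below is a small sketch that multiplies the two numbers quoted above (0.24 Wh and 0.03 g of CO₂ per prompt) across many prompts. The helper names and the one-million-prompt example are illustrative assumptions, and the per-prompt values are treated as constant, which real workloads would not guarantee.

```python
ENERGY_WH_PER_PROMPT = 0.24  # from the article
CO2_G_PER_PROMPT = 0.03      # from the article


def footprint(num_prompts: int) -> dict:
    """Total energy (kWh) and CO2 (kg) for `num_prompts` typical prompts."""
    if num_prompts < 0:
        raise ValueError("num_prompts must be non-negative")
    return {
        "energy_kwh": num_prompts * ENERGY_WH_PER_PROMPT / 1000.0,
        "co2_kg": num_prompts * CO2_G_PER_PROMPT / 1000.0,
    }


def prompts_per_kwh() -> float:
    """How many typical prompts one kilowatt-hour covers."""
    return 1000.0 / ENERGY_WH_PER_PROMPT


million = footprint(1_000_000)
print(f"1M prompts: {million['energy_kwh']:.0f} kWh, {million['co2_kg']:.0f} kg CO2")
print(f"prompts per kWh: {prompts_per_kwh():.0f}")
```

On these figures, a million prompts come to about 240 kWh and 30 kg of CO₂, and one kilowatt-hour covers a little over 4,000 prompts.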
🧠RESEARCH
DuPO is a training method that helps AI models check their own answers without needing human labels. It works by having the model redo tasks in reverse to see if its first answer makes sense. This self-checking boosts accuracy across tasks like translation and math, making models more reliable and cost-effective.
FinCDM is a new method for testing financial AI models that goes beyond simple scores. It checks what specific skills and knowledge a model has or lacks by using CPA exam-style questions. This helps spot weak areas like tax and regulation, leading to better, safer AI tools for finance.
FutureX is a live benchmark that tests how well AI agents can predict future events. It updates daily to stay current and avoids using leaked answers. By simulating real-world tasks like forecasting politics or finance, it reveals where AI struggles and helps improve agents’ reasoning and decision-making in uncertain situations.
🛠️TOP TOOLS
GeoSpy AI - AI-powered geolocation tool that analyzes images to determine where they were taken, without relying on metadata or GPS information.
Teach Anything - AI-powered educational platform that provides instant answers and explanations on a wide range of topics.
Cleanvoice AI - Podcast editing tool that leverages artificial intelligence to streamline the post-production process for audio and video content creators.
Huberman AI - Gives users easy access to the wealth of information from the Huberman Lab podcast.
HeadlinesAI - Generates compelling headlines for various content platforms.
📲SOCIAL MEDIA
during our interview, Nick Bostrom specifically gave credit to Anthropic and @elonmusk for allowing chatbots to "quit" conversations when they choose to
— Wes Roth (@WesRothMoney)
10:06 PM • Aug 21, 2025
🗞️MORE NEWS
Tesla integrated Chinese AI models from ByteDance and DeepSeek into its in-car assistant for China, replacing Grok due to local data laws. Grok remains available in the U.S., reflecting Elon Musk's split-market AI strategy.
Meta has stopped hiring in its AI division after a big spending spree. The freeze follows a major team reshuffle and rising investor worries over costs. Some hires may still happen with approval.
China is reportedly reconsidering its support for Nvidia's AI chip after comments by U.S. Commerce Secretary Howard Lutnick. The Chinese government took offense at remarks suggesting that Chinese companies were not capable of producing high-quality artificial intelligence chips.
DeepSeek launched version 3.1 of its AI, which it claims beats its earlier hit model R1 in speed and performance. It’s also a first move toward building AI agents, using China-made chips.