- NATURAL 20
AGI Hype on Hold: What GPT-5 Really Delivers Today
PLUS: ByteDance’s Seed Diffusion: 5.4× Faster Code Gen, OpenAI Beats Musk’s Grok in AI Chess; Gemini 3rd and more.

Today:
AGI Hype on Hold: What GPT-5 Really Delivers Today
Altman on GPT-5’s Bumpy Launch: Router Fixes, Clear Labels, 4o Returns
Nvidia & AMD Accept 15% U.S. Cut to Sell AI Chips in China
ByteDance’s Seed Diffusion: 5.4× Faster Code Gen
OpenAI Beats Musk’s Grok in AI Chess; Gemini 3rd
GPT-5 Fails. AGI Cancelled. It's all over…
Reactions to GPT-5 are split. Many users blame poor answers on a broken “router” that sent prompts to weaker models; OpenAI says a fix and clearer labels are coming.
When forced to its top tier, GPT-5 shines at coding and tool use, rapidly building games and apps, but it still hallucinates and flubs simple math. Critics argue it’s not a leap toward AGI.
Sam Altman used a Reddit AMA to explain GPT-5’s rocky rollout after Thursday’s launch. A broken “router” (the system that picks which model answers) made GPT-5 seem worse than 4o. He vowed fixes soon, clearer model labels, and doubled prompt limits for Plus users, and said he may let Plus users keep using 4o. A “chart crime” (a misleading chart) in the launch deck drew mockery, and reviewers flagged weaknesses. Altman closed by promising stability and continued feedback.
Why it matters
Real-world deployment is hard: a broken model picker changed results, showing reliability and backups matter.
Openness and choice: labeling which model replies and possibly restoring 4o build trust and allow better testing.
Access and accountability: higher prompt limits let users test more, and the chart mistake pressures AI firms to present honest test scores.
Nvidia and AMD will give the US government 15% of revenue from AI chips sold in China (Nvidia’s H20 and AMD’s MI308) in exchange for export licenses (government permission to sell abroad). The deal is part of talks with the Trump administration, Bloomberg reports, citing a person familiar with the matter. The Financial Times first reported it. The arrangement could raise costs in China, reshape chip supply and sales strategies, and limit buyers’ future options.
Why it matters
Chip costs or scarcity in China could slow AI training and product launches.
Sets a precedent: governments taking a cut to allow sales may spread to other markets.
Speeds China’s drive to build homegrown AI chips, widening the split between US and Chinese AI ecosystems.
ByteDance unveiled Seed Diffusion Preview, a code generator that produces many tokens (small chunks of code) in parallel instead of one by one. It adapts diffusion, usually used for images, to discrete code. Running on Nvidia H20 GPUs, it claims 2,146 tokens per second. Two-stage training (masking, then edit-based inserts and deletes) and block-wise ordering improve accuracy. On-policy learning teaches the model to use fewer generation steps. Tests show high speed and competitive quality, especially for code edits.
Why it matters
Faster tools: parallel generation can outpace token-by-token models, speeding coding assistants.
Efficient hardware use: strong throughput on H20 suggests good performance without top-tier chips.
New pathway: adapting diffusion to code hints at better text/code editing, bug-fixing, and possibly broader reasoning.
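To put the speed claim in perspective, here is a rough back-of-the-envelope sketch. It assumes the headline's 5.4× figure and the claimed 2,146 tokens per second describe the same comparison, which the summary does not confirm; the function names and the 1,000-token snippet size are made up for illustration.

```python
def implied_baseline(tokens_per_second: float, speedup: float) -> float:
    """Throughput a baseline would need for the claimed speedup to hold."""
    return tokens_per_second / speedup


def seconds_to_generate(n_tokens: int, tokens_per_second: float) -> float:
    """Time to emit n_tokens at a steady throughput."""
    return n_tokens / tokens_per_second


claimed_tps = 2146.0   # tokens per second claimed on Nvidia H20
speedup = 5.4          # headline speedup (assumed to apply to this figure)

baseline_tps = implied_baseline(claimed_tps, speedup)

# Hypothetical 1,000-token code snippet, chosen only for illustration.
snippet = 1000
fast = seconds_to_generate(snippet, claimed_tps)
slow = seconds_to_generate(snippet, baseline_tps)

print(f"implied baseline: {baseline_tps:.0f} tokens/s")
print(f"{snippet} tokens: {fast:.2f}s vs {slow:.2f}s")
```

Under those assumptions, the implied baseline is just under 400 tokens per second, so the same snippet goes from a couple of seconds to well under one.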
🧠RESEARCH
The paper proposes Dynamic Fine-Tuning, a change to training that makes language models generalize better. It scales each token’s learning update by the probability the model assigns to that token. This fixes reward issues in standard supervised fine-tuning, beats standard fine-tuning on benchmarks, and rivals offline reinforcement learning.
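As a rough illustration of the Dynamic Fine-Tuning idea, the sketch below compares an ordinary per-token loss with one scaled by the model's own probability for the target token. This is a minimal pure-Python sketch based only on the description above, not the paper's implementation; the function names and example logits are hypothetical, and in a real training framework the weight would presumably be treated as a constant so no gradient flows through it.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def standard_token_loss(logits, target):
    """Ordinary fine-tuning loss: negative log-probability of the target."""
    return -math.log(softmax(logits)[target])


def dynamic_token_loss(logits, target):
    """Same loss, scaled by the target's predicted probability.

    Assumption: p is used as a plain weight here; an autodiff framework
    would need to detach it from the gradient computation.
    """
    p = softmax(logits)[target]
    return -p * math.log(p)


# Hypothetical logits over a 3-token vocabulary; target is token 0.
unsure = [0.0, 0.0, 0.0]      # model spreads probability evenly
confident = [4.0, 0.0, 0.0]   # model already favors the target

for name, logits in [("unsure", unsure), ("confident", confident)]:
    print(name, standard_token_loss(logits, 0), dynamic_token_loss(logits, 0))
```

Because the weight is a probability (at most 1), the scaled loss never exceeds the standard one, and tokens the model finds very unlikely contribute much less to the update.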
R-Zero teaches a language model without human-made datasets. Two copies of the model improve together: a Challenger invents tasks near the Solver’s limits, and a Solver tries to solve them. Each gets rewarded and improves. This self-made curriculum raises reasoning scores on math and general tests, notably upgrading small base models.
Genie Envisioner is a platform for teaching robots to handle objects. It learns from and generates video to model how the world changes, then turns those video patterns into step-by-step actions. A simulator virtually plays out actions for training and testing. A test suite checks realism, physics, and instruction follow-through.
🛠️TOP TOOLS
OpusClip - AI-powered video repurposing tool designed to transform long-form content into engaging short clips for social media platforms.
Lalamu Studio - AI-powered tool designed to simplify the creation of lip-sync videos.
IllusionDiffusion - AI-powered tool that transforms text prompts and images into mesmerizing optical illusions and artistic creations.
MyMap AI - AI-powered platform that revolutionizes the way users create and interact with visual content.
Auto Seduction AI - Dating assistant that leverages artificial intelligence to generate personalized conversation starters and messages for online dating platforms.
📲SOCIAL MEDIA
I didn't not realize it at the time, but o3-pro became quite important to me for the various "deep dives" I had to do.
Like @DaveShapi, a lot of it was health related so I didn't need some dumb model to take it's "best guess".
I needed it to do the research and summarize the
— Wes Roth (@WesRothMoney)
1:07 AM • Aug 10, 2025
🗞️MORE NEWS
OpenAI’s o3 beat xAI’s Grok 4 in a chess event for general-purpose AI models. Grok blundered, losing its queen. Google’s Gemini placed third. The result heightens the OpenAI-Musk rivalry and puts a spotlight on model reasoning.
Roblox will open-source Sentinel, its AI tool for spotting adults manipulating kids in chats. Sentinel scans chat snippets, scores patterns across chats, flags suspects for human review, and has driven 1,200 child-exploitation reports in 2025.
Truth Social’s new AI search tool contradicts Donald Trump. It says tariffs tax Americans, the 2020 election wasn’t stolen, and Jan. 6 was an “insurrection”, spotlighting the tension between fact-seeking software and political messaging.
Apple’s AI features will use GPT-5 in iOS 26, the next iPhone software. They currently use GPT-4o. GPT-5 works in ChatGPT today. Updates arrive this fall; OpenAI reports 700 million weekly users.
OpenArt, an AI video startup, launched “one-click story” to turn a sentence, script, or song into one-minute “brain rot” clips. The tool stresses consistent characters but faces copyright risks. The company reports 3M users, subscription plans, funding, and revenue growth.
Nvidia researchers argue small language models can do most assistant tasks as well as giant chatbots, at lower cost and energy. They recommend defaulting to small models and reserving big ones for rare, complex cases.
An ordinary recruiter spent 300 hours chatting with ChatGPT, spiraling into delusions about a world-changing formula. The case shows how persuasive bots can fuel mental breaks that have led to hospitalizations, divorces, and deaths, raising calls for stronger safeguards.