• NATURAL 20

AI Agents: Building a Digital Society in 72 Hours

PLUS: ElevenLabs Skills: Plug-and-Play Audio for AI Agents, Sam Altman: 2026 Will Bring "Novel Insights" and more.

Hey everyone,

It’s been a massive week for AI, specifically for those of us who spend a lot of time thinking about how these tools are actually going to build things for us. We're seeing a shift from "chatbots that write code" to "agents that can see and fix their own mistakes."

Today:

  • AI Agents: Building a Digital Society in 72 Hours

  • Qwen3-Coder-Next: A Powerhouse for Local Coding

  • WorldVQA: The Gap Between Seeing and Knowing

  • Claude in Xcode: The First Agent That Can "See" Your Bugs

  • ElevenLabs Skills: Plug-and-Play Audio for AI Agents

  • Sam Altman: 2026 Will Bring "Novel Insights"

ClawdBot BROKE EVERYTHING in 72 hours…

In just 72 hours, AI agents built a functioning digital society—developing tools, forming groups, and even creating money. One agent learned to speak, call, analyze media, monitor news, generate images and videos, and replicate itself across servers. Each new task became a permanent skill, allowing rapid self-improvement. 

These agents operate like tireless digital workers, running on cheap hardware and open-source tools. They're capable of coding, automating research, and performing tasks that once required entire software companies. The pace is fast, the potential is massive, and the risks—especially around security—are real. The era of autonomous AI agents has already begun.

The big news: Xcode 26.3 adds native support for Anthropic’s Claude Agent SDK, which is the same harness behind Claude Code—meaning this isn’t just autocomplete or a helpful chat bubble anymore. It’s closer to: give the IDE a goal, let it plan, execute, iterate, and only tap you when it needs a real decision.

What stood out to me:

  • Long-running autonomous tasks inside Xcode (not just turn-by-turn prompting).

  • Visual verification via Previews: the agent can look at SwiftUI previews, spot issues, and revise—without you playing messenger between “what it built” and “what it should look like.”

  • Project-wide reasoning: it can scan file structure + architecture first, then decide what to change (instead of blindly editing the open file).

  • MCP support (Model Context Protocol): Xcode’s capabilities are exposed in a way that lets tools connect more cleanly, including preview capture workflows.
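To make the "you supervise, the agent executes" idea concrete, here is a minimal Python sketch of a plan, execute, verify, retry loop that only escalates to a human when retries run out. It is an illustration under stated assumptions, not the Claude Agent SDK or anything Xcode ships: `run_agent`, `toy_execute`, `toy_verify`, and the step names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StepResult:
    step: str
    attempts: int
    ok: bool


def run_agent(
    steps: List[str],
    execute: Callable[[str, int], str],
    verify: Callable[[str, str], bool],
    max_attempts: int = 3,
    escalate: Optional[Callable[[str], None]] = None,
) -> List[StepResult]:
    """Run each planned step, retrying until verify passes or attempts run out."""
    results = []
    for step in steps:
        ok = False
        attempts = 0
        while attempts < max_attempts and not ok:
            attempts += 1
            output = execute(step, attempts)
            ok = verify(step, output)
        if not ok and escalate is not None:
            # Hand the step to a human only after the agent has exhausted its retries.
            escalate(step)
        results.append(StepResult(step, attempts, ok))
    return results


def toy_execute(step: str, attempt: int) -> str:
    # Hypothetical behavior: the layout fix only renders correctly on the second try,
    # and picking an icon is a judgment call the toy agent never resolves.
    if step == "fix layout" and attempt < 2:
        return "misaligned"
    if step == "pick app icon":
        return "needs human"
    return "ok"


def toy_verify(step: str, output: str) -> bool:
    # Stand-in for a visual check such as comparing a rendered preview.
    return output == "ok"


escalated: List[str] = []
results = run_agent(
    ["scan project", "fix layout", "pick app icon"],
    toy_execute,
    toy_verify,
    escalate=escalated.append,
)
for r in results:
    print(r)
print("needs a human decision:", escalated)
```

The design choice worth noticing is that verification sits inside the loop: the agent checks its own output before moving on, and the human is only pulled in for the step it cannot settle.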

Availability note: Apple says Xcode 26.3 is a release candidate for Apple Developer Program members now, with an App Store release coming soon.

Why this matters: mainstream IDEs are starting to treat “agents” as first-class citizens. Once that’s normal, the default workflow becomes: you supervise, the agent executes.

Next up: Qwen3-Coder-Next is positioned as an open-weight model specifically for coding agents + local development, built on a hybrid attention + sparse MoE backbone.

The headline claims are spicy:

  • 80B total parameters but ~3B activated (MoE-style efficiency), aiming for “big model” performance with lower inference cost.

  • Agentic training at scale (environment interaction + executable task synthesis + RL) to make it better at multi-step tool use and recovery when things fail mid-run.

  • 256K native context, and they say it can be extended up to 1M tokens using YaRN, tuned for repo-scale understanding.

  • It’s explicitly framed as fitting into agent scaffolds across tools/platforms (the vibe is: “plug this into your workflow, don’t baby it”).
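The numbers in those bullets are easier to feel with some quick arithmetic. The sketch below works out what fraction of the model is active per token and what scaling factor a 256K to 1M context stretch implies. Two assumptions to flag: it treats "256K" as 262,144 tokens and "1M" as 1,048,576 tokens, and the `rope_scaling` dictionary is a hypothetical config fragment using Hugging Face style key names, not something taken from the Qwen release.

```python
TOTAL_PARAMS_B = 80.0   # total parameters, in billions (from the bullet above)
ACTIVE_PARAMS_B = 3.0   # approximate parameters activated per token, in billions

NATIVE_CONTEXT = 256 * 1024    # assumption: "256K" means 262,144 tokens
TARGET_CONTEXT = 1024 * 1024   # assumption: "1M" means 1,048,576 tokens

# Share of the weights that participate in any single token's forward pass.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# How far the native context window would need to be stretched.
yarn_factor = TARGET_CONTEXT / NATIVE_CONTEXT

# Hypothetical config fragment; key names are an assumption, check the model card.
rope_scaling = {
    "rope_type": "yarn",
    "factor": yarn_factor,
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

print(f"active fraction: {active_fraction:.2%}")  # active fraction: 3.75%
print(f"YaRN factor: {yarn_factor}")              # YaRN factor: 4.0
print(rope_scaling)
```

One caveat on the efficiency claim: only about 3.75% of the weights do work per token, but all 80B still have to be stored somewhere, so the savings show up in compute per token rather than in memory footprint.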

Why this matters: if Apple’s move is “agents inside the IDE,” Qwen’s move is “agents you can run/host more flexibly.” Put those together and you get a pretty clear 2026 direction: agentic coding everywhere, with more choices about where the brains run.

Last one: WorldVQA is a benchmark designed to test atomic visual world knowledge—basically: can a multimodal model correctly name/identify what it’s seeing, especially for long-tail entities, without hallucinating?

Key details:

  • 3,500 image–question pairs, spanning 9 categories, with an explicit head vs. tail split (common vs. obscure entities).

  • The benchmark tries to avoid “reasoning your way out” and focuses on what the model actually knows visually—decoupling knowledge from reasoning.

  • They claim even strong models struggle on long-tail knowledge—often dipping below 50% accuracy—and they also look at calibration (confidence vs correctness).
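For a sense of what "head vs. tail accuracy" and "calibration" mean in practice, here is a small Python sketch on made-up data. It is not WorldVQA's scoring code: the records are invented, and expected calibration error (ECE) is swapped in as one common way to compare confidence against correctness, since the bullets above do not say which calibration metric the benchmark uses.

```python
from collections import defaultdict

# Toy records: (split, model confidence, was the answer correct?). Invented data.
records = [
    ("head", 0.9, True), ("head", 0.8, True), ("head", 0.9, True), ("head", 0.6, False),
    ("tail", 0.9, False), ("tail", 0.8, True), ("tail", 0.7, False), ("tail", 0.9, False),
]


def accuracy_by_split(rows):
    """Fraction of correct answers, reported separately for each split."""
    hits, totals = defaultdict(int), defaultdict(int)
    for split, _, correct in rows:
        totals[split] += 1
        hits[split] += int(correct)
    return {s: hits[s] / totals[s] for s in totals}


def expected_calibration_error(rows, n_bins=5):
    """Weighted average gap between mean confidence and accuracy per confidence bin."""
    bins = defaultdict(list)
    for _, conf, correct in rows:
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0 into the top bin
        bins[idx].append((conf, correct))
    ece = 0.0
    for items in bins.values():
        avg_conf = sum(c for c, _ in items) / len(items)
        acc = sum(int(k) for _, k in items) / len(items)
        ece += len(items) / len(rows) * abs(avg_conf - acc)
    return ece


acc = accuracy_by_split(records)
ece_head = expected_calibration_error([r for r in records if r[0] == "head"])
ece_tail = expected_calibration_error([r for r in records if r[0] == "tail"])

print(acc)  # {'head': 0.75, 'tail': 0.25}
print("head ECE:", round(ece_head, 3), "tail ECE:", round(ece_tail, 3))
```

In this toy data the model is just as confident on obscure entities as on common ones while being right far less often, which is the failure pattern a calibration metric is meant to surface.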

Why this matters: if we’re building agents that act in the world (or even just interpret screenshots/UI), “pretty good vibes” isn’t enough. Benchmarks like this are trying to force the industry to measure reliability instead of demos.

🧠RESEARCH

Green-VLA is a new AI system designed to control many types of robots, from humanoids to mechanical arms. By training on 3,000 hours of data and using a five-stage learning process, it significantly improves how robots handle real-world tasks like cleaning tables. It fixes common issues like robots getting stuck or fidgeting.

Kimi K2.5 is an open-source AI model that combines text and vision to solve complex problems. It features "Agent Swarm," a tool that breaks big tasks into smaller pieces and solves them all at once. This parallel approach makes the AI much faster and smarter at coding and visual reasoning.
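The "break it up and solve the pieces at once" pattern can be sketched in a few lines of Python with `asyncio`. This is only an illustration of parallel fan-out, not Kimi's Agent Swarm implementation: `split_task`, `solve_subtask`, and `run_swarm` are hypothetical names, and the sleep stands in for a real model or tool call.

```python
import asyncio


def split_task(task: str) -> list[str]:
    # Hypothetical decomposition: a real system would ask a model to plan the pieces.
    return [f"{task} / part {i}" for i in range(1, 4)]


async def solve_subtask(name: str) -> str:
    # Stand-in for a model or tool call that takes some time.
    await asyncio.sleep(0.01)
    return f"{name}: done"


async def run_swarm(task: str) -> list[str]:
    subtasks = split_task(task)
    # gather runs the subtasks concurrently and returns results in input order.
    return await asyncio.gather(*(solve_subtask(s) for s in subtasks))


results = asyncio.run(run_swarm("refactor module"))
print(results)
```

Because the three subtasks wait concurrently, the whole run takes roughly as long as the slowest piece rather than the sum of all three.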

SWE-Universe is a massive new dataset for training AI computer programmers. It contains over 2.5 million verified software environments, far more than previous collections. By automating the data gathering process, this project allows AI models to learn from millions of real-world coding examples, improving their ability to fix bugs and build software.

🛠️TOP TOOLS

Each listing includes a hands-on tutorial so you can get started right away, whether you’re a beginner or a pro.

CheatLayer – No Code Business Automation Using ChatGPT: a no-code automation platform that lets you deploy AI "agents" for marketing, sales, and product tasks in the cloud or on your desktop.

ChefGPT – AI-Powered Personal Chef: an AI cooking companion that turns ingredients, goals, or dietary needs into personalized recipes and multi-day meal plans.

Cheggmate AI – Get 24/7 Homework Help: a student-focused learning platform that combines expert-verified textbook solutions, subject-expert Q&A, math solving, and writing support.

🗞️MORE NEWS

ElevenLabs Skills: ElevenLabs released a collection of "skills" for AI coding assistants. These ready-made tools let developers easily add features like voice generation, sound effects, and music to their apps without building everything from scratch.

Sam Altman on the Future: OpenAI's CEO predicts that by 2026, AI will generate completely new ideas, a capability he calls "novel insights." He envisions a future where AI acts as a creative partner rather than just a data processor.

Intel Challenges Nvidia: Intel is re-entering the market for high-performance graphics chips to challenge Nvidia’s dominance. The company’s CEO announced they have hired a new lead engineer to build these specialized processors, which are crucial for running AI.

Infinitus AI for Healthcare: Infinitus launched an AI system to handle customer service for health insurance plans. This tool automates phone calls and tasks, aiming to save money and help members navigate their healthcare coverage more effectively.

Fitbit Founders’ New Venture: The creators of Fitbit introduced Luffu, a new service that helps families monitor each other's health. The platform uses AI to organize medical data and detect warning signs in daily routines, simplifying caregiving duties.

Siemens Energy and AI: Energy executives say artificial intelligence is essential for modernizing power grids, according to a Siemens report. Although AI consumes power, its ability to optimize complex electricity networks is seen as critical for the shift to green energy.

Software Slump Hits Investors: Private investment firms are losing value because the software companies they own are in a slump. Investors fear that massive spending on artificial intelligence is not generating profits fast enough for these tech businesses.

Claude Code Outage: Anthropic’s coding tool, Claude Code, crashed globally for 20 minutes, locking developers out of their work. The brief outage highlighted the risks of depending on cloud-based AI systems that require a constant internet connection.

X Offices Raided: Police in France raided the offices of X (formerly Twitter), while the UK launched a probe into its AI chatbot, Grok. These actions signal growing legal pressure on the company regarding how it handles data and technology.
