New Claude Goes Marathon Mode With 30-Hour Coding Sprint

Anthropic’s surprise upgrade, Claude Sonnet 4.5, just set a new bar for AI agents. 

In an internal test, the model ran autonomously for 30 hours, producing a fully functional Slack-style chat app with 11,000 lines of code.

It’s ahead of schedule:

Details:

Benchmark results confirm the jump: Claude Sonnet 4.5 tops SWE-bench Verified for real-world GitHub fixes, smashes OpenAI’s computer-use preview with 61 percent task completion on OSWorld, and leads agentic coding, terminal automation, and tool-use leaderboards.

Two technical advances drive the surge. First, a context-management layer compresses earlier dialogue so the agent can keep track of days of work without exceeding its context window. Second, a Chrome extension lets Claude click, type, and submit forms across Google Docs, Sheets, and Gmail, turning it into a hands-on digital assistant.
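To make the first idea concrete, here is a minimal sketch of what compressing dialogue to stay inside a window can look like. This is not Anthropic’s implementation, which the announcement does not describe. Every name here (`CompactingContext`, `digest`, `count_tokens`) is hypothetical, the word-count "tokenizer" is a crude stand-in, and a real agent would presumably ask the model itself to write the summaries.

```python
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def digest(message: str, max_words: int = 8) -> str:
    """Hypothetical summarizer: keep at most max_words words of the first sentence."""
    first_sentence = message.split(".")[0]
    return " ".join(first_sentence.split()[:max_words])


@dataclass
class CompactingContext:
    budget: int                 # maximum tokens allowed in the window
    keep_recent: int = 2        # newest messages that are never compressed
    summary: list[str] = field(default_factory=list)
    messages: list[str] = field(default_factory=list)

    def total_tokens(self) -> int:
        return sum(count_tokens(s) for s in self.summary) + sum(
            count_tokens(m) for m in self.messages
        )

    def add(self, message: str) -> None:
        """Append a message, then fold the oldest ones into digests if over budget."""
        self.messages.append(message)
        while self.total_tokens() > self.budget and len(self.messages) > self.keep_recent:
            oldest = self.messages.pop(0)
            self.summary.append(digest(oldest))

    def window(self) -> list[str]:
        """What would be sent to the model: one summary line, then recent messages."""
        head = ["[summary] " + "; ".join(self.summary)] if self.summary else []
        return head + self.messages


ctx = CompactingContext(budget=20)
ctx.add("Set up the repo. Installed dependencies and configured linting for the whole project.")
ctx.add("Built the login page. Added form validation and session cookies with tests.")
ctx.add("Started the chat view.")
for line in ctx.window():
    print(line)
```

In this toy version the oldest message shrinks to a short digest while the two most recent messages stay verbatim. The digest list itself is never trimmed, so a long-running agent would need a second pass that summarizes the summaries.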

A research preview dubbed “Imagine with Claude” goes further, live-rendering software and mini-games on demand instead of writing code, foreshadowing on-the-fly app generation. Apollo Research also rates the release Anthropic’s safest yet, citing reduced deceptive behavior. 

Enterprises from Netflix to Thomson Reuters report double-digit productivity gains, but Anthropic warns the technology may displace entry-level white-collar roles. 

Claude Sonnet 4.5 shows the agent era isn’t coming… it’s already here.
