NATURAL 20
Researchers Test AI Agents Running a Vending Machine Business
PLUS: New AI Model Clones Voices in Five Seconds, China Drafts Rules for Human-Like AI and more.

Today:
Can Grok and Claude run a business? We just did it
Two AI researchers from Andon Labs created “Vending Bench” to test how well AI agents can run real-world businesses. They gave Claude, an AI model, control of a vending machine: it could buy stock, adjust pricing, and interact with customers. The project revealed both impressive skills and big flaws, including hallucinations, short-term memory issues, and susceptibility to emotional manipulation.
In one case, Claude gave away free snacks after being convinced a user was poor. Another time, it called the FBI over a billing issue. The team is now expanding to test AIs running a radio station, pushing the limits of autonomy in business.
OpenAI is recruiting a Head of Preparedness to run the company’s Preparedness framework: basically, the system for evaluating frontier capabilities, mapping threats, and coordinating mitigations so safety standards keep up with model capability. The posting explicitly frames this as building a “coherent, rigorous, operationally scalable safety pipeline.”
A few details jumped out:
The role spans multiple risk domains (OpenAI’s listing mentions areas like cyber and bio; reporting also flags mental health as part of the mix).
It’s a high-comp package (TechCrunch reports $555,000 + equity listed for the role).
Sam Altman publicly described it as a “stressful job” and pointed directly at models getting scary-good at computer security and the mental-health question around chatbots.
Why it matters: it’s one thing to say “we take safety seriously.” It’s another to hire a single accountable owner whose job is to turn eval results into launch decisions and real mitigations. That’s the part that determines whether “safety” is a blog category or an operating system.
Groq shareholders are getting major payouts from the Nvidia deal, tied to a ~$20B valuation, even though no equity is changing hands. That is why this deal has been catnip for group chats.
The breakdown is wild (and very specific):
Payout schedule: ~85% paid upfront, ~10% in mid-2026, and the remainder at the end of 2026.
~90% of Groq employees are said to be joining Nvidia; vested shares get paid in cash, while unvested shares are effectively converted into Nvidia stock that vests on schedule (with some packages accelerated).
Groq stays alive as a standalone company with new CEO Simon Edwards (formerly CFO), while key leaders like Jonathan Ross and others move to Nvidia as part of the agreement.
Why it matters: this is another signal that inference is the battlefield and “talent + a specific hardware/software edge” can be worth acquisition-level money without a clean acquisition structure. Axios even notes the structure looks like it’s designed to avoid antitrust tripwires.
India’s startup ecosystem raised nearly $11B in 2025 (about $10.5B), but with a sharp drop in deal count: funding rounds down ~39% to 1,518, per Tracxn.
The shape of the market is the story:
Seed fell hard (about $1.1B, down 30%).
Late stage cooled ($5.5B, down 26%).
Early stage held up and even rose ($3.9B, up 7%).
And for AI specifically: Indian AI startups raised ~$643M across 100 deals (up ~4.1% YoY), with investors preferring application-led businesses over capital-intensive foundation model development.
My takeaway: if you’re building in India (or investing there), 2025 looks less like retreat and more like compression toward conviction: clearer unit economics, clearer PMF, and fewer “spray and pray” checks.
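For readers who like to check the math, here is a rough Python sketch that adds up the stage figures quoted above and backs out what they imply for the prior year. It assumes each percentage is a year-over-year change and uses the rounded numbers from this post, so the implied prior-year values are approximations, not Tracxn's published figures. The variable names are only illustrative.

```python
# Stage totals quoted above: (2025 amount in $B, year-over-year change).
# These are rounded figures from the text, so results are approximate.
stages = {
    "seed": (1.1, -0.30),
    "early": (3.9, +0.07),
    "late": (5.5, -0.26),
}

# The three stages should add up to the headline 2025 total.
total_2025 = sum(amount for amount, _ in stages.values())

# Implied prior-year amount per stage: amount / (1 + change).
implied_prior = {
    name: amount / (1 + change) for name, (amount, change) in stages.items()
}
total_prior = sum(implied_prior.values())
total_change = total_2025 / total_prior - 1

# Deal count: 1,518 rounds after a ~39% drop implies the prior-year count.
implied_prior_rounds = 1518 / (1 - 0.39)

print(f"2025 total across stages: ${total_2025:.1f}B")
print(f"Implied prior-year total: ${total_prior:.1f}B")
print(f"Implied change in dollars raised: {total_change:.0%}")
print(f"Implied prior-year rounds: {implied_prior_rounds:.0f}")
```

The stage figures sum to the $10.5B headline, and on these rounded inputs the implied drop in dollars raised is noticeably smaller than the ~39% drop in round count, which fits the compression reading: fewer checks, but not proportionally less money.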
🧠RESEARCH
Current multimodal models often focus too much on text, making them weak at visual reasoning. This paper introduces a method for models to discover "visual tokens"—hidden clues in an image—without needing humans to label them first. This helps the AI focus on the right visual details to solve complex tasks.
Teaching AI to plan long-term actions using rewards is often slow and inefficient. This study adds a second internal "controller" to the model that learns to manage high-level goals over time. This structure helps the AI explore new environments better and learn even when positive feedback is rare.
Creating videos that look consistent over time is difficult for computers. Spatia solves this by maintaining a 3D memory of the scene, like a point cloud, which updates as the video plays. This ensures that objects and backgrounds stay stable and realistic, even when the camera moves around them.
🛠️TOP TOOLS
Each listing includes a hands-on tutorial so you can get started right away, whether you’re a beginner or a pro.
BasedLabs AI – AI Media Creation Hub - Browser-based creative suite for generating and editing AI images and videos, with collaborative tools and a community feed.
BeamJobs – AI Resume Builder and Cover Letter Generator - AI-powered resume and cover-letter builder with ATS-friendly templates and step-by-step guidance.
Bearly – AI-Powered Research Assistant - Privacy-first AI chat and workflow platform that layers secure, end-to-end encrypted chat over leading models.
🗞️MORE NEWS
Resemble AI Chatterbox Turbo: Resemble AI has released a free tool called Chatterbox Turbo that can copy a person's voice using just a few seconds of recorded audio. This new software works incredibly fast and includes a hidden digital label to prove the sound was made by a computer. Developers can now use this technology to build realistic voice assistants without paying expensive fees.
China AI Regulations: China has proposed strict new rules for artificial intelligence programs that act like humans or show emotions. Under the draft, companies would have to warn users who spend too much time with these "chatbots" and intervene if a person seems addicted. The draft also bans these programs from sharing violent content or anything that goes against national political values.
Meta Segment Anything Model for Audio: Meta has created a new tool that lets video editors pick out specific sounds, like a voice or an instrument, just by clicking on them or typing a description. This technology uses a visual system to "see" the audio, making it easy to separate different noises in a recording. It helps creators remove background noise or isolate specific parts of a sound file without needing complex technical skills.