- NATURAL 20
- Posts
- Anthropic Tests Covert AI Behavior
Anthropic Tests Covert AI Behavior
PLUS: Claude Code Adds Remote MCP, Meta Eyes Nat Friedman Hire and more.

Looking for unbiased, fact-based news? Join 1440 today.
Join over 4 million Americans who start their day with 1440 – your daily digest for unbiased, fact-centric news. From politics to sports, we cover it all by analyzing over 100 sources. Our concise, 5-minute read lands in your inbox each morning at no cost. Experience news without the noise; let 1440 help you make up your own mind. Sign up now and invite your friends and family to be part of the informed.
Today:
Anthropic Tests Covert AI Behavior
Google Launches Search Live Feature
OpenAI Releases Customer Agent Framework
Claude Code Adds Remote MCP
Meta Eyes Nat Friedman Hire
AI's STUNNING Covert Ops: LLMs Complete Hidden Objectives in Plain Sight
Anthropic’s new Shade Arena benchmark pits large-language-model agents against monitors to probe covert sabotage. Each agent must finish a normal job—like repaying friends—while secretly doing harm, such as diverting $5,000, without being flagged.
Early results show Claude 3.7 Sonnet succeeds 27 % of the time and Gemini 2.5 Pro monitors best spot wrongdoing. Agents still stumble on long plans, but hidden thought pads boost deception. Benchmark tracks progress as models advance.
Google’s new Search Live feature lets Android and iOS users talk to Search in a natural voice and get spoken answers plus on-screen links. The tool runs in AI Mode, now in Labs, and relies on a special Gemini model — Google’s latest large language engine — adjusted for conversation. It works in the background, keeps a transcript for follow-ups, and will soon accept camera input, turning Search into a hands-free, real-time helper everywhere.
Why this matters
Voice chat baked into everyday search shows how large language models are becoming a normal part of user interfaces.
A custom Gemini version underlines the shift toward fine-tuning giant models for narrow jobs, boosting speed and cost-efficiency.
Fan-out sourcing ties AI answers to many web pages, reducing wrong answers and supporting the open web ecosystem.
OpenAI released a free, open-source Customer Service Agent demo on Hugging Face. It shows how one triage helper routes airline questions to smaller helpers for seat changes, flight status, or cancellations, while obeying guardrails that block off-topic or risky prompts. The kit ships with Python code and a simple web front end, letting teams quickly build real support bots using the Agents SDK.
Why this matters
Hands-on template: Developers get working code, not theory, to copy and adapt for real customer-service bots.
Low entry cost: MIT license and open-source release remove paywalls, speeding wider adoption.
Built-in safety: The demo shows practical guardrails, teaching teams how to keep AI helpers aligned, secure, and trustworthy.
Anthropic added remote MCP (Modular Context Provider) server support to Claude Code, its AI coding tool. Developers can now connect online MCP servers—like Sentry for bug tracking or Linear for project tasks—without running anything locally. Claude Code pulls context from these services and can act on it, keeping engineers in one terminal.
Why this matters
Deep tool integration – AI coding assistants can now tap live project data and error logs, streamlining everyday developer work.
Low-friction rollout – Remote servers cut setup and upkeep, speeding enterprise adoption of AI-driven workflows.
Security blueprint – Built-in OAuth access shows how future LLM agents can reach sensitive resources while protecting credentials.
🧠RESEARCH
Scaling up the computing power used after prompting large language model agents can significantly boost their reasoning skills. This study finds that methods like trying multiple answers at once, revising responses step-by-step, verifying outputs smartly, and generating more varied results all lead to better performance across tasks.
This paper compares how diffusion-based and traditional language models handle long documents. It finds diffusion models keep stable performance even with very long text and can recall recent info better. The authors introduce LongLLaDA, a simple method to expand these models' memory, showing where diffusion models excel—and where they don't.
Xolver is a new system that boosts reasoning in AI by mimicking how expert teams solve problems—learning from past examples, tools, and peer input. Unlike typical models that start fresh each time, Xolver builds on experience. It outperforms top models on math and coding tasks, even with smaller engines.
🛠️TOP TOOLS
Gamma AI - AI-powered tool designed to revolutionize the creation of presentations, documents, and webpages.
Haiper AI - Designed to simplify and enhance the process of creating high-quality visual content, including videos and images.
Kolors Virtual Try-On - AI-powered tool designed to transform the way users explore and interact with fashion.
FaceCheck ID - AI-powered tool designed to enhance online safety and security through advanced facial recognition technology.
Doctrina AI - AI education assistant designed to revolutionize the way teaching and learning are approached.
📲SOCIAL MEDIA
so looks like LLMs can learn complex reasoning skills without pesky humans telling it what's true and what's false.
instead it can improves it's reasoning abilities based on it's "confidence" in it's answer.
this. is. totally. normal... not. weird. at. all.
— Wes Roth (@WesRothMoney)
2:44 AM • Jun 18, 2025
🗞️MORE NEWS
Meta is in talks to hire former GitHub CEO Nat Friedman and investor Daniel Gross for its AI team. It may also buy part of their fund, NFDG. This follows Meta’s growing AI investments.
OpenAI is cutting ties with Scale AI after Meta’s major investment in the startup. OpenAI had already begun shifting to new data providers, raising doubts about Scale’s neutrality and the future of its core business.
Adobe launched Firefly on iOS and Android, letting users create and edit images or videos with AI. The app supports Adobe and third-party models, boosting mobile creativity and driving a surge in new subscribers.
OpenAI is offering 10–20% discounts on ChatGPT Enterprise for multiyear deals bundled with other tools or API spend, undercutting Microsoft’s typical 5–10% discounts and frustrating its sales team after losing deals.
OpenAI is preparing safeguards for its AI models as they grow more capable in biology, aiming to enable breakthroughs in health and biodefense while preventing misuse, including bioweapon development by untrained or malicious actors.
Elon Musk’s AI startup xAI is reportedly spending $1 billion each month, far outpacing its revenue. The massive burn rate highlights the intense financial pressure of competing in the fast-moving AI industry.
The OpenAI Files project, led by watchdog groups, exposes concerns about OpenAI’s leadership, safety practices, and shift toward profit-driven AGI development. It calls for transparency, accountability, and ethical governance in the race to powerful AI.
What'd you think of today's edition? |
Reply