• NATURAL 20

OpenAI Model Nearly Wins World Coding Competition

PLUS: Veo 3 Launches in API, Perplexity Valued At $18 Billion and more.

Today:

  • OpenAI Model Nearly Wins World Coding Competition

  • OpenAI Unveils ChatGPT agent

  • Mistral Le Chat Adds Voice, Projects, Images, Deep Research

  • Veo 3 Launches in API

  • Perplexity Valued At $18 Billion

OpenAI's Secret INTERNAL Model Almost Wins World Coding Competition…

An OpenAI coding model nearly beat all human competitors at the AtCoder World Tour Finals in Japan, holding the lead for most of the 10-hour contest before ex-OpenAI engineer Przemysław Dębiak, who competes as Psyho, narrowly overtook it.

This milestone shows AI models are approaching superhuman coding performance, well ahead of OpenAI’s end-of-year expectations. While impressive, experts argue these models enhance rather than replace human engineers, especially in real-world, creative, or less structured coding tasks.

OpenAI Unveils ChatGPT agent

OpenAI has launched ChatGPT Agent, a major upgrade that lets ChatGPT perform tasks on your behalf using a virtual computer. It can browse websites, analyze data, automate workflows, and complete real-world jobs such as planning meetings or making presentations. Users stay in control with permissions and safety checks. Available now for Pro, Plus, and Team users, this marks a key step in AI acting as a true assistant, not just a chatbot.

Why This Matters

  1. Marks the Shift from Chatbot to Worker:
    ChatGPT Agent turns AI into an action-taking assistant that can complete end-to-end tasks autonomously.

  2. Breakthrough in Multi-Tool Integration:
    It combines web browsing, code execution, API access, and user-authenticated actions—all in one coherent system.

  3. Raises the Stakes for AI Safety and Control:
    With real-world actions comes real-world risk, prompting OpenAI to implement its most advanced safeguards yet.

Mistral Le Chat Adds Voice, Projects, Images, Deep Research

Mistral AI has upgraded Le Chat with powerful new features, including Deep Research for fast, structured reports, Voxtral voice input, multilingual reasoning via Magistral, image editing, and Projects to organize conversations. These tools help users go deeper, stay organized, and interact naturally across text, voice, and images. Le Chat now acts more like a true assistant, making research, planning, and creative tasks easier and more intuitive.

Why This Matters

  1. Raises the Bar for AI Assistants:
    Le Chat’s Deep Research mode blends synthesis and citation, competing directly with OpenAI’s agentic systems in knowledge work and report generation.

  2. Multimodal Productivity Leap:
    With integrated voice, image, and multilingual capabilities, Mistral advances toward a seamless AI workspace that supports diverse user needs.

  3. User-Centric Innovation:
    Features like Projects and editable images emphasize customization and real-world usability—shaping how next-gen AI tools adapt to users, not the other way around.

Veo 3 Launches in API

Google’s Veo 3 is now available to developers through the Gemini API and Google AI Studio. This advanced video model generates high-quality video with synchronized audio, realistic physics, and cinematic effects from simple prompts. Developers can use Veo 3 for storytelling, animation, or game design. It ships with tools and templates and is priced at $0.75 per second of generated video, which keeps short prototype clips affordable. All outputs include SynthID watermarks to ensure responsible use.

Why This Matters

  1. Raises the Bar for Text-to-Video Generation:
    Veo 3 combines visuals, dialogue, and effects into cohesive, high-quality videos, pushing the frontier of multimodal AI.

  2. Accelerates Creative Workflows:
    Game designers, animators, and content creators can quickly prototype scenes with minimal manual work.

  3. Sets a Standard for Responsible AI Media:
    By embedding SynthID watermarks, Google reinforces traceability and ethics in AI-generated content.
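
To put the $0.75-per-second price in perspective, here is a rough back-of-the-envelope sketch in Python. It only multiplies the quoted rate by clip length and clip count; the 8-second clip length, the function name `estimate_cost`, and the batch size are illustrative assumptions, not anything taken from Google's documentation, and actual billing may differ.

```python
from decimal import Decimal

# Rate quoted in this newsletter; treat it as an assumption and check
# current pricing before budgeting against it.
PRICE_PER_SECOND_USD = Decimal("0.75")


def estimate_cost(clip_seconds, num_clips=1, price_per_second=PRICE_PER_SECOND_USD):
    """Estimate the USD cost of generating `num_clips` clips of `clip_seconds` each.

    Hypothetical helper for illustration only: it does not call any API.
    """
    if clip_seconds < 0 or num_clips < 0:
        raise ValueError("clip_seconds and num_clips must be non-negative")
    # Decimal avoids floating-point rounding surprises with currency.
    return Decimal(str(clip_seconds)) * num_clips * price_per_second


# One hypothetical 8-second clip.
print(estimate_cost(8))       # 6.00
# A hypothetical prototyping session of 20 such clips.
print(estimate_cost(8, 20))   # 120.00
```

At that rate a single short clip costs a few dollars, while a session of many iterations adds up quickly, so it may be worth budgeting by total seconds generated rather than by number of prompts.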

🧠RESEARCH

This paper reviews how combining retrieval and reasoning helps AI models answer complex questions better. It explains how smarter searching boosts logic, and vice versa, leading to systems that solve tough problems more accurately. It also outlines top methods, challenges, and future goals for building more reliable, human-friendly AI tools.

PhysX introduces a new way to generate 3D assets that behave more like real-world objects. It provides a physics-annotated dataset and a model that adds physical traits—like weight, material, and movement—into 3D designs. This helps make virtual objects more useful for simulations, robotics, and other real-world applications.

SWE-Perf is a new benchmark that tests how well language models can speed up real-world code. It uses real GitHub updates where experts improved performance. Results show current models still fall short of expert-level optimizations, pointing to a big opportunity for progress in making AI better at real coding tasks.

🛠️TOP TOOLS

ArtGuru Face Swap - AI tool designed to make face swapping in photos an effortless and enjoyable process.

Nokemon - A tool that leverages advanced ML technology to create unique and customizable Pokémon designs, often referred to as “Fakémon.”

NeuralBlender - Image-generation website that leverages the power of AI to create stunning images from textual descriptions.

CrushOnAI - AI chat platform that specializes in providing unrestricted, NSFW (Not Safe For Work) conversations.

Charley AI - AI-powered content generation platform designed to assist with academic and professional writing.

🗞️MORE NEWS

  • AI search startup Perplexity hit an $18 billion valuation just two months after a $14 billion round. Its fast-growing user base, new AI browser, and rising revenues attract top investors, despite fierce competition with Google.

  • OpenAI uses various global companies to help run its services, including cloud providers like Microsoft and Google, and support firms in places like the Philippines. These partners handle data storage, support, and moderation.

  • Elon Musk’s xAI plans a male AI companion for Grok, modeled on Twilight’s Edward Cullen and Fifty Shades’ Christian Grey. Critics fear it promotes unhealthy, controlling behavior after sexually charged tests with Grok’s previous female version.

  • Anthropic quietly tightened usage limits for Claude Code, frustrating paying users who weren’t notified. Many report sudden service cutoffs despite premium plans. The lack of transparency has shaken trust and stalled critical projects.

  • Adobe’s new AI tool lets users turn their own silly sounds into realistic sound effects for videos. It also adds better video controls and creative styles, helping users fine-tune audio and visuals with ease.
