Hinton Critiques Altman Leadership
PLUS: AMD Unveils Powerful AI Chips, Pyramid Flow AI Video Generator Launched and more.

Waalaxy, the world's #1 automated prospecting tool 🚀
Waalaxy is the world's #1 tool for automated LinkedIn prospecting, with over 150K users and a 4.8/5 rating from 1,200+ reviews.
Why? Because it allows anyone, without any technical skills, to automate LinkedIn prospecting:
- Reach out to 800 qualified prospects per month. 
- Capture your competitors' audience. 
- Test a market for your business. 
- And countless other use cases. 
Turn LinkedIn into your #1 acquisition channel (and get a 10x ROI with the subscription).
Today:
- Hinton Critiques Altman Leadership 
- OpenAI Releases MLE-bench Platform 
- DeepMind Reveals Michelangelo LLM Benchmark
- AMD Unveils Powerful AI Chips 
- OpenAI Blocks Global Cyber Threats 
- Pyramid Flow AI Video Generator Launched 
"GLAD Sam Altman Was FIRED" Geoffrey Hinton | Nobel Prize in Physics Sparks Controversy
Geoffrey Hinton, a pioneering figure in AI, recently received the 2024 Nobel Prize in Physics for his work on neural networks, though he is primarily a computer scientist. Hinton persisted in his belief in neural networks when many dismissed them, and his research has been foundational to AI advancements.
He expressed concerns about AI safety, emphasizing the need for more research to prevent harmful outcomes. Hinton criticized Sam Altman, CEO of OpenAI, for prioritizing profits over safety. Hinton believes AI will surpass human intelligence within the next two decades, calling for more focus on AI safety efforts.
OpenAI introduces MLE-bench, a tool designed to evaluate the performance of AI agents in machine learning engineering tasks. MLE-bench uses 75 Kaggle competitions to test skills like model training, dataset preparation, and running experiments. Human performance is benchmarked using Kaggle leaderboards.
OpenAI’s top-performing agent, o1-preview with AIDE scaffolding, reached at least bronze-medal level in nearly 17% of the competitions. The research also explores how scaling resources and pre-training data affect AI agent performance. MLE-bench's code is open-sourced to support further studies in enhancing AI's real-world machine learning engineering abilities.
DeepMind introduced the Michelangelo benchmark to assess large language models (LLMs) with long context windows. While these models excel at retrieving specific data from massive inputs, they struggle with reasoning across complex structures. Michelangelo evaluates LLMs on tasks like processing code, resolving complex conversations, and recognizing knowledge gaps. Results show that even top models, like Gemini and GPT-4, experience performance drops when faced with intricate reasoning tasks.
Michelangelo’s unique evaluation method highlights the challenges of using LLMs in real-world applications where understanding the relationships within vast information is crucial. More tests will be added to refine these benchmarks.
AMD announced new AI-powered chips across its Ryzen, Instinct, and Epyc brands, aimed at enhancing computing for both businesses and data centers. CEO Lisa Su highlighted AI’s role in improving productivity, privacy, and collaboration. The third-generation Ryzen AI Pro processors offer up to three times more AI performance than previous models and are optimized for business use.
AMD also introduced new AI accelerators for data centers, setting performance benchmarks against Nvidia. The company emphasized open, accessible technology and strong partnerships with companies like Microsoft, aiming to expand AI capabilities across a wide range of industries.
OpenAI disrupted over 20 global malicious operations attempting to misuse its platform for cybercrime and disinformation. These activities involved creating fake social media profiles, debugging malware, and spreading disinformation, including attempts to influence elections in the U.S., Rwanda, and India.
Notable cyber groups mentioned include China's SweetSpecter, Iran's Cyber Av3ngers, and Storm-0817. OpenAI has blocked several influence operations and networks, including those using AI-generated imagery and comments.
Pyramid Flow, a new open-source AI video generator, was launched this week by researchers from Peking University, Beijing University of Posts and Telecommunications, and Kuaishou Technology. It uses a technique called "pyramidal flow matching" to generate high-quality video clips efficiently, saving high resolution for the final stage.
Available on GitHub and Hugging Face, Pyramid Flow rivals proprietary models like Runway’s Gen-3 Alpha but is free for commercial use under the MIT License. Though lacking some advanced features, its open-source nature provides flexibility for developers and creators seeking alternatives to costly, closed models.
🧠RESEARCH
The paper introduces a pyramidal flow matching method for efficient video generation, reducing computational complexity by operating only the final stage at full resolution. The approach links flows across pyramid stages to maintain continuity and enables end-to-end optimization. The method generates high-quality videos with fewer resources.
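To get a rough feel for why running only the final stage at full resolution saves compute, here is a minimal back-of-the-envelope sketch. It is not the paper's implementation: the function names, the three-stage setup, the halving of resolution between stages, and the equal step count per stage are all assumptions chosen for illustration.

```python
# Illustrative cost model only; every number and name here is hypothetical.

def pyramid_cost(height, width, num_stages, steps_per_stage):
    """Count positions processed when each earlier stage runs at half the
    resolution of the next one, and only the last stage is full resolution."""
    total = 0
    for stage in range(num_stages):
        scale = 2 ** (num_stages - 1 - stage)  # last stage has scale 1
        total += (height // scale) * (width // scale) * steps_per_stage
    return total


def full_resolution_cost(height, width, num_stages, steps_per_stage):
    """Count positions processed when every step runs at full resolution."""
    return height * width * num_stages * steps_per_stage


h, w, stages, steps = 64, 64, 3, 10  # assumed latent size and schedule
pyr = pyramid_cost(h, w, stages, steps)
full = full_resolution_cost(h, w, stages, steps)
ratio = pyr / full
print(f"pyramid: {pyr}, full resolution: {full}, ratio: {ratio:.4f}")
```

Under these assumed settings the pyramid schedule touches fewer than half as many positions as the all-full-resolution schedule. Actual savings depend on the real stage count, resolutions, and step allocation.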
CursorCore is a conversational framework for programming assistance that integrates coding history, current code, and user instructions. It introduces APEval, a benchmark to evaluate model alignment with diverse information types. The authors created 219K training samples using a data generation pipeline and showed CursorCore's superior performance over similar models.
TextToon is a method for creating a real-time, toonified head avatar from a single video based on text instructions. Unlike existing multi-view models, TextToon uses a conditional embedding system to generate high-quality, stylized avatars that can be driven by any video. It works at 48 FPS on GPUs and 15-18 FPS on mobile devices.
IterComp is a framework that improves text-to-image generation by combining strengths from multiple models. It uses iterative feedback learning to refine compositions based on key metrics like attribute binding and spatial relationships. IterComp significantly outperforms previous methods in handling complex object compositions and semantic alignment in generated images.
AutoDAN-Turbo is a method for automatically discovering jailbreak strategies to attack large language models (LLMs) without human input. It significantly outperforms existing methods, achieving a 74.3% higher success rate on benchmarks and an 88.5% success rate on GPT-4-1106-turbo. Incorporating human-designed strategies boosts its success rate to 93.4%.
🛠️TOP TOOLS
OpenAI Gradio - Create apps for developers powered by OpenAI’s API.
Wispr Flow - Use your voice to write 3x faster in every application: AI commands, auto-edits, 100+ languages
Udio - Create any song. Just describe it.
Scenery Video Editor - Let AI edit videos for you. Then fine tune what AI creates with a fully-featured, collaborative cloud editing platform.
Google Illuminate - Transform your content into engaging AI‑generated audio discussions
🗞️MORE NEWS
- Intel announced its new Core Ultra 200S desktop processors, featuring built-in NPUs (Neural Processing Units) for the first time in desktop chips. While not significantly faster than previous models, the processors focus on reducing power consumption and improving efficiency, similar to AMD's approach. 
- Tesla is set to reveal its highly anticipated "robotaxi," potentially marking a major milestone in autonomous vehicles. Unlike competitors such as Waymo, Tesla uses a simpler and cheaper approach relying on AI-powered cameras without additional sensors like radar or lidar. 
- Microsoft has introduced new AI tools for healthcare, enhancing patient care, improving workflows, and aiding medical professionals. These innovations in Azure, Copilot, and Fabric optimize data integration, automate tasks, and help address workforce shortages. 
- Healthcare startup Suki raised $70 million in a Series D round to expand its AI assistants for hospitals. Suki’s AI solutions reduce administrative tasks for healthcare providers and integrate with major Electronic Health Record systems. 
- Headspace has launched "Ebb," a new AI-powered chatbot designed to offer personalized mental health support. Trained by clinical psychologists and data scientists, Ebb uses motivational interviewing to help users reflect on events or practice gratitude. 
- Amazon's AI shopping assistant, Rufus, now allows some users to check price history during their shopping experience. This new feature aims to help customers determine if a deal is genuine, enhancing transparency and trust. 
- New AI models for plasma heating developed at Princeton Plasma Physics Laboratory enhance fusion research by speeding up predictions 10 million times without losing accuracy. These models fixed errors in the original code, enabling faster and more accurate simulations. 
Learn AI with us. Let’s Build the Future Together.
Hello fellow AI-obsessed traveler,
Over the past 2 years, as we’ve grown to over 250,000 subscribers between the YouTube Channel and this newsletter, we've received an overwhelming number of requests for one specific thing. While the newsletter helps keep you up to speed with AI news, many of you have asked for the next step: to learn how to actually apply AI in your work.
Today we’re finally announcing the solution with NATURAL 20, the community for like-minded AI learners. As a loyal newsletter reader you are getting access at the lowest price it will ever be: JOIN NATURAL 20 AI UNIVERSITY TODAY
What you get:
- Tutorials by experts across various AI fields.
- Daily tutorials by Wes Roth about the latest use cases.
- Building Autonomous AI Agents to Automate Your Life and Business (NEW!)
- A network of the top 1% of early AI adopters.
- Access to community-only resources and software.
- And many more features rolling out soon.


