Google’s “Project Jarvis”
PLUS: Meta Partners with Reuters for AI News, Google Prepares for Gemini 2.0 Launch and more.
The fastest way to build AI apps
Writer Framework: build Python apps with drag-and-drop UI
API and SDKs to integrate into your codebase
Intuitive no-code tools for business users
Today:
Google’s “Project Jarvis”
OpenAI’s Orion Launches This December
Meta Introduces NotebookLlama for Podcasts
Google Prepares for Gemini 2.0 Launch
Meta Partners with Reuters for AI News
Anthropic Tackles Bias with Feature Steering
Apple Expands Bounties for PCC Security
Google is set to launch an autonomous AI agent, reportedly code-named Project Jarvis, that can handle computer tasks such as web browsing, product purchases, and travel bookings. The move follows Anthropic's announcement of its own AI agent alongside upgraded models. Other AI companies are also preparing to release similar agents, intensifying the competition.
Google helped spark the current AI wave in 2017 with its Transformer architecture, a groundbreaking innovation in language processing, but has since fallen behind rivals. Now the race is heating up as major tech firms strive to launch advanced AI agents that can operate computers autonomously.
OpenAI plans to launch a new AI model, codenamed Orion, by December, though initial access will be limited to key partner companies, such as Microsoft. Orion is anticipated as a significant leap from GPT-4, potentially 100 times more powerful. This development aligns with OpenAI’s recent $6.6 billion funding and a shift toward a more profit-focused structure.
OpenAI’s CEO, Sam Altman, has hinted at the release, though the company denies official plans for Orion. This release comes amid leadership changes, including departures of top executives, which reflects ongoing internal restructuring.
Meta has introduced NotebookLlama, an “open” alternative to Google’s podcast-generation tool, NotebookLM. NotebookLlama uses Meta’s Llama models to transform text files, like PDFs, into podcast-style conversations by creating transcripts, adding dramatization, and using open-source text-to-speech models.
While promising, the audio quality currently feels robotic, with interruptions and overlapping voices, due to limitations in text-to-speech technology. Meta’s researchers suggest future improvements may involve stronger models or a dual-agent approach for more engaging podcast dialogues.
Google's next-generation AI model, Gemini 2.0, is rumored to launch in December. Expected upgrades include smarter, faster responses, support for longer text inputs, and improved reasoning capabilities. While some speculate Gemini 2.0 may launch alongside OpenAI’s rumored ChatGPT-5, Google’s update reportedly hasn’t met all performance improvement targets, a common trend for advanced AI models.
This release aligns with Google's broader AI expansion, which includes video tools, image generation, and search enhancements. Project Astra, Google’s advanced assistant seen at I/O 2024, could also benefit from Gemini 2.0’s advancements, though it remains unconfirmed.
Meta has signed a multi-year deal with Reuters to integrate news content into its AI chatbot, allowing it to provide news-related answers with links to Reuters articles. This partnership, a first for Meta, will enable users on Facebook, Instagram, WhatsApp, and Messenger to access news summaries linked to Reuters.
Although Meta traditionally avoided promoting news content on its platforms, the company now seeks to enhance its AI's informational utility while still controlling the format. Meanwhile, Meta continues to oppose laws mandating payments to news publishers, evident in its decision to block news in Canada instead of complying with new regulations.
In their study on feature steering, researchers at Anthropic explored modifying Claude 3 Sonnet’s outputs by adjusting specific model features to address social biases. Their experiments revealed a "sweet spot" where feature steering could alter outputs without diminishing the model's overall abilities.
By steering 29 features, they observed targeted changes in biases, like reducing age or gender bias. However, some adjustments caused unexpected effects in unrelated areas, termed "off-target effects."
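To make the idea of steering and "off-target effects" concrete, here is a toy sketch. It assumes a feature can be treated as a direction in a model's activation space and that steering means shifting an activation along that direction. This is a simplification for illustration, not Anthropic's actual procedure, and every name and number below is hypothetical.

```python
import math

def unit(v):
    """Scale a vector to length 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def feature_value(activation, direction):
    """How strongly the activation expresses a feature (projection onto it)."""
    return sum(a * d for a, d in zip(activation, unit(direction)))

def steer(activation, direction, strength):
    """Shift the activation along a feature direction by `strength`."""
    return [a + strength * d for a, d in zip(activation, unit(direction))]

# Hypothetical 3-dimensional activation and three made-up feature directions.
activation = [0.5, -1.0, 2.0]
target = [1.0, 0.0, 0.0]     # the feature we want to turn up
related = [1.0, 1.0, 0.0]    # overlaps with the target direction
unrelated = [0.0, 0.0, 1.0]  # orthogonal to the target direction

steered = steer(activation, target, 3.0)

def change(direction):
    """Change in a feature's value caused by the steering step."""
    return feature_value(steered, direction) - feature_value(activation, direction)

print("target change:   ", round(change(target), 3))
print("related change:  ", round(change(related), 3))
print("unrelated change:", round(change(unrelated), 3))
```

In this toy setup the target feature moves by exactly the steering strength, the orthogonal feature does not move, and the overlapping feature shifts as a side effect, which is one simple way an off-target effect can arise.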
Apple has introduced its Private Cloud Compute (PCC) system, designed to perform complex AI tasks with robust security and privacy protections. Researchers can verify PCC's security through a Virtual Research Environment (VRE), and Apple has released source code for core components, such as CloudAttestation.
This initiative extends Apple’s device-level security into cloud AI, with the goal of keeping user data confidential. Researchers can inspect PCC's architecture, run simulations, and review its transparency measures. Additionally, Apple has expanded its Security Bounty program to reward researchers who report vulnerabilities in PCC, aiming to strengthen its security through community input.
🧠RESEARCH
LOGO is a training method for improving long-context models (LCMs), which process very long text sequences. It uses reference-free preference optimization and works around memory limits with efficient data handling. This approach allows a smaller model to match GPT-4’s performance on long-context tasks and boosts overall text generation quality.
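For readers unfamiliar with preference optimization that needs no reference model, the sketch below shows one common reference-free form (a SimPO-style, length-normalized loss). It is a generic illustration and not necessarily LOGO's exact objective. The function names and the `beta` and `margin` values are assumptions chosen for the example.

```python
import math

def avg_logprob(token_logprobs):
    """Length-normalized log-probability of a response under the policy model."""
    return sum(token_logprobs) / len(token_logprobs)

def reference_free_preference_loss(chosen, rejected, beta=2.0, margin=0.5):
    """-log(sigmoid(beta * (avg_logp(chosen) - avg_logp(rejected)) - margin)).

    Only the policy model's own log-probabilities are used, so no second
    (reference) model has to be held in memory.
    """
    z = beta * (avg_logprob(chosen) - avg_logprob(rejected)) - margin
    # Numerically stable form of log(1 + exp(-z)).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))

# Hypothetical per-token log-probabilities for a preferred and a dis-preferred answer.
chosen = [-0.2, -0.4, -0.3]
rejected = [-1.5, -2.0, -1.0, -1.7]

print(reference_free_preference_loss(chosen, rejected))  # small: model already prefers `chosen`
print(reference_free_preference_loss(rejected, chosen))  # large: preference is reversed
```

The loss shrinks as the model assigns a higher average log-probability to the preferred response than to the dis-preferred one.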
The paper presents a new method for scaling batch sizes in contrastive learning without overwhelming GPU memory. By using a tile-based approach, it breaks large similarity calculations into smaller parts, significantly reducing memory usage while retaining performance. This method allows massive batch sizes, improving efficiency for large-scale models like CLIP, with code to be shared publicly.
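The tile-based idea can be illustrated with a small pure-Python sketch. It computes a one-directional contrastive (InfoNCE) loss twice: once from the full similarity matrix and once tile by tile with a running log-sum-exp, so only one small block of similarities exists at any time. This is an illustration of the general technique under simplifying assumptions (forward pass only, no gradients, no GPU), not the paper's implementation, and the function names, `temperature`, and `tile` values are made up for the example.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def info_nce_full(a, b, temperature=0.1):
    """Reference version: materializes the full n x n similarity matrix."""
    n = len(a)
    sims = [[dot(a[i], b[j]) / temperature for j in range(n)] for i in range(n)]
    total = 0.0
    for i in range(n):
        m = max(sims[i])
        lse = m + math.log(sum(math.exp(s - m) for s in sims[i]))
        total += lse - sims[i][i]  # row i's positive pair is column i
    return total / n

def info_nce_tiled(a, b, temperature=0.1, tile=2):
    """Same loss, but similarities are computed one small block at a time."""
    n = len(a)
    total = 0.0
    for r0 in range(0, n, tile):
        rows = range(r0, min(r0 + tile, n))
        run_max = {i: -math.inf for i in rows}  # running max per row
        run_sum = {i: 0.0 for i in rows}        # running sum of exp(s - run_max)
        positive = {}
        for c0 in range(0, n, tile):
            cols = range(c0, min(c0 + tile, n))
            for i in rows:
                block = [dot(a[i], b[j]) / temperature for j in cols]
                if i in cols:
                    positive[i] = block[i - c0]
                new_max = max(run_max[i], max(block))
                # Rescale the old sum to the new max, then add this block.
                run_sum[i] = (run_sum[i] * math.exp(run_max[i] - new_max)
                              + sum(math.exp(s - new_max) for s in block))
                run_max[i] = new_max
        for i in rows:
            total += run_max[i] + math.log(run_sum[i]) - positive[i]
    return total / n

# Hypothetical paired embeddings (row i of `a` matches row i of `b`).
a = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8], [-1.0, 0.0], [0.8, -0.6]]
b = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.9], [-0.9, 0.2], [0.7, -0.7]]

full = info_nce_full(a, b)
tiled = info_nce_tiled(a, b, tile=2)
print(full, tiled)
```

Both functions return the same value up to floating-point rounding. The tiled version's peak memory for similarities depends on the tile size rather than on the batch size.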
MotionCLR is a model for generating and editing human motion using attention mechanisms. Unlike previous models, it captures detailed word-motion alignment, enabling fine-grained edits. Through self- and cross-attention, it improves how motion sequences are organized and how precisely they can be edited, allowing actions like emphasizing, replacing, or generating specific motions with high explainability and control.
PULSE is a multimodal large language model (MLLM) designed to interpret electrocardiogram (ECG) images, addressing limitations in existing methods. Leveraging ECGInstruct, a large, diverse ECG image dataset, PULSE significantly improves ECG comprehension, achieving up to 30% higher accuracy than general models. This advancement supports broader clinical ECG interpretation, especially in resource-limited settings.
MMAU is a comprehensive benchmark for evaluating AI models on complex audio understanding and reasoning. It includes 10,000 annotated audio clips covering speech, environmental sounds, and music, and requires 27 specialized skills. Tests on advanced models show significant challenges: top models reach only around 53% accuracy, indicating substantial room for improvement in audio comprehension.
🛠️TOP TOOLS
Pro Search by Perplexity - Ask multi-layered questions. Perplexity will adapt.
RapidSubs Captions & Subtitle - AI-powered subtitle generator designed to create fast, stylish subtitles for your videos.
ScrapetoAI - Extract data from websites for your custom GPTs
Dreamcut - AI-Powered Video Editing & Screen Recording
Playground V3 - AI model focused on graphic design
📲SOCIAL MEDIA
Finally, a humanoid robot with a natural, human-like walking gait.
Chinese company EngineAI just unveiled their life-size general-purpose humanoid SE01.
— The Humanoid Hub (@TheHumanoidHub)
7:29 AM • Oct 24, 2024
🗞️MORE NEWS
OpenAI disbanded its AGI Readiness team, which assessed the company's preparedness for highly advanced AI. Team leader Miles Brundage resigned, citing a shift to independent research. Ex-members will be reassigned within OpenAI.
Perplexity AI now handles 100 million search queries weekly (roughly 430 million a month), up from 250 million monthly in July. The company is exploring e-commerce, ads, and partnerships with brands like Nike, despite ongoing legal issues with publishers over content use.
A new AI model predicts diarrheal disease outbreaks triggered by extreme weather events linked to climate change. Developed by international researchers, it provides public health systems in vulnerable regions with early warnings, allowing timely preparation.
President Biden's memorandum directs the U.S. government to enhance national security through responsible AI use, emphasizing partnerships with industry and academia. It seeks global AI leadership, safety, and ethical governance while protecting human rights.
Taiwan Semiconductor Manufacturing Co. (TSMC) reported that its Arizona plant achieved higher chip production yields than its Taiwan facilities, marking a positive milestone for U.S. chipmaking ambitions amid prior challenges.
Learn AI with us. Let’s Build the Future Together.
Hello fellow AI-obsessed traveler,
Over the past 2 years, as we’ve grown to over 250,000 subscribers between the YouTube Channel and this newsletter, we've received an overwhelming number of requests for one specific thing. While the newsletter helps keep you up to speed with AI news, many of you have asked for the next step: to learn how to actually apply AI in your work.
Today we’re finally announcing the solution with NATURAL 20, the community for like-minded AI learners. As a loyal newsletter reader you are getting access at the lowest price it will ever be:
JOIN NATURAL 20 AI UNIVERSITY TODAY
What you get:
* Tutorials by experts across various AI fields.
* Daily tutorials by Wes Roth about the latest use cases.
* Building Autonomous AI Agents to Automate Your Life and Business (NEW!)
* A network of the top 1% of early AI adopters.
* Access to community-only resources and software.
* And many more features rolling out soon.