NATURAL 20

Breakthroughs in Robotics

PLUS: Nvidia Dominates AI Market, Tech Titans Invest in Humanoid Robots, and more.

Wes Roth
February 24, 2024

Today:

"EVERY machine that moves will be AUTONOMOUS" OpenAI Robot, Google's Fiasco and NVIDIA's GEAR!

OpenAI's CEO hints at significant updates in robotics, while Google's AI prompts controversy due to racial bias. Gemini's image generation feature inadvertently produces offensive results, highlighting transparency issues in AI development. Elon Musk assures corrective action.

Lastly, NVIDIA launches GEAR, an ambitious project in embodied AI research, suggesting we're on the cusp of a new era where AI could solve fundamental scientific and societal challenges, potentially leading to a future of abundance and innovation previously confined to science fiction.

WATCH THE VIDEO ON YOUTUBE

Microsoft Releases Red Teaming Tool for Generative AI

Microsoft launches PyRIT, a red teaming tool to identify risks in generative AI. It automates tasks, flags areas for investigation, and enhances manual red teaming. Red teaming generative AI differs from probing traditional systems due to its probabilistic nature and varied architectures. PyRIT aids in pinpointing security and responsible AI risks, considering generative AI's nuanced output and potential inaccuracies.

The tool, available on GitHub, offers control over strategy, generates harmful prompts, and adapts tactics based on AI system responses. PyRIT aims to improve industry-wide AI red teaming capabilities, augmenting expertise and streamlining analysis.

SECURITY WEEK

AWS will add Mistral open source AI models to Amazon Bedrock

Amazon Web Services (AWS) plans to integrate Mistral's open source AI models, Mistral 7B and Mixtral 8x7B, into its Amazon Bedrock service. Mistral 7B prioritizes efficiency, supporting various tasks like text summarization and code completion with low memory usage. Mixtral 8x7B, more potent, handles tasks in multiple languages with its Mixture-of-Experts model. AWS aims for a cost-effective, speedy, transparent, and customizable AI solution, aligning with Mistral's model traits.

This move mirrors Microsoft's addition of Meta's Llama AI models to Azure AI Studio. It coincides with Amazon's investments in Anthropic and internal AI model development. The race among cloud providers intensifies for superior AI offerings.

VENTUREBEAT

Nvidia’s role in the AI wave has made it a $2 trillion company

Nvidia, a California-based chipmaker, has surged to a $2 trillion market cap, the first chipmaker to do so. Riding high on its AI chip dominance, it is the third US company to hit this milestone, after Apple and Microsoft. Bolstered by record revenue of $60.9 billion in its latest fiscal year, Nvidia's growth trajectory seems unstoppable.

Although contenders like OpenAI and Microsoft are developing their own AI chips, Nvidia's prowess remains unmatched. With plans to release the H200 GPU and to pursue the $30 billion market for custom AI chips, Nvidia's ascent shows no signs of slowing down in the cutthroat chip market.

THE VERGE

OpenAI updates GPT Store with ratings and expanded builder profiles

OpenAI beefs up GPT Store post-launch, adding user ratings and more builder info. Now, users can rate third-party chatbots on a 1-5 star scale and offer private feedback. Builders can enhance profiles with links to LinkedIn, X, and websites, alongside average rating and total conversations stats.

Despite promises of revenue sharing based on usage, details remain murky. These updates follow a recent glitch where ChatGPT spat out nonsensical responses. OpenAI aims to foster responsible AI growth amid tech advancements, striving for transparency and user engagement in the evolving chatbot marketplace.

VENTUREBEAT

Jeff Bezos and Nvidia join OpenAI and Microsoft in backing a humanoid robot unicorn valued at $2 billion, sources say

Tech titans Jeff Bezos and Nvidia, along with firms like Microsoft, are investing in Figure AI Inc., a startup pioneering humanoid robots. The company, valued at $2 billion, is raising $675 million. Figure aims to create robots capable of performing dangerous tasks and addressing labor shortages. Notable backers include Intel, LG Innotek, and Samsung. OpenAI, which once considered acquiring Figure, is also investing.

The move reflects a growing interest in AI-driven robotics, with other companies such as Sanctuary AI and Tesla also venturing into the field. Bezos, with a net worth of $197.1 billion, continues his tech investments after his tenure as Amazon CEO.

FORTUNE

🧠RESEARCH

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

OpenCodeInterpreter is a new tool that bridges the gap between open-source code generation models and the sophisticated GPT-4 Code Interpreter. It can generate, run, and refine code, and it is trained on a dataset of 68K interactions that teaches it to improve its output from feedback. It almost matches GPT-4 in tests, showing how open-source can compete with proprietary tech in coding tasks.

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

"Searchformer" is a game-changer in solving complex puzzles like Sokoban. Unlike the traditional A* method, Searchformer, a Transformer-based model, cracks these puzzles 93.7% of the time and does it more efficiently—using fewer steps. It's trained to mimic and then outdo A*'s planning process, showcasing the power of Transformers in strategic decision-making tasks, proving smaller models and datasets can still deliver top-notch performance.

PALO: A Polyglot Large Multimodal Model for 5B People

PALO is a groundbreaking Vision-Language Model (VLM) designed for inclusivity, supporting visual reasoning in 10 major languages covering 65% of the global population. It leverages a semi-automated translation technique for dataset adaptation, ensuring linguistic fidelity with minimal manual effort. This approach significantly enhances performance, particularly for underrepresented languages. PALO is tested across different scales and demonstrates notable advancements over existing models. The authors also introduce a multilingual multimodal benchmark for future evaluations.

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis

"Snap Video" is a cutting-edge model tailored for video generation that overcomes the limitations of traditional image models when applied to videos. By addressing spatial and temporal redundancy, Snap Video enhances motion fidelity, visual quality, and scalability. It employs a new transformer-based architecture, significantly speeding up training and inference compared to U-Nets. This advancement enables the creation of high-quality, complex motion videos from text, outperforming existing methods and receiving strong preference in user studies.

Subobject-level Image Tokenization

The paper introduces a novel approach to image tokenization, moving beyond traditional fixed-size patches to a subobject level, focusing on semantically meaningful image segments. This method, inspired by subword tokenization in language models, uses a Sequence-to-sequence AutoEncoder (SeqAE) to compress these segments into compact embeddings for vision language learning. Empirical results show this approach significantly improves the learning process for translating images into object and attribute descriptions, compared to patch-level tokenization. This advancement promises enhanced efficiency and effectiveness in vision-language model training.

🛠️TOP TOOLS

Scandilytics AI - AI-driven insights & automated reporting for your eCommerce

Agent Gold - Chat with any YouTuber

SpellcraftAI - Python library for making efficient, rate-limited, asynchronous batch requests to the OpenAI API.

AI Voice Changer - transform your voice into different characters with control over emotion and delivery.

Stable Cascade - a new text-to-image model that builds on the Würstchen architecture.
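
To illustrate what "rate-limited, asynchronous batch requests" means in the SpellcraftAI entry above, here is a minimal sketch using only Python's standard library. It does not use that library's API or the OpenAI client: `run_batch`, `fake_completion`, and the parameters `max_concurrent` and `min_interval` are hypothetical names chosen for this example.

```python
import asyncio
import time

async def run_batch(prompts, call, max_concurrent=2, min_interval=0.01):
    """Run call(prompt) for each prompt with bounded concurrency and spaced starts.

    Hypothetical helper for illustration; not the API of any real library.
    """
    semaphore = asyncio.Semaphore(max_concurrent)  # caps requests in flight
    start_lock = asyncio.Lock()                    # serializes the pacing check
    last_start = 0.0

    async def worker(prompt):
        nonlocal last_start
        async with semaphore:
            async with start_lock:
                # Wait until at least min_interval has passed since the last start
                wait = last_start + min_interval - time.monotonic()
                if wait > 0:
                    await asyncio.sleep(wait)
                last_start = time.monotonic()
            return await call(prompt)

    # gather returns results in the same order as the input prompts
    return await asyncio.gather(*(worker(p) for p in prompts))

async def fake_completion(prompt):
    # Stand-in for a real API request (assumption: any awaitable would do here)
    await asyncio.sleep(0.01)
    return prompt.upper()

if __name__ == "__main__":
    print(asyncio.run(run_batch(["a", "b", "c"], fake_completion)))
```

The semaphore limits how many requests run at once, while the lock and timestamp space out request starts. Real rate limits are usually expressed per minute, so the interval would be derived from that quota.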

🗞️MORE NEWS

Google explains Gemini’s ‘embarrassing’ AI pictures of diverse Nazis

Google explains issues with its Gemini AI tool, admitting it produced inaccurate images due to tuning problems. The model aimed for diversity but overcompensated, generating racially diverse Nazis and being overly cautious in other cases. Google apologizes, halts image generation, and pledges to improve the tool before relaunching. THE VERGE

Humane pushes Ai Pin ship date to mid-April

Humane has postponed its Ai Pin launch to mid-April, citing typical hardware challenges. The startup, backed by significant funding, aims to offer a novel form factor utilizing generative AI, breaking away from the smartphone norm. Despite setbacks and recent layoffs, preorders remain open with incentives for early adopters. TECHCRUNCH

Microsoft adds more AI to Photos in Windows 10 and 11

Microsoft is infusing more AI into Windows 10 and 11 Photos, introducing features like Generative erase and background effects. Despite Windows 10's upcoming end of support in 2025, Microsoft continues to add AI functionality, aiming to compete with rival platforms. Updates will roll out to Windows Insiders, signaling Microsoft's commitment to AI innovation. THE REGISTER

Armenia’s 10web brings AI website-building to WordPress

Armenian startup 10web utilizes AI to simplify website creation on WordPress, a complex platform, aiming to tap into its vast user base. With a focus on usability and leveraging Armenia's tech talent, 10web anticipates substantial growth, targeting the US market for funding and expansion, capitalizing on its advantageous cost structure. TECHCRUNCH

Healthcare startup Abridge raises $150 mln for AI model for clinicians

The healthcare startup Abridge secured $150 million in Series C funding for its AI-driven clinical documentation tools, valuing the company at $850 million. Led by Lightspeed Venture Partners and Redpoint Ventures, the investment will fuel expansion, hiring, and AI model development for medical applications. REUTERS

Dozens Of KFC, Taco Bell And Dairy Queen Franchises Are Using AI To Track Workers

Dozens of KFC, Taco Bell, and Dairy Queen franchises in the US are implementing an AI system called Riley to track and analyze employee-customer interactions. Riley rewards workers who excel in sales performance with bonuses, marking a shift towards AI-driven performance evaluation in the fast-food industry. FORBES

This AI app will soon screen for type 2 diabetes using just a 6-10 second voice clip

An AI app developed by Klick Health in Toronto, Canada, uses voice analysis to screen for type 2 diabetes from a 6-10 second voice clip. The technology offers a convenient, cost-effective, and non-invasive method for early detection of the disease, potentially saving lives and improving healthcare accessibility. ZDNET