NATURAL 20

OpenAI Launches Robotics Innovation Team

PLUS: Ex-OpenAI Leader Joins Murati’s Team, Luma Unveils Powerful Ray2 Model and more.

Wes Roth
January 17, 2025

Today:

OpenAI Launches Robotics Innovation Team
Microsoft Introduces MatterGen for Materials
NVIDIA Unveils NIM Microservices Safeguards
Ex-OpenAI Leader Joins Murati’s Team
Luma Unveils Powerful Ray2 Model
Black Forest Labs Launches API

OpenAI Launches Robotics Innovation Team

OpenAI is expanding into robotics, hiring for key roles like systems integration engineers and mechanical product engineers to build robots with advanced AI capabilities. The team aims to develop general-purpose robots with AGI-level intelligence for dynamic real-world environments. This move marks OpenAI’s most significant investment in robotics, signaling competition with companies like Figure and a stronger focus on blending hardware and software for cutting-edge AI solutions.

Microsoft Introduces MatterGen for Materials

Microsoft Research has introduced MatterGen, a generative AI tool that revolutionizes materials design by directly generating new materials based on specific application requirements. Unlike traditional screening, which sifts through existing materials, MatterGen creates novel compounds with tailored properties, such as chemistry and mechanical strength. Trained on vast databases, it outperforms screening methods, offering an efficient way to explore unknown materials. MatterGen could significantly impact industries like battery and fuel cell development.

NVIDIA Unveils NIM Microservices Safeguards

NVIDIA has introduced new NIM microservices as part of its NeMo Guardrails suite to enhance the safety, security, and performance of AI applications. These tools help businesses build trustworthy AI agents for customer service by ensuring safe, ethical, and contextually appropriate interactions. Industry leaders like Amdocs, Cerence AI, and Lowe’s are already using the microservices to improve AI-driven customer experiences, while NVIDIA also offers open-source tools like Garak for testing AI vulnerabilities.

Ex-OpenAI Leader Joins Murati’s Team

Mira Murati's AI startup has made significant moves by hiring Jonathan Lachman, former head of special projects at OpenAI, marking her first major recruit. Since leaving OpenAI in September 2024, Murati's venture has attracted talent from various leading AI firms. The startup, focused on artificial general intelligence, is generating considerable attention within the AI community.

Luma Unveils Powerful Ray2 Model

Luma has launched Ray2, an advanced AI video model trained with 10 times the compute of its predecessor. The new multimodal architecture improves natural movement and fine detail and keeps scenes consistent. Ray2 is available for text-to-video generation via Luma's Dream Machine platform, with additional features like image-to-video and video editing tools coming soon. The company plans to make the model accessible through an API in the near future.

Black Forest Labs Launches API

Black Forest Labs has launched an API that allows users to fine-tune its FLUX Pro AI image model with as few as five sample images, enabling the customization of visual styles and brand identities. The API maintains the model's flexibility while incorporating specific user content. Burda Verlag uses this API to create brand-specific images for its Lissy PONY children's brand. The service is currently in beta, with further details on pricing and availability yet to be announced.

🧠RESEARCH

Towards Best Practices for Open Datasets for LLM Training

The paper discusses the challenges and legal complexities of training large language models (LLMs) on open datasets. While some regions allow this, the trend of limiting dataset transparency harms innovation. The authors emphasize the need for collaboration in creating openly licensed, responsibly curated datasets to ensure accountability and progress in AI development.

MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents

The paper introduces MMDocIR, a benchmark for evaluating multi-modal document retrieval systems. It supports page-level and layout-level tasks, focusing on retrieving relevant content like figures and tables. With 1,685 expert-labeled and 173,843 bootstrapped questions, its results highlight the superiority of visual retrievers and the importance of visual elements in multi-modal retrieval performance.

CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities

The paper presents CityDreamer4D, a generative model for creating realistic, unbounded 4D cities by separating dynamic objects (e.g., vehicles) from static elements (e.g., buildings). Using neural fields and datasets like OSM and Google Earth, it achieves state-of-the-art city generation, enabling applications like city stylization, urban simulation, and instance editing.

3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering

The paper presents 3DIS-FLUX, an extension of the 3DIS framework for efficient multi-instance image generation. By integrating the FLUX model for depth-controlled rendering and fine-grained attribute manipulation, it eliminates the need for frequent retraining. Experiments show 3DIS-FLUX surpasses prior 3DIS methods and adapter-based approaches in performance and image quality, advancing text-to-image generation.

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

The paper presents Omni-RGPT, a multimodal large language model for region-level understanding of images and videos. Using Token Mark, it aligns visual and text prompts for consistent region representation. Enhanced by a large-scale dataset (RegVID-300k), Omni-RGPT achieves state-of-the-art results in commonsense reasoning, captioning, and referring expression comprehension tasks.

🛠️TOP TOOLS

Nero AI Image Upscaler - AI-powered tool designed to enhance and enlarge digital images while maintaining quality.

Droxy AI - No-code chatbot builder that enables users to create intelligent AI chatbots from various content sources.

Power Drill AI - Advanced data analysis platform that simplifies complex data interactions.

Waveformer - Open-source web application that transforms text inputs into music.

AnySummary - AI-powered tool that efficiently condenses various file types, including text documents, audio recordings, and video content, into concise summaries.

📲SOCIAL MEDIA

3D arrived to Krea.
this new feature lets you turn images into 3D objects and use them in our Real-time tool.
free for everyone.
— KREA AI (@krea_ai)
4:32 PM • Jan 16, 2025

🗞️MORE NEWS

Apple has disabled AI-powered news summaries in its beta software after errors, such as inaccurate headlines, were flagged. This move highlights ongoing challenges with Apple Intelligence, a key feature introduced with recent iPhone models.
Mistral AI has partnered with AFP to enhance Le Chat, its AI chatbot, by incorporating AFP's news content. This agreement allows Le Chat to access AFP's vast archive, improving response accuracy in six languages.
Point72's AI-focused fund, launched in October 2024, grew to nearly $1.5 billion with a 14% return, showcasing market optimism about AI. It's Point72's first new fund in decades, outperforming Nasdaq benchmarks.
Legal AI startup Harvey is raising $300 million in funding led by Sequoia Capital, doubling its valuation to $3 billion. The San Francisco-based company develops generative AI tools for law firms.
Italian AI startup iGenius plans to complete a $1 billion Nvidia-powered data center in southern Italy by summer 2025. It launched a new large language model targeting highly regulated industries for secure, efficient AI solutions.
Google's new Titans neural network architecture combines attention and neural memory layers to handle long sequences efficiently, reducing costs and enabling scalable AI models. Titans outperform GPT-4 in long-sequence tasks and promise enterprise-friendly applications.
