Reflection 70b Sparks AI Debate

PLUS: Google Launches Gemini Live Chatbot, DeepMind Introduces ALOHA, DemoStart AI and more.

Today:

  • Reflection 70b Sparks AI Debate

  • Meta Trains AI with UK Content

  • Google Launches Gemini Live Chatbot

  • Apple Introduces On-Device UI-JEPA

  • DeepMind Introduces ALOHA, DemoStart AI

  • Poolside Nears $3B Valuation

  • EvenUp Nears $1B Valuation

This AI is a little bit TOO good…

Matt Shumer, CEO of HyperWrite AI, announced an open-source language model called Reflection 70b, claiming it surpassed leading models like GPT-4. He credited its success to a new method called reflection tuning and mentioned a collaboration with Glaive AI, a company he had invested in.

Initially celebrated, the model faced scrutiny when users found that it underperformed the claimed results and suspected it was returning responses from another AI model, Claude. This raised concerns about misrepresentation and possible deceit. Both Shumer and Glaive AI's founder issued apologies, but the incident damaged their credibility and trust within the AI community.

Building AI Technology for the UK in a Responsible and Transparent Way

Meta is beginning to train its AI using public content shared by adults on Facebook and Instagram in the UK. This approach aims to make its AI models better reflect British culture, history, and language. After pausing the effort earlier to address concerns from the UK's Information Commissioner's Office (ICO), Meta has made changes to be more transparent. The company emphasizes that it won't use private messages or data from users under 18.

Starting next week, UK adults will receive notifications explaining how they can opt out if they don't want their public data used for AI training. The opt-out process has been made simpler and easier to find.

Google rolls out voice-powered AI chat to the Android masses

Google has released "Gemini Live," a voice-activated AI chatbot now available for free to all Android users. This feature lets people talk to Gemini using their voice and receive spoken answers. Users can have conversations, interrupt the AI mid-sentence, and choose different voice styles for responses. 

Unlike similar features from competitors that aren't widely available yet, Gemini Live is accessible to everyone on Android devices. Currently, it supports only English, but Google plans to add more languages and bring it to iPhone users soon. 

Apple aims for on-device user intent understanding with UI-JEPA models

Apple researchers have created UI-JEPA, a new AI model that understands what users want to do based on how they interact with their devices. Unlike large AI models that require a lot of computing power and run on distant servers, UI-JEPA is small enough to work directly on your device. 

This makes it faster and better at protecting your privacy. It learns by focusing on the important parts of your actions without needing labeled data. The researchers also made new testing tools to show that UI-JEPA works well compared to bigger models. 

Our latest advances in robot dexterity

Google DeepMind has developed two new AI systems, ALOHA Unleashed and DemoStart, to improve robot dexterity. ALOHA Unleashed enables robots with two arms to learn complex tasks like tying shoelaces and hanging shirts by mimicking human demonstrations. DemoStart uses computer simulations to teach a robot hand to perform delicate tasks such as tightening screws or inserting plugs. 

DemoStart starts with simple actions and gradually tackles harder ones, requiring far fewer examples than traditional methods. These advancements help robots handle objects more skillfully, bringing them closer to performing useful tasks in everyday environments.

AI startup Poolside nears $3 billion valuation before ever releasing product

AI startup Poolside, founded in early 2023 by former GitHub Chief Technology Officer Jason Warner and software entrepreneur Eiso Kant, is reportedly close to securing funding that would value the company at $3 billion—even before releasing a product. 

Poolside aims to develop artificial intelligence that can write software code, potentially changing how software is developed. This substantial valuation reflects strong investor confidence in the company's vision to automate programming tasks using AI. Despite not having a market-ready product yet, Poolside's approach highlights the growing interest in AI solutions that can handle complex jobs like coding.

Legal AI Startup EvenUp In Talks to Raise at $1 Billion Valuation

EvenUp, a San Francisco-based startup, develops artificial intelligence software that assists personal injury lawyers in preparing claims by analyzing medical records and case files. The company is reportedly in talks to raise new funding at a $1 billion valuation, which would double its worth since its last funding round led by Lightspeed Venture Partners. 

This would mark EvenUp's third funding round in just over a year, bringing its total funding to about $100 million. Current investor Bain Capital Ventures may lead the new investment. The founders declined to comment on the ongoing discussions. EvenUp aims to reduce costly paperwork for lawyers using AI.

🧠RESEARCH

Current AI models perform well in some areas but struggle with real-world data science tasks. DSBench offers 540 tasks from actual competitions, involving complex data and situations. Tests show AI solves only about a third, highlighting the need to develop smarter, more independent data science agents.

Windows Agent Arena is a testing environment where AI agents perform tasks in a real Windows operating system using standard applications and tools. It includes over 150 diverse tasks requiring planning and understanding. The authors' agent, Navi, succeeded in 19.5% of tasks, highlighting the need for better AI agents in complex environments.

AI agents often fail at complex tasks requiring many steps. Humans solve such tasks by reusing learned routines. The authors introduce Agent Workflow Memory (AWM), which helps AI agents learn and reuse task workflows. Applied to web navigation benchmarks, AWM significantly improves success rates and efficiency, showing better generalization across tasks and domains.

INTRA is a method that helps AI understand how objects can be used without needing detailed image labels. Instead of relying on paired datasets, INTRA learns from regular images using comparison techniques. It outperforms previous methods and works well with new objects and actions.

Large AI models struggle with tasks needing complex reasoning and structured data. The authors introduce Source2Synth, a method to teach these models new skills without costly human annotations. It generates synthetic data grounded in real sources and filters out poor-quality examples, significantly improving performance on advanced question-answering tasks.

🛠️TOP TOOLS

NotebookLM - Personalized AI research assistant powered by Google's most capable model, Gemini 1.5 Pro. 

TheyDo - Enables your business to align strategy, planning, and delivery around journeys: your #1 customer-facing product.

Filmora - Transform your video into a piece of art with a simple drag & drop interface and powerful editing tools!

Serra - Identify and source top inbound & outbound candidates with Serra's AI-powered search engine.

VideoGen - Generate videos in seconds with AI.

🗞️MORE NEWS

The White House met with industry leaders to discuss keeping the U.S. in the lead on artificial intelligence (AI). It announced plans to support building large AI data centers, such as creating a dedicated task force, easing permitting, and repurposing old coal sites. Industry leaders pledged to cooperate and to use clean energy.

Fei-Fei Li, a leading artificial intelligence researcher known as the "godmother of AI," has raised $230 million for her startup, World Labs. The company aims to develop AI that understands how the three-dimensional physical world works—called "spatial intelligence"—which could advance virtual reality and robotics. She will continue her work at Stanford University while building the startup.

Salesforce has launched Agentforce, a new set of easy-to-use tools that let companies create AI agents capable of thinking and acting on their own in sales, service, marketing, and commerce. These agents go beyond chatbots by making decisions and completing tasks without needing human help.

GitHub tested OpenAI's new AI model, o1-preview, in their coding assistant Copilot. The model has better problem-solving skills, helping programmers improve complex code and quickly fix performance issues. GitHub plans to use it to make developers' work easier and is excited about its future possibilities.

Oprah hosted an AI special with Sam Altman and Bill Gates to discuss AI's impact and risks. Altman stressed the need for safety testing in AI development. Gates highlighted AI's potential in education and healthcare. FBI Director Christopher Wray warned about AI's misuse in deepfakes and online scams. 

OpenAI's new o1 AI models work best with simple, direct prompts because they already reason through problems on their own. Users should avoid detailed guidance and step-by-step prompts. Clear delimiters like quotation marks help the model tell sections of the input apart. Adding too much extra information can confuse it. This means prompt writing techniques will need to change.
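For readers who want to try this advice, here is a minimal sketch in Python of what a short, delimited prompt could look like. The `build_prompt` helper and the sample email are our own illustration, not part of any OpenAI library, and the code only builds the prompt text; it does not call a model.

```python
def build_prompt(instruction: str, sections: dict[str, str]) -> str:
    """Join one short instruction with labelled, delimited input sections.

    Hypothetical helper for illustration only; not an OpenAI API.
    """
    parts = [instruction.strip()]
    for label, text in sections.items():
        # Triple quotation marks show where each section starts and ends.
        parts.append(f'{label}:\n"""\n{text.strip()}\n"""')
    # Blank lines between parts keep the prompt easy to scan.
    return "\n\n".join(parts)


prompt = build_prompt(
    "Summarize the email below in two sentences.",
    {"Email": "Hi team, the launch moves to Friday. Please update the docs."},
)
print(prompt)
```

The instruction stays to one direct sentence, with no step-by-step guidance, and only the material the model needs goes between the delimiters.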

Learn AI with us.

Let’s Build the Future Together.

Hello fellow AI-obsessed traveler,

Over the past 2 years, as we’ve grown to over 250,000 subscribers between the YouTube Channel and this newsletter, we've received an overwhelming number of requests for one specific thing.

While the newsletter helps keep you up to speed with AI news, many of you have asked for the next step: to learn how to actually apply AI in your work.

Today we’re finally announcing the solution with NATURAL 20, the community for like-minded AI learners. As a loyal newsletter reader, you are getting access at the lowest price it will ever be:

 JOIN NATURAL 20 AI UNIVERSITY TODAY

What you get:

* Tutorials by experts across various AI fields.

* Daily tutorials by Wes Roth about the latest use cases.

* Building Autonomous AI Agents to Automate Your Life and Business (NEW!)

* A network of the top 1% of early AI adopters.

* Access to community-only resources and software.

* And many more features rolling out soon.
