Google’s New AI Tools
PLUS: OpenAI & Microsoft Fund Harvard’s Dataset, Google’s Android XR Brings AI to AR and more.
Today:
Google’s New AI Tools
ChatGPT Now Supports Real-Time Video
Anthropic Launches Fastest AI Model
OpenAI & Microsoft Fund Harvard’s Dataset
Google’s Android XR Brings AI to AR
Lambda Launches Low-Cost AI Inference API
Google's MASSIVE AI RELEASES | Gemini 2.0, AI Agents, AI Gaming, Spatial Reasoning, Astra and MORE!
Google recently unveiled several new projects. Astra, its enhanced AI voice assistant, now offers live coaching for video games and step-by-step cooking guidance, and can even evaluate your cooking techniques using camera input. Project Mariner is a browser agent that navigates the web in real time to perform tasks like updating spreadsheets automatically.
Gemini 2.0 introduces advanced AI for gaming assistance, helping with strategies and reminders, and can also create and modify images based on user prompts. Additionally, Google is developing spatial understanding tools that recognize and interact with objects in images and videos, paving the way for more interactive and intuitive applications.
OpenAI has launched real-time video features for ChatGPT, seven months after its initial demo. This update, called Advanced Voice Mode with vision, allows ChatGPT to respond instantly when users point their phones at objects. It can also understand and explain what’s on a device’s screen, helping with tasks like navigating settings or solving math problems.
Available to ChatGPT Plus, Team, and Pro subscribers, the feature starts rolling out this week but won’t be available to all users immediately. OpenAI also introduced a fun “Santa Mode,” adding Santa’s voice to ChatGPT. Competitors like Google and Meta are developing similar technologies.
Anthropic has launched its new AI model, Claude 3.5 Haiku, for users of its chatbot platform, Claude. The model performs as well as or better than Anthropic’s previous top model, Claude 3 Opus, especially in areas like computer programming, organizing data, and managing content.
Claude 3.5 Haiku can produce longer responses and includes more recent information. However, it cannot analyze images, unlike some other models from Anthropic. The release faced some issues when Anthropic increased the cost after initially promising the same price as the older model, causing minor controversy.
Harvard University is releasing a large, free collection of nearly one million public-domain books to help anyone build AI tools. Funded by OpenAI and Microsoft, this dataset includes classics like Shakespeare and Dickens, as well as specialized texts in various languages. Created by Harvard’s Institutional Data Initiative, the project aims to make high-quality resources available to smaller companies and individual researchers, leveling the playing field with big tech firms.
While the dataset focuses on text and doesn’t include images, it complements other efforts to provide accessible AI training materials. This move supports ethical AI development and reduces reliance on copyrighted works.
Google has introduced Android XR, a new operating system for augmented and virtual reality (AR/VR) devices like headsets and smart glasses. Developed with Samsung and Qualcomm, Android XR aims to make interacting with technology more natural using artificial intelligence (AI). The platform allows users to switch between virtual environments and the real world seamlessly. It includes an AI assistant named Gemini, which helps with tasks like planning and research.
Android XR supports popular Google apps, enabling features like watching YouTube on a virtual screen and using Google Maps in 3D. The first device, called Project Moohan, will be available next year, with more devices and apps expected to follow.
Lambda Labs has launched the Lambda Inference API, offering the lowest-cost service for deploying artificial intelligence (AI) models. This new service allows businesses to use AI applications without needing to buy or manage powerful computer hardware. The API supports popular AI models and charges based on usage, starting at just $0.02 for every million words processed.
Lambda’s extensive experience with graphics processing units (GPUs) helps keep costs low. Aimed at both startups and large companies, the platform makes AI accessible and easy to use. In the future, Lambda plans to add features for video and image-based AI tools.
🧠RESEARCH
SynCamMaster is a method for generating consistent multi-camera videos from diverse viewpoints, addressing challenges in virtual filming. It enhances a pre-trained text-to-video model with a synchronization module to maintain visual consistency. The approach uses a hybrid training method and includes a new synchronized video dataset for research.
LAION-SG is a large-scale dataset with detailed scene graph annotations for better text-to-image generation. It addresses the challenge of creating complex images with multiple objects and relationships. Using this dataset, the authors train the SDXL-SG model, which significantly improves performance in generating intricate scenes. They also propose CompSG-Bench, a new benchmark for evaluating compositional image generation.
POINTS1.5 is a vision-language model designed for real-world applications. It improves on POINTS1.0 with a dynamic high-resolution vision encoder, bilingual support (especially for Chinese), and advanced filtering methods for training datasets. POINTS1.5 outperforms its predecessor and excels in various tasks, ranking first among models with under 10 billion parameters.
StyleMaster is a video stylization model that improves upon existing methods by enhancing both style consistency and temporal coherence. It incorporates local texture features and uses a motion adapter to adapt image models to video. StyleMaster outperforms other models in generating high-quality stylized videos that closely match the reference style and content.
StyleStudio is a text-driven style transfer method that improves style alignment and control. It introduces three innovations: a cross-modal AdaIN mechanism for better integration of style and text, a Style-based Classifier-Free Guidance for selective stylistic control, and a teacher model to stabilize generation. These enhancements significantly improve style transfer quality and alignment with text.
🛠️TOP TOOLS
PicFinder AI - AI-powered image generation tool that transforms textual descriptions into unique visual content.
Unboring AI - AI-powered online platform that offers innovative tools for face swapping, photo animation, and video restyling, allowing users to create engaging and entertaining content with just a few clicks.
Crayo - Transform simple text inputs into viral-ready videos.
Dubverse AI - AI-powered platform that revolutionizes video dubbing and subtitling, offering creators and businesses the ability to make their content multilingual instantly across 30+ languages with human-like AI voices.
Bing Create - AI-powered tool that transforms text descriptions into visual art, offering users the ability to generate unique images from their ideas.
📲SOCIAL MEDIA
Google broke the Internet with mind blowing Gemini 2.0 Real-Time demos yesterday.
And people are already doing wild use cases with it.
10 examples:
1. Gaming will never be the same
— Min Choi (@minchoi)
5:19 PM • Dec 12, 2024
🗞️MORE NEWS
Character.AI updated its chatbots to protect teens by using separate versions that avoid romantic and sensitive topics. They’re adding parental controls, blocking harmful content, and directing users to help lines after lawsuits and safety concerns.
Spanish startup Maisa raised $5 million to make AI reliable for businesses. Instead of just providing answers, Maisa’s system shows each step, helping companies understand and trust AI results. This approach reduces errors and increases transparency.
Researchers at the University of Sydney created SwagBot, an AI-powered cattle-herding robot. It monitors pasture and animal health and moves cattle to better grazing areas, which helps prevent land damage and lets farmers manage their farms more sustainably and efficiently.
Reddit has introduced "Reddit Answers," an AI feature that uses a chatbot to summarize information from Reddit’s own posts. Unlike other AI tools, it draws only on Reddit content, so answers are based on real user discussions. Reddit expects this to strengthen user trust and engagement.
Google’s AI, Gemini, can now summarize entire Drive folders. By clicking “Summarize this folder,” users receive an overview of documents and images inside. This feature helps organize files easily and is available to premium and business subscribers.
YouTube is adding AI-generated replies for creators to respond to comments easily. However, these automated responses often become confusing, incorrect, or oddly personal. Creators like Clint Basinger criticize them for harming genuine interactions and trust.
Nvidia is hiring 200 additional employees in China to boost its AI-driven car research, bringing its Beijing team to about 600. The expansion comes despite a Chinese anti-monopoly investigation and a drop in China’s share of Nvidia’s revenue from 26% to 17%.