• NATURAL 20
  • Posts
  • Google released Gemma 2, Gemini 1.5 Flash and Gemini 1.5 Pro

Google released Gemma 2, Gemini 1.5 Flash and Gemini 1.5 Pro

PLUS: SoftBank Backs Perplexity AI, Andrew Ng's AI Fundraising and more.

Today:

  • Google released Gemma 2, Gemini 1.5 Flash and Gemini 1.5 Pro

  • OpenAI postponed ChatGPT's Voice Mode, developed CriticGPT

  • SoftBank Backs Perplexity AI

  • Andrew Ng's AI Fundraising

  • Music Labels Sue AI Startups

  • YouTube Negotiates Music Licensing

GOOGLE

  • Google launched Gemma 2, a new AI model series, with 9-billion and 27-billion parameter versions. Available next month on Vertex AI, these lightweight models are designed for easy integration into apps and devices. They offer improved performance and efficiency compared to previous versions, targeting developers who want to add AI capabilities to various projects

  • Google has released two versions of its Gemini AI model to the public: Gemini 1.5 Flash and Gemini 1.5 Pro. Flash is faster and cheaper, while Pro has a huge 2 million token context window. This allows the models to process massive amounts of text, audio, and video. Google aims to help businesses create powerful AI tools with these updates

  • Google has released Imagen 3, its latest text-to-image AI model, on Vertex AI for select customers. This new version offers faster image generation, better prompt understanding, improved people rendering, and enhanced text control. It includes multi-language support, safety features, and multiple aspect ratios. Shutterstock is already using Imagen 3 to generate millions of images

  • Google Cloud is expanding its Vertex AI platform with new generative AI models and features. The Model Garden now includes over 100 large language models, including third-party options like Meta's Llama 2 and Anthropic's Claude 2. Google has also improved its own models like PaLM 2 and added new capabilities for customization, real-time data integration, and image generation.

OPENAI

  • OpenAI has postponed the launch of ChatGPT's Voice Mode, originally set for June. The delay is due to ongoing work on content detection, user experience improvements, and infrastructure scaling. The feature will now start with a small test group in July, with a full release to paying users expected in autumn, pending safety and reliability checks.

  • OpenAI has developed CriticGPT, an AI based on GPT-4, to identify errors in ChatGPT’s outputs. When human trainers use CriticGPT, they catch more errors, improving accuracy and reducing hallucinations. This AI tool enhances the trainers' ability to critique ChatGPT responses, aiming for better alignment and evaluation of advanced AI systems.

  • OpenAI acquired Multi, a video-based remote collaboration platform. The deal, for an undisclosed amount, brings Multi's five-person team to OpenAI. Multi will shut down on July 24, 2024. This purchase aligns with OpenAI's focus on enterprise solutions, following its recent acquisition of Rockset and its growing corporate ChatGPT user base

SoftBank to invest in search startup Perplexity AI at $3 bln valuation, Bloomberg reports

SoftBank is investing $73.6 million in Perplexity AI, a search startup, valuing the company at $3 billion. Perplexity AI offers an AI-powered search engine that aims to provide direct answers to queries, competing with traditional search engines like Google. The company has gained popularity, especially among younger users, and has raised $100 million in previous funding rounds. 

This investment comes as part of a larger trend of increased funding in AI startups, with SoftBank being a major player in the field. Perplexity AI's approach to search, which focuses on delivering concise answers rather than lists of links, has attracted attention in the rapidly evolving AI landscape.

Andrew Ng plans to raise $120M for next AI Fund

Andrew Ng’s AI Fund is raising $120 million for its second fund, AI Venture Fund II. So far, $69.75 million has been secured from 13 partners, leaving $50 million to reach the goal. Ng, known for his work with Google Brain and Coursera, founded the AI Fund in 2018 to support AI startups. Initial backers included Greylock Partners and Sequoia Capital. 

This new fund is smaller than the first but still exceeds initial expectations. The fundraising comes amid a cooling period for AI investments, with early-stage dealmaking in generative AI dropping significantly.

Record Labels Sue Two Startups for Training AI Models on Their Songs

Record labels Universal Music Group, Warner Music Group, and Sony Music Entertainment are suing AI startups Suno and Udio for up to $150,000 per work infringed. The lawsuits, filed by the Recording Industry Association of America, accuse these companies of unlawfully training their AI models on copyrighted sound recordings to automate music creation. 

This legal action highlights the music industry's aggressive stance against generative AI technology that allows easy music generation based on existing songs, aiming to protect their intellectual property rights from unauthorized use.

YouTube tries convincing record labels to license music for AI song generator

YouTube is negotiating with record labels to license songs for an AI tool that generates music, aiming to secure legal access to train its AI models. The company has offered upfront payments to major labels like Sony, Warner, and Universal. 

Despite YouTube's efforts, many artists fear AI-generated music could devalue their work. The new tool, potentially part of YouTube’s Shorts platform, would follow a previous limited trial. The talks come amid broader industry moves, including recent lawsuits against AI start-ups for unauthorized use of copyrighted music to train AI models.

🧠RESEARCH

Adam-mini is a new optimizer that offers comparable or superior performance to AdamW while using 45% to 50% less memory. By optimizing learning rate resources in Adam, it assigns a single efficient learning rate to each parameter block, enhancing efficiency across various models and reducing communication overheads in GPU and CPU interactions.

FineWeb is a large dataset derived from 96 Common Crawl snapshots containing 15 trillion tokens. It aims to improve the performance of large language models like Llama 3 and Mixtral by offering superior quality and size compared to existing open pretraining datasets. The authors detail their curation strategies, including deduplication and filtering techniques, and also introduce FineWeb-Edu, a subset optimized for educational content.

YouDream introduces a novel method for generating high-quality 3D animals controllable via text inputs. Unlike previous approaches limited by text or sourced images, YouDream utilizes a text-to-image diffusion model guided by 2D views of 3D poses. This allows for anatomically consistent creations not achievable with prior text-to-3D methods. The method includes an automated pipeline and demonstrates superior model preferences in a user study.

DreamBench++ introduces a benchmark for personalized image generation aligned with human evaluation. Addressing current shortcomings in automated and human evaluations, it employs advanced multimodal GPT models to systematically design prompts. This approach enhances human and self-alignment, supported by task reinforcement, and includes a diverse dataset. Comparative studies with seven generative models show significantly improved human-aligned evaluations, promising innovative advancements in the field.

Cambrian-1 introduces a family of multimodal language models (MLLMs) focusing on vision-centric design. Addressing gaps in visual representation learning, it explores various architectures and models, including self-supervised and supervised approaches across 20+ vision encoders. The study proposes CV-Bench, a new benchmark, and introduces Spatial Vision Aggregator (SVA) to enhance visual grounding. Cambrian-1 aims to advance multimodal systems with open resources, model weights, code, and comprehensive evaluation strategies.

🛠️TOP TOOLS

Lenso AI - AI-Powered Reverse Image Search

Relay - AI-powered automations with a human in the loop

Albus - Connect your work apps. Ask questions and get instant answers.

Pygma - Personal AI social media manager

Lazy - Build and modify web apps with prompts and deploy to the cloud with 1 click.

📲SOCIAL MEDIA

🗞️MORE NEWS

Thai Energy Billionaire Steps Up Data Center Push to Tap AI Boom

Energy billionaire Sarath Ratanavadi, Thailand’s second-richest person, is investing $271 million to expand his data center business. Gulf Energy Development Pcl, his company, will double the center's capacity in Bangkok, boosting energy consumption to 50 megawatts. This expansion aims to meet the growing demand for cloud computing and AI services. BLOOMBERG

Time Magazine partners with OpenAI and ElevenLabs

Time Magazine is partnering with OpenAI and ElevenLabs to integrate AI into its operations. OpenAI will access Time's content for model training, and ChatGPT will distribute Time's content. ElevenLabs will provide automated voiceovers for articles on Time's website. These collaborations aim to enhance Time's journalism and audience engagement through AI technology. VENTUREBEAT

Microsoft maintains AI services in Hong Kong, as OpenAI curbs API access from China

OpenAI is restricting API access from mainland China and Hong Kong due to national security concerns. Despite this, Microsoft will continue offering OpenAI tools to eligible Hong Kong customers via its Azure cloud platform. This move is expected to push users in China towards domestic AI platforms like Zhipu AI and Baidu. SCMP

Instagram is starting to let some creators make AI versions of themselves

Meta is testing AI Studio on Instagram, allowing creators to develop AI chatbots of themselves for messaging. Mark Zuckerberg announced the rollout, promising AI chatbots labeled as such to interact with users. This move aims to enhance user engagement, though the early stage means improvements are ongoing. THE VERGE

Amazon is reportedly working on its own AI chatbot that might be smarter than ChatGPT

Amazon is developing an AI chatbot named "Metis" akin to ChatGPT, equipped with advanced capabilities like retrieving information beyond its training data. Scheduled for a potential September launch, Metis aims to serve as a versatile assistant, handling tasks from answering questions to automating home functions. TECHRADAR

Reddit escalates its fight against AI bots

Reddit is tightening its stance against AI bots accessing its public data by updating its robots.txt file. This move aims to enforce existing policies requiring licensing deals for commercial use, similar to agreements with Google and OpenAI. This step reflects Reddit's effort to combat unauthorized data scraping and maintain control over how its content is utilized by AI models. THE VERGE

What'd you think of today's edition?

Login or Subscribe to participate in polls.

Learn AI with us.

Let’s Build the Future Together.

Hello fellow AI-obsessed traveler,

Over the past 2 years, as we’ve grown to over 250,000 subscribers between the YouTube Channel and this newsletter, we've received an overwhelming number of requests for one specific thing.

While the newsletter helps keep you up to speed with AI news, many of you have asked for the next step: to learn how to actually apply AI in your work.

Today we’re finally announcing the solution with NATURAL 20, the community for like-minded AI learners. As a loyal newsletter reader you are getting access at the lowest price it will ever be:

 JOIN NATURAL 20 AI UNIVERSITY TODAY

What you get:

* Tutorials by experts across various AI fields.

* Daily tutorials by Wes Roth about the latest use cases.

* Building Autonomous AI Agents to Automate Your Life and Business (NEW!)

* A network of the top 1% of early AI adopters.

* Access to community-only resources and software.

* And many more features rolling out soon.

Reply

or to participate.