Apple's ReALM AI Outperforms GPT-4
PLUS: Stable Audio 2.0 Launches, Resemble's Rapid Voice Tool and more.
Today:
Apple's ReALM AI Outperforms GPT-4
Top Talent Joins Google
New DALL-E Editing Options
Stable Audio 2.0 Launches
Brave Expands AI to iOS
Resemble's Rapid Voice Tool
Apple's new AI model outperforms GPT-4 | Is Apple Secretly Building AI Agents?
Apple, traditionally seen as behind in AI, is making strides in screen context understanding. After scrapping its car project, the company has shifted its focus to AI, hinting at an improved Siri and M3 chips. Notably, it unveiled an AI that understands screen context, hinting at breakthroughs in multimodal AI.
Its research paper introduces ReALM (Reference Resolution As Language Modeling), an approach that potentially surpasses GPT-4 at reference resolution. Apple's focus on on-device AI agents could revolutionize user interactions.
Google just scored a big win in the AI talent war
Google snags Logan Kilpatrick, OpenAI's former head of developer relations, to lead product for its AI Studio, marking a victory in the AI talent battle. Kilpatrick's move underscores Big Tech's scramble for top AI minds. His expertise in developer relations, dubbed Google's "secret weapon," is crucial for AI integration.
Excited to share I've joined @Google to lead product for AI Studio and support the Gemini API.
Lots of hard work ahead, but we are going to make Google the best home for developers building with AI.
I'm not going to settle for anything less.
- Logan Kilpatrick (@OfficialLoganK)
6:04 PM • Apr 2, 2024
Microsoft's recent hire of Mustafa Suleyman highlights the competitive landscape. Kilpatrick's departure from OpenAI prompted an outpouring of gratitude from developers, underscoring his impact. Aggressive recruitment tactics across the industry, including personal appeals from Meta's Mark Zuckerberg and Google's Sergey Brin, reflect the intensity of the competition. High compensation and immediate access to resources are key in attracting talent, with some salaries reaching $1 million.
Paid ChatGPT customers can now use AI to edit DALL-E images
OpenAI now lets paying users tweak DALL-E images through ChatGPT prompts. This move simplifies image refinement, previously a challenging task. By leveraging ChatGPT's language abilities, users can describe edits instead of navigating complex tools. In a demonstration on X, OpenAI showed bows being added to a poodle's ears in a DALL-E image.
Moreover, DALL-E introduces options to select aspect ratios and apply styles like "motion blur" or "solarpunk." Notably, these features are currently exclusive to paid users. This integration of language for editing could revolutionize various software domains, spanning video, audio, and image editing. AI-generated images also raise significant implications.
Stability AI brings new clarity and power to gen AI audio with Stable Audio 2.0
Stability AI unveils Stable Audio 2.0, expanding generative AI capabilities beyond text-to-image. This upgrade enables users to create high-quality audio tracks up to 3 minutes long and supports audio-to-audio generation. Zach Evans, head of audio research, highlights improvements in musicality and response accuracy to detailed prompts.
Stable Audio 2.0 leverages latent diffusion technology, offering complete musical compositions with distinct sections. The model, trained on licensed data from AudioSparx, prioritizes copyright protection. While not openly available, Stable Audio 2.0 aims for future openness. The release follows Stability AI's former CEO's resignation, signaling business continuity and commitment to innovation.
Brave is launching its AI assistant on iPhone and iPad
Brave introduces Leo, its AI assistant, on iPhone and iPad, expanding its functionality beyond Android and desktop. Leo offers voice-to-text capability, facilitating hands-free interaction. It can summarize pages, answer questions, generate reports, translate text, transcribe audio/video, and even write code.
Brave aims to provide an all-in-one AI assistant, reducing reliance on other services like ChatGPT. Leo accesses various AI models and offers a premium option for enhanced features. Users can enable Leo via Brave's browser settings. This launch follows other browser companies like Opera, which introduced its AI assistant, Aria, last year.
Resemble AI launches tool to make AI voice clones in a minute
Resemble AI launches Rapid Voice Cloning, a breakthrough tool that swiftly generates voice clones with minimal data input, revolutionizing the process. Unlike traditional methods requiring lengthy recordings, Rapid Voice Cloning needs just 10 seconds to 1 minute of clear audio for replication. It excels in capturing accents and nuances, enabling diverse applications in content creation, accessibility, and personalization.
While initial tests show some limitations, Resemble AI aims to support various English accents soon. This innovation streamlines voice cloning for content creators and businesses, enhancing user experiences. Competitors like ElevenLabs offer similar solutions, but Resemble's approach boasts accessibility and speed, with pricing plans starting at $29/month.
RESEARCH
Eurus, an upgraded large language model (LLM) geared towards reasoning, outperforms GPT-3.5 Turbo across various benchmarks, boasting a 33.3% pass@1 accuracy on LeetCode and 32.6% on TheoremQA. Its success is attributed to UltraInteract, a dataset aiding supervised fine-tuning and preference learning, unveiling insights for improved reasoning models.
Octopus v2 introduces an on-device language model tailored for AI agents, addressing the privacy and cost concerns associated with cloud-based models. With 2 billion parameters, it outperforms GPT-4 in accuracy and latency and cuts context length by 95%. Compared to Llama-7B, it improves latency 35-fold, making it deployable across edge devices for real-world applications.
LLaVA-Gemma explores compact language models for accelerating multimodal foundation models (MMFM) within the LLaVA framework. Testing various design features, including pretraining the connector and adjusting image and language backbone sizes, the Gemma-based models show moderate performance but fall short of outperforming current state-of-the-art models of comparable size.
Long-context LLMs face hurdles in comprehending extensive sequences, particularly in extreme-label classification tasks. A specialized benchmark, LIConBench, highlights these challenges by assessing 13 models on datasets with varying label ranges and input lengths. While models perform well under 20K tokens, performance declines sharply beyond that point, revealing limitations in processing and understanding lengthy, context-rich sequences.
Researchers analyze latent diffusion models (LDMs) to understand their scaling properties and sampling efficiency. Contrary to expectations, smaller models often outperform larger ones within a fixed inference budget across text-to-image tasks. This surprising trend suggests potential avenues for optimizing LDMs to improve generative capabilities under resource constraints.
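For readers unfamiliar with the pass@1 figure quoted in the Eurus item above, here is a minimal sketch of how pass@k is commonly computed for code benchmarks. The summary does not say how the Eurus authors measured it, so treat this as an illustration of the widely used unbiased estimator, not their evaluation code. The function names and the sample data are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: samples generated, c: samples that passed the tests, k: attempt budget.
    Returns the probability that at least one of k samples drawn without
    replacement from the n generated samples is correct.
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Fewer than k failing samples exist, so any draw of k includes a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical data: three problems, one sample each, only the first solved.
made_up_results = [(1, 1), (1, 0), (1, 0)]
score = benchmark_pass_at_k(made_up_results, k=1)
print(f"pass@1 = {score:.1%}")
```

With one sample per problem, pass@1 reduces to the fraction of problems solved on the first attempt, which is how a headline number like the one above is usually read.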
TOP TOOLS
Freepik - Create endless variations and styles from any image.
Vapi - Build, test and deploy voicebots in minutes rather than months.
Mentor - AI-powered goal management.
Aqua Voice - voice-driven document editor that transcribes your voice into written text accurately and efficiently.
Clay - Combine 50+ data sources, web scraping, and AI messaging to enrich your data and automate your outbound at scale.
SOCIAL MEDIA
You can now edit DALL·E images in ChatGPT across web, iOS, and Android.
- OpenAI (@OpenAI)
5:00 PM • Apr 3, 2024
MORE NEWS
Hackers force AI chatbots to break their own rules
Hackers exploit human tricks to make AI chatbots break rules, per DEF CON findings. About 15.5% of chats manipulated bots to spill sensitive data or evade safeguards, with 9.8% success via "You are a" prompts. Popular chatbots like ChatGPT are vulnerable; OpenAI's move to skip account creation adds more risk. AXIOS
Cloudflare makes it simple to deploy AI apps with Hugging Face, launches Workers AI to public
Cloudflare simplifies AI app deployment with Hugging Face integration, launching Workers AI globally. CEO Matthew Prince highlights AI's challenge in production, offers cost-effective solution. Developers select open-source models via Hugging Face, deploy instantly to Workers AI for global access. Improved AI supports fine-tuned model weights, catering to domain-specific needs. VENTUREBEAT
Former Snap AI chief launches Higgsfield to take on OpenAI's Sora video generator
Former Snap AI chief launches Higgsfield to rival OpenAI's Sora video generator. Higgsfield's Diffuse app creates personalized videos from text or selfies, targeting diverse creators. Its mobile-first approach aims at ease of use. Higgsfield, with lean operations and $8M funding, plans improved video editor and social media-focused models. TECHCRUNCH
Opera browser dev branch rolls out support for running LLMs locally
Opera integrates experimental support for running large language models (LLMs) locally in its developer version, enabling access to 150 LLM variants from 50 families like LLaMA and Gemma. Unlike internet-dependent alternatives, Opera's approach ensures data privacy, although storage requirements and speed limitations pose challenges. Opera aims to evolve this feature through its AI Feature Drop Program but hasn't specified a timeline for its mainstream release. THE REGISTER
Anthropic researchers detail how "many-shot jailbreaking" can manipulate AI responses
Anthropic researchers expose a method called "many-shot jailbreaking" that exploits AI's expanded context windows to manipulate responses, potentially causing harm. While larger context windows enhance AI's utility, they also make models vulnerable to manipulation. Researchers advocate for awareness and mitigation strategies while questioning the focus on censorship over addressing actual concerns. SILICON ANGLE
Former senior Intel exec raises $24 million in Seed funding for AI-powered video security platform
Lumana, a video security startup, secures $24 million in Seed funding. Its AI-driven platform analyzes real-time video to enhance security and operational efficiency. Founded by former Intel exec Sagi Ben Moshe, Lumana aims to tap into the booming video analytics market, projected to reach $38 billion by 2030. CTECH
An AI Stethoscope's New Algorithm To Predict Heart Failure Gets FDA Clearance
Eko Health's AI stethoscope, with FDA clearance for detecting low ejection fraction, aims to revolutionize heart disease diagnosis. The algorithm, co-developed with Mayo Clinic, enables early detection, potentially saving lives. Eko, backed by $125M in funding, plans to roll out the technology to primary care physicians, enhancing preventive care. FORBES
AI Finds Personality Shapes Genes
The study reveals that personality traits influence the expression of 4,000 genes, impacting health and well-being. A network of genes related to personality inheritance was identified, along with a control hub of six genes regulating emotional processing. Cultivating a self-transcendent outlook on life may improve health by regulating gene expression. NEUROSCIENCE NEWS
FDA Approves AI Tool That Can Detect Sepsis
The FDA approved an AI tool by Prenosis that diagnoses sepsis, a life-threatening infection response. Using 22 parameters, including biomarkers and vital signs, it predicts sepsis risk within 24 hours. With over 100,000 patient samples, it aims to reduce the 350,000 annual sepsis-related deaths or hospice cases in the US. FORBES