NATURAL 20

AI’s Autonomous Capabilities Today

PLUS: GitHub Copilot for Project Starts, Activists Challenge OpenAI on GDPR Breach and more.

Wes Roth
April 30, 2024

Today:

AI’s Autonomous Capabilities Today
Financial Times Partners with OpenAI
DeepMind Unveils Gecko
GitHub Copilot for Project Starts
Activists Challenge OpenAI on GDPR Breach
Meta’s Ad Tool Causes Cash Bleed

STUNNING Step for Autonomous AI Agents PLUS OpenAI Defense Against JAILBROKEN Agents

The rapid advancement of AI technology, particularly in large language models and AI agents, promises significant changes in how we interact with computers. AI's ability to reason, perceive, and execute computer-based tasks has progressed markedly. For example, AI can now perform tasks typically done by humans, such as coding, data entry, and even creative work like writing, all by mimicking human interactions with computers.

New benchmarks like OSWorld are being developed to evaluate AI agents' effectiveness across operating systems such as Linux, Windows, and macOS. These agents are no longer just theoretical; they are becoming more capable of performing complex tasks autonomously, indicating a shift towards more integrated and potentially transformative uses of AI in everyday computing.

WATCH THE VIDEO ON YOUTUBE

Financial Times announces strategic partnership with OpenAI

The Financial Times has partnered with OpenAI, a leader in AI research, to enhance ChatGPT with FT content and develop new AI products. Users can expect attributed summaries and links to FT journalism in ChatGPT responses. FT also adopted ChatGPT Enterprise for its employees.

CEO John Ridding sees this as pivotal for understanding AI's impact on content access. OpenAI's COO, Brad Lightcap, highlights the goal of empowering news organizations and enriching ChatGPT with quality journalism. The FT aims to navigate AI advancements while safeguarding content integrity.

FINANCIAL TIMES

Google’s DeepMind creates ‘Gecko’, a rigorous new standard for testing AI image generators

Google DeepMind introduces "Gecko," a rigorous benchmark for evaluating text-to-image AI models. Current evaluation methods lack comprehensiveness and reliability, prompting the need for Gecko. It assesses models across 2,000 text prompts, categorizing skills and complexity levels. Gecko gathers extensive human ratings to discern model limitations, ambiguous prompts, and evaluation inconsistencies.

Additionally, it features an enhanced automatic evaluation metric aligned with human judgments. DeepMind's Muse model excels in Gecko's evaluation. The researchers stress the importance of diverse benchmarks for accurate understanding and deployment of text-to-image AI. Gecko's code and data will be freely available to facilitate further advancements.

VENTUREBEAT

GitHub Copilot can now help start a project with AI, not just complete it

GitHub introduces GitHub Copilot Workspace, extending its AI-powered code completion platform to assist developers in starting new coding projects. This service aims to reduce the time spent reading through code and figuring out project beginnings. Integrated into GitHub repositories, Copilot Workspace allows developers to describe project goals, receive suggestions on project initiation, edit suggestions, and run the code.

GitHub Next head Jonathan Carter highlights Copilot Workspace's ability to help developers review older code efficiently. The expansion of Copilot's capabilities underscores the role of coding as a benchmark skill for new AI models.

THE VERGE

Data Activists Target OpenAI In Challenge To ChatGPT’s ‘Hallucination’ Problem

Privacy activists from the Vienna-based nonprofit noyb have filed a complaint against OpenAI, alleging violations of Europe's General Data Protection Regulation (GDPR) due to the AI's tendency to produce incorrect information ("hallucinate"). They argue that OpenAI's inability to accurately filter false data from its outputs, such as incorrect birth dates for public figures, violates GDPR.

Noyb also criticized OpenAI for its lack of transparency regarding the data processed, its sources, and its dissemination. This legal action puts additional pressure on AI developers to comply with stringent EU privacy regulations, amidst ongoing concerns about AI-generated misinformation and its potential consequences.

FORBES

Meta’s ‘set it and forget it’ AI ad tools are misfiring and blowing through cash

Meta's automated ad platform, Advantage Plus, has been malfunctioning, causing advertisers to experience significantly higher costs and poor performance. Since the malfunction began on Valentine's Day, the system has been consuming daily budgets rapidly, with costs per thousand impressions (CPMs) increasing tenfold while yielding almost no revenue. The issue has led to substantial financial losses for small businesses using the service, and many are considering abandoning Meta's platforms due to the lack of transparency and accountability from the company.

Despite Meta's claims that Advantage Plus is a "set it and forget it" solution, the reality has been inconsistent and problematic, forcing some advertisers to revert to manual ad purchases. Meta acknowledged a platform bug on February 14 but has not provided detailed explanations or effective solutions, adding to its users' frustration.

THE VERGE

🧠RESEARCH

PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning

The new Pooling LLaVA model adapts existing image-language tech to videos, cutting down on the need for massive data and computing power. By adjusting how features change over time, it beats previous top scores in video understanding and answering, showing major improvements on several benchmarks.

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs

The AdvPrompter speeds up the creation of adversarial prompts for Large Language Models (LLMs), making it super quick to test LLM vulnerabilities. Unlike slow, traditional methods, it uses a cool new approach to generate prompts that trick LLMs without needing deep access to their internals. This tool not only shows top results on test datasets but also helps make LLMs tougher against attacks.

MaPa: Text-driven Photorealistic Material Painting for 3D Shapes

The MaPa project brings a fresh approach to creating photorealistic materials for 3D models using text descriptions. It ditches the need for massive matched datasets and instead uses a 2D diffusion model to make material graphs for each part of a model. These graphs are then tweaked to match the text, resulting in highly realistic and editable materials that outdo previous methods.

HaLo-NeRF: Learning Geometry-Guided Semantics for Exploring Unconstrained Photo Collections

HaLo-NeRF introduces a way to navigate and understand big tourist spots through photos, using a mix of language and visual cues. It bridges the gap between simple 3D mapping and deep semantic understanding by linking scene images to descriptive text. This method improves recognition of landmarks by aligning visual data with relevant internet text, enhancing both 3D segmentation and semantic accuracy over existing models.

NeRF-XL: Scaling NeRFs with Multiple GPUs

NeRF-XL revolutionizes the use of Neural Radiance Fields (NeRFs) by enabling their distribution across multiple GPUs, which allows for handling much larger models and scenes. This new method overcomes limitations of previous multi-GPU strategies that struggled with scene quality, offering a scalable approach that maintains the integrity of a single-GPU setup while reducing cross-GPU communication. It has proven to enhance both the speed and quality of 3D reconstructions on extensive datasets, showcasing its potential with impressive results on the massive MatrixCity dataset.

🛠️TOP TOOLS

Idefics2 - A Powerful 8B Vision-Language Model for the community

Collectif - Where teams discover product insights in Minutes, not Days

Faune - Brings the magic of AI models like Mistral and GPT-4 directly to you for free, prioritizing your privacy every step of the way.

Jack AI - AI-powered marketing tools that help you write and edit high-quality marketing content.

Browse AI - Simplifies data extraction from multiple sources by allowing businesses to train no-code scraping robots

📲SOCIAL MEDIA

Memory is now available to all ChatGPT Plus users. Using Memory is easy: just start a new chat and tell ChatGPT anything you’d like it to remember.
Memory can be turned on or off in settings and is not currently available in Europe or Korea. Team, Enterprise, and GPTs to come.
— OpenAI (@OpenAI)
5:07 PM • Apr 29, 2024

🗞️MORE NEWS

Copilot Workspace is GitHub’s take on AI-powered software engineering

GitHub's Copilot Workspace is leveling up AI-powered software development tools, enabling devs to craft code using natural language. This new tool aims to ease the starting process and assist throughout code building, particularly enhancing GitHub's current offerings. It's still in technical preview, exploring ways to blend AI with human creativity more efficiently. TECHCRUNCH

Google Releases Major New Feature Boost To Older Android Phones

Google's rolling out its AI app, Gemini, to older Android phones—Android 10 and 11—expanding accessibility to millions. This update, initially for newer models, marks a significant step in broadening AI tech adoption across various devices. As Google gears up for the I/O Conference, expect more insights into Gemini and future AI integrations. FORBES

These Apple apps could get a major AI overhaul in iOS 18

iOS 18 is shaping up to be a monumental update for Apple, potentially featuring major AI improvements across several key apps like Notes, Mail, Photos, and Fitness. The update could include a more intuitive Siri, advanced mapping features in Apple Maps, and a smarter Notes app capable of performing mathematical calculations. As anticipation builds, the full extent of these upgrades is expected to be unveiled at Apple's Worldwide Developers Conference on June 10. TECHRADAR

‘AI death calculator’ creators issue urgent warning about frighteningly accurate tool

The creators of Life2vec, a precise AI-based death prediction tool, have issued a warning about fraudulent apps exploiting their technology to steal personal information. Developed by US and Danish scientists, Life2vec uses factors like income, profession, and health history to predict mortality. However, numerous unauthorized apps, dubbed "de-terminators," have surfaced, mimicking Life2vec and potentially harvesting sensitive user data. The original creators emphasize that these counterfeit apps are not associated with their genuine, research-focused product. THE NEW YORK POST

Image-generating AI creates uncanny optical illusions

AI models can now create optical illusions from text descriptions, adjusting to produce images that change appearance based on viewing angle or motion. Researchers at the University of Michigan developed these illusions by tweaking existing AI technology, creating visuals that morph when seen from different perspectives. NEWSCIENTIST

AI-enabled electrocardiography alert intervention and all-cause mortality: a pragmatic randomized clinical trial

In a comprehensive study, an AI-enabled electrocardiogram (ECG) significantly reduced all-cause mortality within 90 days by identifying high-risk hospitalized patients. The multisite trial involved 39 physicians and 15,965 patients, showing that the AI-driven intervention could prompt timely and intensive care, especially for those identified with high-risk ECGs, leading to a marked decrease in cardiac deaths. NATURE

Researchers develop malicious AI ‘worm’ targeting generative AI systems

Researchers have developed a new malware named "Morris II," designed to exploit generative AI systems like OpenAI's ChatGPT and Google's Gemini. This malware, which resembles the infamous 1988 Morris worm, uses self-replicating prompts to spread, steal data, and generate spam through AI-powered platforms. Demonstrated in a controlled environment, it reveals significant vulnerabilities in AI systems, underscoring the urgent need for enhanced AI cybersecurity measures. SECURITY INTELLIGENCE

Meditron: An LLM suite especially suited for low-resource medical settings leveraging Meta Llama

Meditron, developed using Meta Llama 2, is a suite of large multimodal foundation models tailored for the medical field, enhancing clinical decision-making and diagnosis. Built collaboratively with Yale School of Medicine and EPFL, it has quickly adapted to incorporate Meta Llama 3, significantly improving its performance on medical benchmarks. Aimed at low-resource settings, Meditron provides open-source, evidence-based medical information, helping fill crucial gaps in global healthcare access. The suite includes models of different scales, maintaining high quality while optimizing for cost and technical feasibility, and has engaged a global community in its ongoing evaluation and refinement. META