The address bar/ChatGPT input window in OpenAI's browser ChatGPT Atlas "could be targeted for prompt injection using malicious instructions disguised as links," reports SC World, citing a report from AI/agent security platform NeuralTrust:
NeuralTrust found that a malformed URL could be crafted to include a prompt that is treated as plain text by the browser, passing the prompt on to the LLM. A malformation, such as an extra space after the first slash following "https:" prevents the browser from recognizing the link as a website to visit. Rather than triggering a web search, as is common when plain text is submitted to a browser's address bar, ChatGPT Atlas treats plain text as ChatGPT prompts by default.
An unsuspecting user could potentially be tricked into copying and pasting a malformed link, believing they will be sent to a legitimate webpage. An attacker could plant the link behind a "copy link" button so that the user might not notice the suspicious text at the end of the link until after it is pasted and submitted. These prompt injections could potentially be used to instruct ChatGPT to open a new tab to a malicious website such as a phishing site, or to tell ChatGPT to take harmful actions in the user's integrated applications or logged-in sites like Google Drive, NeuralTrust said.
Last month browser security platform LayerX also described how malicious prompts could be hidden in URLs (as a parameter) for Perplexity's browser Comet. And last week SquareX Labs demonstrated that a malicious browser extension could spoof Comet's AI sidebar feature and have since replicated the proof-of-concept (PoC) attack on Atlas.
But another new vulnerability in ChatGPT Atlas "could allow malicious actors to inject nefarious instructions into the artificial intelligence (AI)-powered assistant's memory and run arbitrary code," reports The Hacker News, citing a report from browser security platform LayerX:
"This exploit can allow attackers to infect systems with malicious code, grant themselves access privileges, or deploy malware," LayerX Security Co-Founder and CEO, Or Eshed, said in a report shared with The Hacker News. The attack, at its core, leverages a cross-site request forgery (CSRF) flaw that could be exploited to inject malicious instructions into ChatGPT's persistent memory. The corrupted memory can then persist across devices and sessions, permitting an attacker to conduct various actions, including seizing control of a user's account, browser, or connected systems, when a logged-in user attempts to use ChatGPT for legitimate purposes....
"What makes this exploit uniquely dangerous is that it targets the AI's persistent memory, not just the browser session," Michelle Levy, head of security research at LayerX Security, said. "By chaining a standard CSRF to a memory write, an attacker can invisibly plant instructions that survive across devices, sessions, and even different browsers. In our tests, once ChatGPT's memory was tainted, subsequent 'normal' prompts could trigger code fetches, privilege escalations, or data exfiltration without tripping meaningful safeguards...."
LayerX said the problem is exacerbated by ChatGPT Atlas' lack of robust anti-phishing controls, the browser security company said, adding it leaves users up to 90% more exposed than traditional browsers like Google Chrome or Microsoft Edge. In tests against over 100 in-the-wild web vulnerabilities and phishing attacks, Edge managed to stop 53% of them, followed by Google Chrome at 47% and Dia at 46%. In contrast, Perplexity's Comet and ChatGPT Atlas stopped only 7% and 5.8% of malicious web pages.
From The Conversation:
Sandboxing is a security approach designed to keep websites isolated and prevent malicious code from accessing data from other tabs. The modern web depends on this separation. But in Atlas, the AI agent isn't malicious code — it's a trusted user with permission to see and act across all sites. This undermines the core principle of browser isolation.
Thanks to Slashdot reader spatwei for suggesting the topic.
Read more of this story at Slashdot.