Many researchers warn that generative artificial intelligence worms infiltrating the world's systems pose a major security risk (Al Jazeera)

In recent years, the field of artificial intelligence has witnessed significant developments, with the emergence of new applications and use cases, as generative artificial intelligence systems, such as “GPT Chat” from OpenAI, and “Gemini” from Google, have evolved into indispensable tools. about it in various sectors.

To keep pace with these developments, startups and technology companies are harnessing their capabilities to develop artificial intelligence agents and ecosystems capable of automating tasks, from scheduling appointments to purchasing products.

In parallel with these developments, these systems are gaining more independence and becoming vulnerable to various forms of risks and electronic attacks.

One such risk is the development of AI-powered malware, which can spread on its own, posing a major threat to cybersecurity.

Coming

In the midst of the fast-moving AI landscape, a team of researchers has designed what they claim are among the first “generative AI worms.” These worms have the ability to spread from one system to another, potentially stealing data or spreading malware in their wake.

“This basically means that you now have the ability to carry out a new type of cyber attack that has never been seen before,” explains Cornell Tech researcher Ben Nasi, one of the computer architects behind this research endeavour.

Nasi, along with fellow researchers Stav Cohen and Ron Peyton, named their creation Morris2, a nod to the infamous Morris computer worm that wreaked havoc online in 1988.

In a comprehensive paper and accompanying website shared exclusively with WIRED, researchers reveal how an AI worm could infiltrate an AI-powered email assistant to steal data from emails, spread spam, and bypass certain procedures. Security in GPT Chat and Gemini during this process.

This research, conducted within controlled testing environments and not within publicly available email assistant applications, coincides with the development of large linguistic models (LLMs) into multimedia entities, capable of generating images and videos in addition to text.

While cases of generative AI worms infiltrating real-world systems have yet to emerge, many researchers warn that they pose a significant security risk that requires the attention of startups, developers and technology companies, alike.

Typically, generative AI systems work by responding to prompts, which are text instructions that prompt the system to answer a query or generate content. However, these commands can be manipulated to subvert system operations.

More dangerously, jailbreaking operations can force the system to ignore security protocols, resulting in the production of malicious or hateful content, or the use of a cyber attack known as an injection attack, the danger of which is that it directs instructions to the chatbot. Invisibly.

To engineer the generative AI worm, the researchers used what they call an “hostile self-replication trigger.”

This prompts the generative AI model to generate another command in its response, essentially directing the AI ​​system to produce a series of subsequent instructions in its responses. The attack appears to be enveloped in another attack. This attack is similar to SQL injection and buffer overflow attacks.

As for clarifying the worm's function, the researchers created an email system capable of two-way communication using generative artificial intelligence, and integration with GPT Chat, Gemini, and the open source LLM LLaVA.

Well, imagine you have an email assistant, like a super smart assistant that can read your emails and respond to them for you. Now, researchers have found a way to trick this email assistant into carrying out a malicious attack.

They sent him a special type of message called a "hostile text command," which is like a secret code that messes with the assistant's mind.

This special message is designed to make the assistant act strange and do things he's not supposed to do.

To engineer the generative AI worm, the researchers used what they call an “hostile self-replication trigger” (Getty)

Built into its design, Assistant uses what's called an "augmented retrieval generator" to help it respond to emails. This means it looks at other information to come up with the best response, but when it gets infected, it starts providing private information like your credit card number from emails instead of helpful responses.

This is where it gets really tricky: when the infected assistant replies to an email, he or she is sending this private information to someone else. If that person's email assistant is also infected, the cycle continues.

It's like a chain reaction, spreading from one email assistant to another, stealing private data along the way. So, in simple terms, researchers have found a way to make an email assistant act like a spy, stealing secrets from emails and passing them on to other email assistants.

In the second method, the researchers confirm that the image embedded with a malicious command prompts the email assistant to forward the message to additional recipients, as Nasi explained that: “By encrypting the self-repeating commands in the image, any type of image that contains malicious messages can be forwarded.” Unsolicited material, abuse material, or even advertising to new customers after the initial email has been sent.”

Consequences of bad design

Although the research circumvents some of the security measures in GBT and Gemini, the researchers stress that their work serves as a cautionary tale regarding “bad architectural design” within the broader AI ecosystem.

However, they immediately shared their findings with Google and OpenAI.

An OpenAI spokesperson acknowledged: “It appears they have found a way to exploit injection-type vulnerabilities by relying on unvetted or filtered user input,” noting that the company is actively working to enhance the resiliency of its systems. He also urged developers to use methods to ensure that they do not deal with malicious input, according to the Wired report.

The researchers immediately shared their findings with Google and OpenAI (Shutterstock)

Google declined to comment on the research, although messages Nasi shared with Wired indicate the company's researchers are seeking a meeting to discuss the matter. While the worm was demonstrated in a carefully controlled environment, many security experts who examined the research stress the imminent threat posed by generative AI worms, which is a very serious problem.

In recent research, security experts from Singapore and China demonstrated that they were able to jailbreak 1 million clients of large language modeling applications in less than 5 minutes.

Sahar Abdelnaby, a researcher at CISPA's Helmholtz Center for Information Security in Germany, who contributed to some of the initial demonstrations of rapid injection against large language models in May 2023 and highlighted the feasibility of worms. When AI models ingest data from external sources or when AI agents act independently, the risk of worm spread becomes tangible.

For her part, Sahar Abdel Nabi confirms: “I think the idea of ​​spreading the injections is very reasonable.” It all depends on the type of applications in which these models are used. The researcher expects that although such attacks are currently being simulated, they may move from theory to practice in a short period.

In a paper detailing their findings, Nasi and his colleagues predict that generative AI worms will appear in the wild within the next two or three years. The paper posits that “environmental generative AI systems are undergoing massive development by many companies in the industry that are integrating their capabilities into their cars and phones.” smartphones and their operating systems.

To confront the looming threat, creators of generative AI systems must have ways to strengthen their defenses against mobile malware even though it is likely and uncertain, including implementing traditional security methodologies and devising new ones.

Source: websites