Prompt Injection Flaw

Cybersecurity researchers have disclosed a high-severity security flaw in the Vanna.AI library that could be exploited to achieve remote code execution vulnerability via prompt injection techniques.

The vulnerability, tracked as CVE-2024-5565 (CVSS score: 8.1), relates to a case of prompt injection in the “ask” function that could be exploited to trick the library into executing arbitrary commands, supply chain security firm JFrog said.

Vanna is a Python-based machine learning library that allows users to chat with their SQL database to glean insights by “just asking questions” (aka prompts) that are translated into an equivalent SQL query using a large language model (LLM).

The rapid rollout of generative artificial intelligence (AI) models in recent years has brought to the fore the risks of exploitation by malicious actors, who can weaponize the tools by providing adversarial inputs that bypass the safety mechanisms built into them.

One such prominent class of attacks is prompt injection, which refers to a type of AI jailbreak that can be used to disregard guardrails erected by LLM providers to prevent the production of offensive, harmful, or illegal content, or carry out instructions that violate the intended purpose of the application.

Cybersecurity

Such attacks can be indirect, wherein a system processes data controlled by a third party (e.g., incoming emails or editable documents) to launch a malicious payload that leads to an AI jailbreak.

They can also take the form of what’s called a many-shot jailbreak or multi-turn jailbreak (aka Crescendo) in which the operator “starts with harmless dialogue and progressively steers the conversation toward the intended, prohibited objective.”

This approach can be extended further to pull off another novel jailbreak attack known as Skeleton Key.

“This AI jailbreak technique works by using a multi-turn (or multiple step) strategy to cause a model to ignore its guardrails,” Mark Russinovich, chief technology officer of Microsoft Azure, said. “Once guardrails are ignored, a model will not be able to determine malicious or unsanctioned requests from any other.”

Skeleton Key is also different from Crescendo in that once the jailbreak is successful and the system rules are changed, the model can create responses to questions that would otherwise be forbidden regardless of the ethical and safety risks involved.

“When the Skeleton Key jailbreak is successful, a model acknowledges that it has updated its guidelines and will subsequently comply with instructions to produce any content, no matter how much it violates its original responsible AI guidelines,” Russinovich said.

Prompt Injection Flaw

“Unlike other jailbreaks like Crescendo, where models must be asked about tasks indirectly or with encodings, Skeleton Key puts the models in a mode where a user can directly request tasks. Further, the model’s output appears to be completely unfiltered and reveals the extent of a model’s knowledge or ability to produce the requested content.”

The latest findings from JFrog – also independently disclosed by Tong Liu – show how prompt injections could have severe impacts, particularly when they are tied to command execution.

CVE-2024-5565 takes advantage of the fact that Vanna facilitates text-to-SQL Generation to create SQL queries, which are then executed and graphically presented to the users using the Plotly graphing library.

This is accomplished by means of an “ask” function – e.g., vn.ask(“What are the top 10 customers by sales?”) – which is one of the main API endpoints that enables the generation of SQL queries to be run on the database.

Cybersecurity

The aforementioned behavior, coupled with the dynamic generation of the Plotly code, creates a security hole that allows a threat actor to submit a specially crafted prompt embedding a command to be executed on the underlying system.

“The Vanna library uses a prompt function to present the user with visualized results, it is possible to alter the prompt using prompt injection and run arbitrary Python code instead of the intended visualization code,” JFrog said.

“Specifically, allowing external input to the library’s ‘ask’ method with ‘visualize’ set to True (default behavior) leads to remote code execution.”

Following responsible disclosure, Vanna has issued a hardening guide that warns users that the Plotly integration could be used to generate arbitrary Python code and that users exposing this function should do so in a sandboxed environment.

“This discovery demonstrates that the risks of widespread use of GenAI/LLMs without proper governance and security can have drastic implications for organizations,” Shachar Menashe, senior director of security research at JFrog, said in a statement.

“The dangers of prompt injection are still not widely well known, but they are easy to execute. Companies should not rely on pre-prompting as an infallible defense mechanism and should employ more robust mechanisms when interfacing LLMs with critical resources such as databases or dynamic code generation.”

Found this article interesting? Follow us on Twitter and LinkedIn to read more exclusive content we post.

Source: thehackernews.com/