New Study Uncovers Text-to-SQL Model Vulnerabilities Allowing Data Theft and DoS Attacks

Jan 09, 2023 | Ravie Lakshmanan | Database Security / PLM Framework

A group of academics has demonstrated novel attacks that leverage Text-to-SQL models to produce malicious code that could enable adversaries to glean sensitive information and stage denial-of-service (DoS) attacks.

“To better interact with users, a wide range of database applications employ AI techniques that can translate human questions into SQL queries (namely Text-to-SQL),” Xutan Peng, a researcher at the University of Sheffield, told The Hacker News.

“We found that by asking some specially designed questions, crackers can fool Text-to-SQL models to produce malicious code. As such code is automatically executed on the database, the consequence can be pretty severe (e.g., data breaches and DoS attacks).”

The findings, which were validated against two commercial solutions, BAIDU-UNIT and AI2sql, mark the first empirical instance where natural language processing (NLP) models have been exploited as an attack vector in the wild.

The black box attacks are analogous to SQL injection flaws: a rogue payload embedded in the input question gets copied into the constructed SQL query, leading to unexpected results.
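To make the analogy concrete, here is a minimal sketch of how a payload in a question can end up inside the generated query. It is not the researchers' attack or any vendor's model: `toy_text_to_sql` is a hypothetical stand-in that copies part of the question verbatim into a query string, and the `users` table and its contents are invented for illustration.

```python
import sqlite3


def toy_text_to_sql(question: str) -> str:
    """Hypothetical stand-in for a Text-to-SQL model.

    It copies whatever follows 'named ' verbatim into the query,
    which is the behavior the injection-style attack relies on.
    """
    marker = "named "
    value = question.split(marker, 1)[1] if marker in question else ""
    return f"SELECT name FROM users WHERE name = '{value}'"


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with an invented users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


conn = build_demo_db()

# A benign question produces a query that matches one row.
benign_rows = conn.execute(
    toy_text_to_sql("Show the user named alice")
).fetchall()

# A crafted question closes the string literal and appends a condition
# that is always true, so the query matches every row.
payload = "x' OR '1'='1"
crafted_rows = conn.execute(
    toy_text_to_sql(f"Show the user named {payload}")
).fetchall()

# Binding the value as a parameter keeps it as data, not SQL.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (payload,)
).fetchall()
```

In this toy setup the crafted question returns every row, while the parameterized query treats the same payload as an ordinary (non-matching) name. A real Text-to-SQL model generates the whole query, so parameter binding alone may not be available as a defense.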

The study found that specially crafted payloads could be weaponized to run malicious SQL queries, allowing an attacker to modify backend databases and carry out DoS attacks against the server.

Furthermore, a second category of attacks explored the possibility of corrupting various pre-trained language models (PLMs), which are trained on a large dataset while remaining agnostic to the use cases they are applied to, so that specific inputs trigger the generation of malicious commands.

“There are many ways of planting backdoors in PLM-based frameworks by poisoning the training samples, such as making word substitutions, designing special prompts, and altering sentence styles,” the researchers explained.

The backdoor attacks on four different open source models (BART-BASE, BART-LARGE, T5-BASE, and T5-3B) using a corpus poisoned with malicious samples achieved a 100% success rate with little discernible impact on performance, making such issues difficult to detect in the real world.

As mitigations, the researchers suggest incorporating classifiers to check for suspicious strings in inputs, assessing off-the-shelf models to prevent supply chain threats, and adhering to good software engineering practices.
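As a rough illustration of the first mitigation, the sketch below screens a question for suspicious strings before it reaches the model. The researchers suggest classifiers; this rule-based filter is a simplified stand-in, and the pattern list and the `looks_suspicious` name are assumptions made for this example, not taken from the study.

```python
import re

# Illustrative patterns only (an assumption, not an exhaustive or vetted list).
SUSPICIOUS_PATTERNS = [
    r";",                     # statement terminator
    r"--|/\*",                # SQL comment openers
    r"'\s*(or|and)\b",        # quote followed by a boolean operator
    r"\b(drop\s+table|union\s+select|sleep\s*\(|benchmark\s*\()",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in SUSPICIOUS_PATTERNS]


def looks_suspicious(question: str) -> bool:
    """Return True if the question matches any of the illustrative patterns."""
    return any(p.search(question) for p in _COMPILED)


flag_crafted = looks_suspicious("Show the user named x' OR '1'='1")
flag_stacked = looks_suspicious("list users; DROP TABLE users")
flag_benign = looks_suspicious("How many users signed up in 2022?")
```

A filter like this is crude: it can be bypassed by rephrasing and can flag legitimate questions, so it would complement, not replace, measures such as running generated queries under a least-privilege database account.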


Source: thehackernews.com/
