
Part 1: The Many Ways LLMs Leak Data—and How to Solve It
By Vikram Venkat, Principal at Cota Capital
It’s no secret that AI adoption by enterprises has grown rapidly over the last couple of years. A recent McKinsey survey found that 71% of enterprises have started using generative AI in at least one business function, and the number is likely to continue growing.
This rapid adoption of AI in the workplace comes with clear productivity benefits. But it also comes with massive cybersecurity risks around data leakage. Enterprise AI tools often have access to confidential company and customer data, which, if breached, can lead to significant monetary and reputational losses for those impacted. Additionally, AI security is often a shared responsibility between the enterprise user and the model provider, which adds an additional layer of vulnerability, as neither party has complete visibility of the security perimeter.
What’s more, with the advent of agentic AI, many AI platforms now integrate into various other enterprise software systems – implying that a breach could now lead to data across multiple elements of an enterprise’s stack being compromised. Finally, shadow IT (when enterprise users leverage software/tech platforms for business purposes without the official approval of the enterprise IT team) adds another layer of risk, and is especially prominent in the case of AI agents and assistants.
In the first installment of this two-part series, we examine why AI-powered data leaks have become the top concern for enterprise security teams, surpassing even hallucinations and ethical risks. We’ll analyze the structural vulnerabilities of LLMs in enterprise environments and introduce the three emerging attack vectors that exploit them: prompt injection, jailbreaking, and flowbreaking. Part 2 will provide an in-depth breakdown of these attack methods and their countermeasures.
A new world of risks
Enterprise cybersecurity teams are already stretched thin. In fact, a recent BCG report found that only 72% of cybersecurity roles are filled, with especially large shortfalls in critical industries such as financial services and healthcare. The shortage of skilled workers is especially pronounced when it comes to AI talent, with security teams needing to constantly adapt to new tools, protocols, use cases, and architectures.
Making matters worse, traditional cybersecurity platforms are ill-equipped to handle the more conversational inputs and longer context windows that are typical of AI platforms. And while they may be able to secure specific applications, they are often unable to secure external models that are embedded or called from within these applications.
CISOs are clearly worried about the risks associated with this new paradigm. McKinsey’s Cyber Market Survey found that CISOs are eager to adopt capabilities around PII/sensitive data scanning and leak protection as one of their top 3 priorities. Additionally, the Open Worldwide Application Security Project (OWASP) rated prompt injection as their top risk in 2025, and included four different data leakage risks within their top 10 LLM and generative AI risks for 2025.
Data leaks emerge as AI’s greatest threat
All of this has led to a surprising finding – the biggest risk enterprises are worrying about during AI implementations is data security, and not hallucinations, ethical considerations around data (including copyright infringement, biases, etc.), or any other such consideration. Further, 45% of enterprises surveyed in 2024 suffered from data leakage.
The highest-profile incident that has been attributed to improper LLM usage is the leakage of confidential data at Samsung. Employees input sensitive source code and confidential meeting notes into ChatGPT, which then becomes accessible to OpenAI and potentially other users.
The incident was discovered and reported by The Economist, and led to Samsung eventually banning ChatGPT use internally. While it was caused by human error, it led to massive reputational loss, as well as potential financial losses if that data could be accessed externally.
There have also been multiple recent incidents where confidential user data (email addresses, passwords, contact numbers) was allegedly stolen from DeepSeek (where over a million customer records were compromised) and OmniGPT (where over 34 million messages, including passwords, API keys, and uploaded files were leaked). Researchers and ethical hackers have also demonstrated multiple vulnerabilities across the leading AI model providers, including OpenAI, Anthropic, Meta, Microsoft, and more.
How to misguide an AI
But what exactly are these new forms of attack, and how are they being executed? There are three main types of attacks that have been identified by researchers and cybersecurity practitioners. All three types of attack aim to deceive models such that they conduct actions that would circumvent their inbuilt guardrails.
- Prompt injection: Attacks that mix malicious and non-malicious inputs with the aim of executing unsafe actions through the applications built on top of LLMs
- Jailbreaking: Attacks that attempt to bypass LLMs’ safety mechanisms with the aim of generating inappropriate or restricted content as an output
- Flowbreaking: Attacks that attempt to prevent LLMs from retracting inappropriate or restricted content by interrupting their process
The first two attacks were named by programmer Simon Willison, drawing on similarities to SQL injection and jailbreaking of devices; the last was named by researchers at startup Knostic. In Part 2 of this series, we’ll dissect exactly how prompt injection, jailbreaking, and flowbreaking attacks work—including real-world examples—and explore the next-generation solutions being developed to stop them.