
LLMjacking — the term that describes scenarios in which attackers use exposed API keys or tokens to hijack Large Language Model (LLM) resources — is a brand-new issue. But although it was first identified in May 2024, it’s already a core challenge for enterprises.
With 58% of companies running LLM-based apps, LLMjacking threatens unexpected costs to the tune of $46K per day, degraded performance, and potential data leaks.
Suddenly, security engineers, DevSecOps teams, and cloud architects must get up to speed on its unique risks and implement safeguards like key management, API monitoring, and anomaly detection. We’ve already talked about related concepts like AI data security and adaptive approaches to Generative AI (Gen AI) threats. Today, let’s make an actionable plan to manage LLMjacking itself.
Understanding LLMjacking: Definition and Attack Fundamentals
Although LLMjacking shares similarities with other cloud-based threats that exploit unpatched vulnerabilities, it has distinct characteristics that security teams must understand to effectively mitigate the risk. Attackers may rely on familiar tactics to identify and exfiltrate cloud credentials and locate accessible LLMs, but they often combine these methods in novel and unexpected ways.
What is LLMjacking? A Clear Definition
LLMjacking refers to the hijacking of an LLM by an attacker using stolen credentials. In essence, it describes unauthorized access to and misuse of an LLM resource. It was first discovered when researchers found attackers using stolen cloud credentials, obtained via a Laravel vulnerability, to access cloud-hosted LLM services with the goal of selling unauthorized access to jailbroken models.
LLMjacking attack vectors used by malicious actors to compromise public-facing LLMs include:
- Credential Theft and Token Exploitation: The unauthorized use of stolen API keys and cloud credentials to run AI models.
- API Abuse and Reverse Proxy Techniques: Reverse proxy techniques (such as OAI reverse proxy) to conceal unauthorized API access behind masked IP addresses and rotating domains.
- Cloud Configuration Vulnerabilities: Flaws and security gaps in cloud services, for example, the Laravel vulnerability (CVE-2021-3129) that resulted in the LLMjacking of a public-facing instance of an Anthropic Claude model.
The reselling component of LLMjacking is a key distinction from earlier AI attacks that focused on prompt injection or data manipulation. LLMjacking incidents leave victims not only unaware of the intrusion, but also footing the bill.
LLM exploitation has become a growing threat. In July 2024 alone, 10X the number of LLM requests to the runtime service, Amazon Bedrock. The increase was corroborated by double the number of unique IPs requesting access over the first half of the year.
With more publicly accessible chatbots and cloud-based AI services than ever, the trend is likely to continue.
LLMjacking vs. Cryptojacking
LLMjacking is distinct from cryptojacking in that it targets LLMs specifically, rather than general compute resources. While both involve the unauthorized use of computing power, cryptojacking installs mining software that strains system resources and is easier to detect.
LLMjacking, on the other hand, leverages stolen API keys or compromised environments to covertly use LLMs for tasks like spam generation or bypassing usage limits. It’s harder to spot, often blending in with normal API traffic — especially in large organizations — making fine-grained monitoring and behavior-based detection more critical.
Runtime Defense Against LLMjacking with Upwind
Upwind monitors workloads at runtime to catch the earliest signs of LLMjacking. By combining real-time detection with deep context on container behavior, Upwind helps security teams isolate abuse, trace the source, and remediate faster than traditional scanning tools.
Spotting LLMjacking: Three Key Signals
While LLMjacking is harder to detect than cryptojacking, since there’s no obvious CPU spike, there are still reliable ways to catch it. Runtime security is key to all of them.
Unusual Outbound API Calls to LLM Providers
Workloads that don’t typically make external requests to AI services may suddenly begin calling domains like api.openai.com or api.anthropic.com.

Credential Misuse Across Environments
A token issued for a dev environment may appear in production or be invoked from a suspicious location, suggesting the credentials were stolen and reused for LLM access.
Behavior that Doesn’t Match the Container’s Role
Containers can begin downloading AI models, issuing prompts, or running code unrelated to their declared function, like inference or scraping content.
Why These 3 Signals Matter
LLMjacking doesn’t leave traditional signs of compromise. There’s no ransomware announcement, no massive CPU load, and no outbound data exfiltration visible in a Security Information and Event Management (SIEM) tool, at least not right away. These 3 signals are key because they’re:
- Observable at runtime, emerging only when workloads or identities are in use. They won’t get caught by static scans or Infrastructure as Code (IaC) reviews.
- Tied to behavioral change, representing a clear deviation from typical workload behavior. That’s a key sign of compromise when the goal is API misuse, not resource theft or data breach.
- Difficult to fake or disguise, with attackers unable to fully obfuscate where they connect, which credentials they use, and the fact that they’ve changed a container’s behavior.
Because of the nature of LLMjacking, a runtime CNAPP with behavioral analytics and network flow visibility is arguably the most capable single solution for detecting LLMjacking today. But it won’t prevent all cases. Here’s what else matters:
Layer | Why it Matters |
Credential Hygiene | Multi-factor authentication (MFA), secret rotation, and limiting the scope of API tokens. |
Billing Anomaly Alerts | Especially in Azure or GCP, which charge per-use for AI services, sudden spikes in AI resource usage can trigger alerts. |
Outbound Traffic Controls | Egress restrictions, from firewall rules to a service mesh or VPC controls, can stop workloads from reaching LLM APIs that they shouldn’t. |
Jailbreak Prompt Filtering | Detecting prompt/jailbreak attempts (via model or proxy layer) adds LLM-specific visibility. |
Detection Strategies for Enterprise Security Teams
Once runtime signals point to possible LLM abuse, like abnormal outbound API calls, credential misuse, and behavioral drift, the next step is to scale detection across environments in order to block persistent abuse and account for the fragmented modern environment. Security teams should combine runtime visibility with traditional detection pipelines to build a layered defense against LLMjacking and AI threats.
Monitoring LLM Usage Patterns and Anomalies
LLMjacking thrives on credential misuse and API abuse that can blend in with legitimate traffic. To detect it, security teams should baseline normal LLM usage, including request volume, model access, and account activity, to flag anomalies.
Key indicators of compromise (IoCs) include sudden API spikes, irregular request patterns, or access from unknown IPs or dormant accounts. Integrating LLM logs with SIEM and cloud monitoring tools improves detection by correlating this behavior across the environment.
Advanced Threat Detection Techniques
Runtime tools provide a strong foundation, but detecting more persistent abuse calls for behavioral and API-level analysis. Think about adding:
- Behavioral analysis to spot anomalies like repeated, slightly varied queries common in abuse or spam
- Script-based anomaly detection, like unusual request headers, timing irregularities, and malformed payloads.
- Cost-based alerting, especially in platforms like Azure or GCP, where unauthorized usage may first appear as a billing anomaly.
Response Playbooks for Security Operations
Detection is only the beginning. Effective incident response should be specifically designed for credential abuse, where cost, compliance, and abuse risks all converge.
The following elements are key to building a thorough and actionable response framework.
Activity | Actions |
Immediate Containment | Rotate exposed API keys or credentials. Throttle or block suspicious traffic. Isolate affected workloads or environments |
Root Cause Analysis | Correlate runtime and SIEM logs. Investigate how access was obtained. Identify shadow IT deployments or misconfigured access policies |
Forensics and Evidence Collection | Secure logs. Preserve session data and access records. dentify other forensics and audit trails for supporting future investigations or legal action. |
Remediation | Patch vulnerabilities. Tighten IAM and outbound egress controls. Update detection rules and runtime thresholds |
Together, these detection and response strategies build on runtime signals that can point to first indicators of abuse, and make sure teams can scale detection, trace activity across environments, and act quickly. It all helps teams move beyond isolated alerts to a cohesive strategy for managing LLMjacking risk at enterprise scale.
The Final Step: Minimizing the Attack Surface
With detection and response strategies set, the final step is to shrink the space where LLMjacking can happen in the first place. That means minimizing the attack surface, for fewer ways that attackers can access, abuse, and escalate the use of Gen AI in the environment. Prevention at this stage isn’t about a single control. Instead, look to create a layered approach rooted in governance, access control, and visibility.
Enforce Credential Hygiene and Scope Limitation
LLMjacking succeeds because credentials work. The fewer valid keys attackers can use, and the more narrowly scoped they are, the harder abuse becomes.
- Rotate API keys regularly and expire old or unused tokens
- Use short-lived, workload-bound tokens via cloud-native IAM
- Apply least-privilege principles, restricting model access to only approved users and workloads
- Monitor where and how credentials are used to catch drift and lateral reuse
Govern LLM Usage Across Environments
Establish internal policies that govern who can use LLMs, which models get approved, and how those interactions are tracked.
- Require tagging or registration of any workload using LLM APIs
- Distinguish between development, production, and testing environments in terms of allowable LLM usage
- Make sure LLM access policies align with your broader AI governance framework or risk management policies
Enforce Policy Through Infrastructure
Use preventative tooling to enforce security posture by design.
- Apply IaC policies to block hardcoded API keys or unapproved egress
- Leverage CNAPP or CSPM to detect misconfigurations that expose credentials or LLM access
- Create outbound network policies that restrict connections to only explicitly approved AI service endpoints
Maintain Centralized Visibility
Security and compliance can’t secure what they can’t see, so unify dashboards and alerts around LLM usage.
- Monitor which identities, workloads, and regions are making LLM calls
- Track changes in volume and cost across clouds
- Score risk based on behavioral analysis, accessibility, and data sensitivity
Minimizing attack surface reinforces detection, making LLMjacking more difficult to pull off. In combination with runtime detection and smart incident response, these controls help shift AI security from reactive containment to proactive management.
Real World Scenarios and Future-Proofing Against LLMjacking Threats
Recently, Chinese AI startup DeepSeek made headlines for its advanced Gen AI models (such as its DeepSeek-R1 reasoning model) that compete with top-tier systems like OpenAI’s GPT-4o in both performance and efficiency.
However, just as the DeepSeek AI assistant app was surging in popularity, the company was forced to temporarily halt new user registrations after a large-scale cyberattack targeted its systems. Cybercriminals acquired DeepSeek API keys, then used them to build unauthorized reverse proxy services. These proxies allowed other people to access DeekSeek’s LLM without going through the company’s own channels.
With the increasing use of sophisticated threats like reverse proxies, how can teams safeguard themselves and their defense systems from LLMjacking?
- Prevent API Key Leakage at Scale
Enforce key-scoping and IP restrictions to bind API keys to specific services and environments. Rotate keys frequently and use short-lived tokens or workload-bound identities, as via AWS STS or Azure Managed Identity.
- Jailbreak Automation and Model Abuse
Once behind a reverse proxy, users were free to prompt Deek Seek’s models without rate limits or content safeguards. Attackers tested and refined jailbreak prompts to produce phishing templates, malware code, and any other harmful outputs they desired. At scale. Deploy LLM-aware security tools to see and filter prompt and response behavior. Rate limit sensitive prompt categories, like code generation. Integrate output moderation and prompt fingerprinting into the API layer, even for internal use.
- Protect Against Silent Abuse
Stolen access generates real charges, so set cost ceilings and billing alerts for key cloud platforms. Enable anomaly detection via CNAPP to identify unexpected LLM use spikes. Isolate LLM workloads in separate billing projects for visibility and containment.
As Gen AI grows, so does its appeal to attackers. Runtime visibility, along with credential security and policy enforcement, is also evolving to include LLMjacking-specific risks. By treating models like secure assets, companies can future-proof themselves against the next wave of prompt abuse and unauthorized access.
Upwind Protects Runtimes Against LLMjacking
Upwind helps security teams detect and stop LLMjacking in real time by monitoring runtime behavior, outbound traffic, and credential use in production. Whether it’s a container suddenly calling a Gen AI model or a stolen token used across environments, Upwind surfaces behavioral and identity anomalies that signal LLM abuse before it turns into expensive unauthorized use.
With deep identity context and automated response actions, Upwind doesn’t just alert teams to potential abuse. It enables teams to isolate abuse, trace root cause, and minimize AI-related risk across the cloud environment
To see how runtime-powered security can close the gaps that traditional tools miss, schedule a demo.
Frequently Asked Questions
What distinguishes LLMjacking from other cloud security threats?
LLMjacking stands apart from typical cloud threats by targeting the compute and access patterns of LLMs. Unlike cryptojacking, which hijacks hardware for mining, LLMjacking abuses exposed API keys or misconfigurations to covertly exploit AI services. The dynamic and distributed nature of LLMjacking makes detection harder, requiring behavior-based monitoring instead of traditional security signatures.
Which industries and AI use cases are most targeted by LLMjacking attacks?
Industries heavily invested in AI, like finance, healthcare, tech, and e-commerce, are top targets for LLMjacking. Attackers focus on environments where LLMs power customer-facing apps or automation tools, looking for chatbots, content generation, and code assistance to help conceal usage and maximize their chances that credential misuse goes unnoticed.
Organizations managing sensitive data through LLMs face even higher stakes, as their models are attractive to both opportunistic hackers and Advanced Persistent Threats (APTs).
How quickly can organizations typically detect unauthorized LLM usage?
LLMjacking detection often takes days or weeks, especially when attackers mimic normal usage. Without detailed logging, API monitoring, or cost alerts, many teams only catch it after billing spikes or after wrestling with performance impacts. Organizations with behavioral analytics and LLM-aware SIEM tools can shorten detection to hours, reducing impact and dwell time. Some key game-changers include:
- Runtime CNAPP deployment
- SIEM or billing alerts
- Credential hygiene and scoping
- Monitoring tools with LLM-specific observability or API-level tracking
What immediate steps should security teams take after detecting a potential LLMjacking incident?
When LLMjacking is detected, revoke exposed credentials immediately and isolate affected systems. Collect forensic data like API logs, access patterns, and model history to trace the attack vector and assess impact. Remediate vulnerabilities, review resource usage and data exposure, then update monitoring and key management. Finish with a post-incident review to improve defenses.