Upwind raises $250M Series B to secure the cloud for the world →
Get a Demo

LLMjacking — the term that describes scenarios in which attackers use exposed API keys or tokens to hijack Large Language Model (LLM) resources — is a brand-new issue. But although it was first identified in May 2024, it’s already a core challenge for enterprises.

 With 58% of companies running LLM-based apps, LLMjacking threatens unexpected costs to the tune of $46K per day, degraded performance, and potential data leaks.

Suddenly, security engineers, DevSecOps teams, and cloud architects must get up to speed on its unique risks and implement safeguards like key management, API monitoring, and anomaly detection. We’ve already talked about related concepts like AI data security and adaptive approaches to Generative AI (Gen AI) threats. Today, let’s make an actionable plan to manage LLMjacking itself.

Understanding LLMjacking: Definition and Attack Fundamentals

Although LLMjacking shares similarities with other cloud-based threats that exploit unpatched vulnerabilities, it has distinct characteristics that security teams must understand to effectively mitigate the risk. Attackers may rely on familiar tactics to identify and exfiltrate cloud credentials and locate accessible LLMs, but they often combine these methods in novel and unexpected ways. 

What is LLMjacking? A Clear Definition

LLMjacking refers to the hijacking of an LLM by an attacker using stolen credentials. In essence, it describes unauthorized access to and misuse of an LLM resource. It was first discovered when researchers found attackers using stolen cloud credentials, obtained via a Laravel vulnerability, to access cloud-hosted LLM services with the goal of selling unauthorized access to jailbroken models. 

LLMjacking attack vectors used by malicious actors to compromise public-facing LLMs include:

The reselling component of LLMjacking is a key distinction from earlier AI attacks that focused on prompt injection or data manipulation. LLMjacking incidents leave victims not only unaware of the intrusion, but also footing the bill.

LLM exploitation has become a growing threat. In July 2024 alone, 10X the number of LLM requests to the runtime service, Amazon Bedrock. The increase was corroborated by double the number of unique IPs requesting access over the first half of the year.

With more publicly accessible chatbots and cloud-based AI services than ever, the trend is likely to continue.

LLMjacking vs. Cryptojacking 

LLMjacking is distinct from cryptojacking in that it targets LLMs specifically, rather than general compute resources. While both involve the unauthorized use of computing power, cryptojacking installs mining software that strains system resources and is easier to detect.

LLMjacking, on the other hand, leverages stolen API keys or compromised environments to covertly use LLMs for tasks like spam generation or bypassing usage limits. It’s harder to spot, often blending in with normal API traffic — especially in large organizations — making fine-grained monitoring and behavior-based detection more critical.

Runtime Defense Against LLMjacking with Upwind

Upwind monitors workloads at runtime to catch the earliest signs of LLMjacking. By combining real-time detection with deep context on container behavior, Upwind helps security teams isolate abuse, trace the source, and remediate faster than traditional scanning tools.

Get a Demo

Spotting LLMjacking: Three Key Signals

While LLMjacking is harder to detect than cryptojacking, since there’s no obvious CPU spike, there are still reliable ways to catch it. Runtime security is key to all of them.

Unusual Outbound API Calls to LLM Providers

Workloads that don’t typically make external requests to AI services may suddenly begin calling domains like api.openai.com or api.anthropic.com.

A runtime CNAPP showing a previously quiet container initiating outbound calls to APIs should be flagged as behavioral drift.
A runtime CNAPP showing a previously quiet container initiating outbound calls to APIs should be flagged as behavioral drift.

Credential Misuse Across Environments

A token issued for a dev environment may appear in production or be invoked from a suspicious location, suggesting the credentials were stolen and reused for LLM access.

Behavior that Doesn’t Match the Container’s Role

Containers can begin downloading AI models, issuing prompts, or running code unrelated to their declared function, like inference or scraping content.

Why These 3 Signals Matter

LLMjacking doesn’t leave traditional signs of compromise. There’s no ransomware announcement, no massive CPU load, and no outbound data exfiltration visible in a Security Information and Event Management (SIEM) tool, at least not right away. These 3 signals are key because they’re:

Because of the nature of LLMjacking, a runtime CNAPP with behavioral analytics and network flow visibility is arguably the most capable single solution for detecting LLMjacking today. But it won’t prevent all cases. Here’s what else matters:

LayerWhy it Matters
Credential HygieneMulti-factor authentication (MFA), secret rotation, and limiting the scope of API tokens.
Billing Anomaly AlertsEspecially in Azure or GCP, which charge per-use for AI services, sudden spikes in AI resource usage can trigger alerts.
Outbound Traffic ControlsEgress restrictions, from firewall rules to a service mesh or VPC controls, can stop workloads from reaching LLM APIs that they shouldn’t.
Jailbreak Prompt FilteringDetecting prompt/jailbreak attempts (via model or proxy layer) adds LLM-specific visibility.

Detection Strategies for Enterprise Security Teams

Once runtime signals point to possible LLM abuse, like abnormal outbound API calls, credential misuse, and behavioral drift, the next step is to scale detection across environments in order to block persistent abuse and account for the fragmented modern environment. Security teams should combine runtime visibility with traditional detection pipelines to build a layered defense against LLMjacking and AI threats.

Monitoring LLM Usage Patterns and Anomalies

LLMjacking thrives on credential misuse and API abuse that can blend in with legitimate traffic. To detect it, security teams should baseline normal LLM usage, including request volume, model access, and account activity, to flag anomalies. 

Key indicators of compromise (IoCs) include sudden API spikes, irregular request patterns, or access from unknown IPs or dormant accounts. Integrating LLM logs with SIEM and cloud monitoring tools improves detection by correlating this behavior across the environment.

Advanced Threat Detection Techniques

Runtime tools provide a strong foundation, but detecting more persistent abuse calls for behavioral and API-level analysis. Think about adding:

Response Playbooks for Security Operations

Detection is only the beginning. Effective incident response should be specifically designed for credential abuse, where cost, compliance, and abuse risks all converge.

The following elements are key to building a thorough and actionable response framework.

Activity Actions
Immediate ContainmentRotate exposed API keys or credentials. Throttle or block suspicious traffic. Isolate affected workloads or environments
Root Cause AnalysisCorrelate runtime and SIEM logs. Investigate how access was obtained. Identify shadow IT deployments or misconfigured access policies
Forensics and Evidence CollectionSecure logs. Preserve session data and access records. dentify other forensics and audit trails for supporting future investigations or legal action.
RemediationPatch vulnerabilities. Tighten IAM and outbound egress controls. Update detection rules and runtime thresholds

Together, these detection and response strategies build on runtime signals that can point to first indicators of abuse, and make sure teams can scale detection, trace activity across environments, and act quickly. It all helps teams move beyond isolated alerts to a cohesive strategy for managing LLMjacking risk at enterprise scale.

The Final Step: Minimizing the Attack Surface

With detection and response strategies set, the final step is to shrink the space where LLMjacking can happen in the first place. That means minimizing the attack surface, for fewer ways that attackers can access, abuse, and escalate the use of Gen AI in the environment. Prevention at this stage isn’t about a single control. Instead, look to create a layered approach rooted in governance, access control, and visibility.

Enforce Credential Hygiene and Scope Limitation

LLMjacking succeeds because credentials work. The fewer valid keys attackers can use, and the more narrowly scoped they are, the harder abuse becomes. 

Govern LLM Usage Across Environments

Establish internal policies that govern who can use LLMs, which models get approved, and how those interactions are tracked.

Enforce Policy Through Infrastructure

Use preventative tooling to enforce security posture by design.

Maintain Centralized Visibility

Security and compliance can’t secure what they can’t see, so unify dashboards and alerts around LLM usage.

Minimizing attack surface reinforces detection, making LLMjacking more difficult to pull off. In combination with runtime detection and smart incident response, these controls help shift AI security from reactive containment to proactive management.

Real World Scenarios and Future-Proofing Against LLMjacking Threats

Recently, Chinese AI startup DeepSeek made headlines for its advanced Gen AI models (such as its DeepSeek-R1 reasoning model) that compete with top-tier systems like OpenAI’s GPT-4o in both performance and efficiency.

However, just as the DeepSeek AI assistant app was surging in popularity, the company was forced to temporarily halt new user registrations after a large-scale cyberattack targeted its systems. Cybercriminals acquired DeepSeek API keys, then used them to build unauthorized reverse proxy services. These proxies allowed other people to access DeekSeek’s LLM without going through the company’s own channels. 

With the increasing use of sophisticated threats like reverse proxies, how can teams safeguard themselves and their defense systems from LLMjacking?

  1. Prevent API Key Leakage at Scale

Enforce key-scoping and IP restrictions to bind API keys to specific services and environments. Rotate keys frequently and use short-lived tokens or workload-bound identities, as via AWS STS or Azure Managed Identity.

  1. Jailbreak Automation and Model Abuse

Once behind a reverse proxy, users were free to prompt Deek Seek’s models without rate limits or content safeguards. Attackers tested and refined jailbreak prompts to produce phishing templates, malware code, and any other harmful outputs they desired. At scale. Deploy LLM-aware security tools to see and filter prompt and response behavior. Rate limit sensitive prompt categories, like code generation. Integrate output moderation and prompt fingerprinting into the API layer, even for internal use.

  1. Protect Against Silent Abuse

Stolen access generates real charges, so set cost ceilings and billing alerts for key cloud platforms. Enable anomaly detection via CNAPP to identify unexpected LLM use spikes. Isolate LLM workloads in separate billing projects for visibility and containment.

As Gen AI grows, so does its appeal to attackers. Runtime visibility, along with credential security and policy enforcement, is also evolving to include LLMjacking-specific risks. By treating models like secure assets, companies can future-proof themselves against the next wave of prompt abuse and unauthorized access.

Upwind Protects Runtimes Against LLMjacking

Upwind helps security teams detect and stop LLMjacking in real time by monitoring runtime behavior, outbound traffic, and credential use in production. Whether it’s a container suddenly calling a Gen AI model or a stolen token used across environments, Upwind surfaces behavioral and identity anomalies that signal LLM abuse before it turns into expensive unauthorized use.

With deep identity context and automated response actions, Upwind doesn’t just alert teams to potential abuse. It enables teams to isolate abuse, trace root cause, and minimize AI-related risk across the cloud environment 

To see how runtime-powered security can close the gaps that traditional tools miss, schedule a demo.

Frequently Asked Questions

What distinguishes LLMjacking from other cloud security threats?

LLMjacking stands apart from typical cloud threats by targeting the compute and access patterns of LLMs. Unlike cryptojacking, which hijacks hardware for mining, LLMjacking abuses exposed API keys or misconfigurations to covertly exploit AI services. The dynamic and distributed nature of LLMjacking makes detection harder, requiring behavior-based monitoring instead of traditional security signatures.

Which industries and AI use cases are most targeted by LLMjacking attacks?

Industries heavily invested in AI, like finance, healthcare, tech, and e-commerce, are top targets for LLMjacking. Attackers focus on environments where LLMs power customer-facing apps or automation tools, looking for chatbots, content generation, and code assistance to help conceal usage and maximize their chances that credential misuse goes unnoticed. 

Organizations managing sensitive data through LLMs face even higher stakes, as their models are attractive to both opportunistic hackers and Advanced Persistent Threats (APTs).

How quickly can organizations typically detect unauthorized LLM usage?

LLMjacking detection often takes days or weeks, especially when attackers mimic normal usage. Without detailed logging, API monitoring, or cost alerts, many teams only catch it after billing spikes or after wrestling with performance impacts. Organizations with behavioral analytics and LLM-aware SIEM tools can shorten detection to hours, reducing impact and dwell time. Some key game-changers include:

What immediate steps should security teams take after detecting a potential LLMjacking incident?

When LLMjacking is detected, revoke exposed credentials immediately and isolate affected systems. Collect forensic data like API logs, access patterns, and model history to trace the attack vector and assess impact. Remediate vulnerabilities, review resource usage and data exposure, then update monitoring and key management. Finish with a post-incident review to improve defenses.