LLMjacking: What is it? And Why is it a Concern?

LLMjacking — the term that describes scenarios in which attackers use exposed API keys or tokens to hijack Large Language Model (LLM) resources — is a brand-new issue. But although it was first identified in May 2024, it’s already a core challenge for enterprises.

With 58% of companies running LLM-based apps, LLMjacking threatens unexpected costs to the tune of $46K per day, degraded performance, and potential data leaks.

Suddenly, security engineers, DevSecOps teams, and cloud architects must get up to speed on its unique risks and implement safeguards like key management, API monitoring, and anomaly detection. We’ve already talked about related concepts like AI data security and adaptive approaches to Generative AI (Gen AI) threats. Today, let’s make an actionable plan to manage LLMjacking itself.

Understanding LLMjacking: Definition and Attack Fundamentals

Although LLMjacking shares similarities with other cloud-based threats that exploit unpatched vulnerabilities, it has distinct characteristics that security teams must understand to effectively mitigate the risk. Attackers may rely on familiar tactics to identify and exfiltrate cloud credentials and locate accessible LLMs, but they often combine these methods in novel and unexpected ways.

What is LLMjacking? A Clear Definition

LLMjacking refers to the hijacking of an LLM by an attacker using stolen credentials. In essence, it describes unauthorized access to and misuse of an LLM resource. It was first discovered when researchers found attackers using stolen cloud credentials, obtained via a Laravel vulnerability, to access cloud-hosted LLM services with the goal of selling unauthorized access to jailbroken models.

LLMjacking attack vectors used by malicious actors to compromise public-facing LLMs include:

Credential Theft and Token Exploitation: The unauthorized use of stolen API keys and cloud credentials to run AI models.
API Abuse and Reverse Proxy Techniques: Reverse proxy techniques (such as OAI reverse proxy) to conceal unauthorized API access behind masked IP addresses and rotating domains.
Cloud Configuration Vulnerabilities: Flaws and security gaps in cloud services, for example, the Laravel vulnerability (CVE-2021-3129) that resulted in the LLMjacking of a public-facing instance of an Anthropic Claude model.

The reselling component of LLMjacking is a key distinction from earlier AI attacks that focused on prompt injection or data manipulation. LLMjacking incidents leave victims not only unaware of the intrusion, but also footing the bill.

LLM exploitation has become a growing threat. In July 2024 alone, 10X the number of LLM requests to the runtime service, Amazon Bedrock. The increase was corroborated by double the number of unique IPs requesting access over the first half of the year.

With more publicly accessible chatbots and cloud-based AI services than ever, the trend is likely to continue.

LLMjacking vs. Cryptojacking

LLMjacking is distinct from cryptojacking in that it targets LLMs specifically, rather than general compute resources. While both involve the unauthorized use of computing power, cryptojacking installs mining software that strains system resources and is easier to detect.

LLMjacking, on the other hand, leverages stolen API keys or compromised environments to covertly use LLMs for tasks like spam generation or bypassing usage limits. It’s harder to spot, often blending in with normal API traffic — especially in large organizations — making fine-grained monitoring and behavior-based detection more critical.

Runtime Defense Against LLMjacking with Upwind

Upwind monitors workloads at runtime to catch the earliest signs of LLMjacking. By combining real-time detection with deep context on container behavior, Upwind helps security teams isolate abuse, trace the source, and remediate faster than traditional scanning tools.

Get a Demo

Spotting LLMjacking: Three Key Signals

While LLMjacking is harder to detect than cryptojacking, since there’s no obvious CPU spike, there are still reliable ways to catch it. Runtime security is key to all of them.

Unusual Outbound API Calls to LLM Providers

Workloads that don’t typically make external requests to AI services may suddenly begin calling domains like api.openai.com or api.anthropic.com.

*A runtime CNAPP showing a previously quiet container initiating outbound calls to APIs should be flagged as behavioral drift.*

Credential Misuse Across Environments

A token issued for a dev environment may appear in production or be invoked from a suspicious location, suggesting the credentials were stolen and reused for LLM access.

Behavior that Doesn’t Match the Container’s Role

Containers can begin downloading AI models, issuing prompts, or running code unrelated to their declared function, like inference or scraping content.

Why These 3 Signals Matter

LLMjacking doesn’t leave traditional signs of compromise. There’s no ransomware announcement, no massive CPU load, and no outbound data exfiltration visible in a Security Information and Event Management (SIEM) tool, at least not right away. These 3 signals are key because they’re:

Observable at runtime, emerging only when workloads or identities are in use. They won’t get caught by static scans or Infrastructure as Code (IaC) reviews.
Tied to behavioral change, representing a clear deviation from typical workload behavior. That’s a key sign of compromise when the goal is API misuse, not resource theft or data breach.
Difficult to fake or disguise, with attackers unable to fully obfuscate where they connect, which credentials they use, and the fact that they’ve changed a container’s behavior.

Because of the nature of LLMjacking, a runtime CNAPP with behavioral analytics and network flow visibility is arguably the most capable single solution for detecting LLMjacking today. But it won’t prevent all cases. Here’s what else matters:

Layer	Why it Matters
Credential Hygiene	Multi-factor authentication (MFA), secret rotation, and limiting the scope of API tokens.
Billing Anomaly Alerts	Especially in Azure or GCP, which charge per-use for AI services, sudden spikes in AI resource usage can trigger alerts.
Outbound Traffic Controls	Egress restrictions, from firewall rules to a service mesh or VPC controls, can stop workloads from reaching LLM APIs that they shouldn’t.
Jailbreak Prompt Filtering	Detecting prompt/jailbreak attempts (via model or proxy layer) adds LLM-specific visibility.

Detection Strategies for Enterprise Security Teams

Once runtime signals point to possible LLM abuse, like abnormal outbound API calls, credential misuse, and behavioral drift, the next step is to scale detection across environments in order to block persistent abuse and account for the fragmented modern environment. Security teams should combine runtime visibility with traditional detection pipelines to build a layered defense against LLMjacking and AI threats.

Monitoring LLM Usage Patterns and Anomalies

LLMjacking thrives on credential misuse and API abuse that can blend in with legitimate traffic. To detect it, security teams should baseline normal LLM usage, including request volume, model access, and account activity, to flag anomalies.

Key indicators of compromise (IoCs) include sudden API spikes, irregular request patterns, or access from unknown IPs or dormant accounts. Integrating LLM logs with SIEM and cloud monitoring tools improves detection by correlating this behavior across the environment.

Advanced Threat Detection Techniques

Runtime tools provide a strong foundation, but detecting more persistent abuse calls for behavioral and API-level analysis. Think about adding:

Behavioral analysis to spot anomalies like repeated, slightly varied queries common in abuse or spam
Script-based anomaly detection, like unusual request headers, timing irregularities, and malformed payloads.
Cost-based alerting, especially in platforms like Azure or GCP, where unauthorized usage may first appear as a billing anomaly.

Response Playbooks for Security Operations

Detection is only the beginning. Effective incident response should be specifically designed for credential abuse, where cost, compliance, and abuse risks all converge.

The following elements are key to building a thorough and actionable response framework.

Activity	Actions
Immediate Containment	Rotate exposed API keys or credentials. Throttle or block suspicious traffic. Isolate affected workloads or environments
Root Cause Analysis	Correlate runtime and SIEM logs. Investigate how access was obtained. Identify shadow IT deployments or misconfigured access policies
Forensics and Evidence Collection	Secure logs. Preserve session data and access records. dentify other forensics and audit trails for supporting future investigations or legal action.
Remediation	Patch vulnerabilities. Tighten IAM and outbound egress controls. Update detection rules and runtime thresholds

Together, these detection and response strategies build on runtime signals that can point to first indicators of abuse, and make sure teams can scale detection, trace activity across environments, and act quickly. It all helps teams move beyond isolated alerts to a cohesive strategy for managing LLMjacking risk at enterprise scale.

The Final Step: Minimizing the Attack Surface

With detection and response strategies set, the final step is to shrink the space where LLMjacking can happen in the first place. That means minimizing the attack surface, for fewer ways that attackers can access, abuse, and escalate the use of Gen AI in the environment. Prevention at this stage isn’t about a single control. Instead, look to create a layered approach rooted in governance, access control, and visibility.

Enforce Credential Hygiene and Scope Limitation

LLMjacking succeeds because credentials work. The fewer valid keys attackers can use, and the more narrowly scoped they are, the harder abuse becomes.

Rotate API keys regularly and expire old or unused tokens
Use short-lived, workload-bound tokens via cloud-native IAM
Apply least-privilege principles, restricting model access to only approved users and workloads
Monitor where and how credentials are used to catch drift and lateral reuse

Govern LLM Usage Across Environments

Establish internal policies that govern who can use LLMs, which models get approved, and how those interactions are tracked.

Require tagging or registration of any workload using LLM APIs
Distinguish between development, production, and testing environments in terms of allowable LLM usage
Make sure LLM access policies align with your broader AI governance framework or risk management policies

Enforce Policy Through Infrastructure

Use preventative tooling to enforce security posture by design.

Apply IaC policies to block hardcoded API keys or unapproved egress
Leverage CNAPP or CSPM to detect misconfigurations that expose credentials or LLM access
Create outbound network policies that restrict connections to only explicitly approved AI service endpoints

Maintain Centralized Visibility

Security and compliance can’t secure what they can’t see, so unify dashboards and alerts around LLM usage.

Monitor which identities, workloads, and regions are making LLM calls
Track changes in volume and cost across clouds
Score risk based on behavioral analysis, accessibility, and data sensitivity

Minimizing attack surface reinforces detection, making LLMjacking more difficult to pull off. In combination with runtime detection and smart incident response, these controls help shift AI security from reactive containment to proactive management.

Real World Scenarios and Future-Proofing Against LLMjacking Threats

Recently, Chinese AI startup DeepSeek made headlines for its advanced Gen AI models (such as its DeepSeek-R1 reasoning model) that compete with top-tier systems like OpenAI’s GPT-4o in both performance and efficiency.

However, just as the DeepSeek AI assistant app was surging in popularity, the company was forced to temporarily halt new user registrations after a large-scale cyberattack targeted its systems. Cybercriminals acquired DeepSeek API keys, then used them to build unauthorized reverse proxy services. These proxies allowed other people to access DeekSeek’s LLM without going through the company’s own channels.

With the increasing use of sophisticated threats like reverse proxies, how can teams safeguard themselves and their defense systems from LLMjacking?

Prevent API Key Leakage at Scale

Enforce key-scoping and IP restrictions to bind API keys to specific services and environments. Rotate keys frequently and use short-lived tokens or workload-bound identities, as via AWS STS or Azure Managed Identity.

Jailbreak Automation and Model Abuse

Once behind a reverse proxy, users were free to prompt Deek Seek’s models without rate limits or content safeguards. Attackers tested and refined jailbreak prompts to produce phishing templates, malware code, and any other harmful outputs they desired. At scale. Deploy LLM-aware security tools to see and filter prompt and response behavior. Rate limit sensitive prompt categories, like code generation. Integrate output moderation and prompt fingerprinting into the API layer, even for internal use.

Protect Against Silent Abuse

Stolen access generates real charges, so set cost ceilings and billing alerts for key cloud platforms. Enable anomaly detection via CNAPP to identify unexpected LLM use spikes. Isolate LLM workloads in separate billing projects for visibility and containment.

As Gen AI grows, so does its appeal to attackers. Runtime visibility, along with credential security and policy enforcement, is also evolving to include LLMjacking-specific risks. By treating models like secure assets, companies can future-proof themselves against the next wave of prompt abuse and unauthorized access.

Upwind Protects Runtimes Against LLMjacking

Upwind helps security teams detect and stop LLMjacking in real time by monitoring runtime behavior, outbound traffic, and credential use in production. Whether it’s a container suddenly calling a Gen AI model or a stolen token used across environments, Upwind surfaces behavioral and identity anomalies that signal LLM abuse before it turns into expensive unauthorized use.

With deep identity context and automated response actions, Upwind doesn’t just alert teams to potential abuse. It enables teams to isolate abuse, trace root cause, and minimize AI-related risk across the cloud environment

To see how runtime-powered security can close the gaps that traditional tools miss, schedule a demo.

Frequently Asked Questions

What distinguishes LLMjacking from other cloud security threats?

LLMjacking stands apart from typical cloud threats by targeting the compute and access patterns of LLMs. Unlike cryptojacking, which hijacks hardware for mining, LLMjacking abuses exposed API keys or misconfigurations to covertly exploit AI services. The dynamic and distributed nature of LLMjacking makes detection harder, requiring behavior-based monitoring instead of traditional security signatures.

Which industries and AI use cases are most targeted by LLMjacking attacks?

Industries heavily invested in AI, like finance, healthcare, tech, and e-commerce, are top targets for LLMjacking. Attackers focus on environments where LLMs power customer-facing apps or automation tools, looking for chatbots, content generation, and code assistance to help conceal usage and maximize their chances that credential misuse goes unnoticed.

Organizations managing sensitive data through LLMs face even higher stakes, as their models are attractive to both opportunistic hackers and Advanced Persistent Threats (APTs).

How quickly can organizations typically detect unauthorized LLM usage?

LLMjacking detection often takes days or weeks, especially when attackers mimic normal usage. Without detailed logging, API monitoring, or cost alerts, many teams only catch it after billing spikes or after wrestling with performance impacts. Organizations with behavioral analytics and LLM-aware SIEM tools can shorten detection to hours, reducing impact and dwell time. Some key game-changers include:

Runtime CNAPP deployment
SIEM or billing alerts
Credential hygiene and scoping
Monitoring tools with LLM-specific observability or API-level tracking

What immediate steps should security teams take after detecting a potential LLMjacking incident?

When LLMjacking is detected, revoke exposed credentials immediately and isolate affected systems. Collect forensic data like API logs, access patterns, and model history to trace the attack vector and assess impact. Remediate vulnerabilities, review resource usage and data exposure, then update monitoring and key management. Finish with a post-incident review to improve defenses.

LLMjacking: What is it? And Why is it a Concern?

Understanding LLMjacking: Definition and Attack Fundamentals

What is LLMjacking? A Clear Definition

LLMjacking vs. Cryptojacking

Runtime Defense Against LLMjacking with Upwind

Spotting LLMjacking: Three Key Signals

Unusual Outbound API Calls to LLM Providers

Credential Misuse Across Environments

Behavior that Doesn’t Match the Container’s Role

Why These 3 Signals Matter

Detection Strategies for Enterprise Security Teams

Monitoring LLM Usage Patterns and Anomalies

Advanced Threat Detection Techniques

Response Playbooks for Security Operations

The Final Step: Minimizing the Attack Surface

Enforce Credential Hygiene and Scope Limitation

Govern LLM Usage Across Environments

Enforce Policy Through Infrastructure

Maintain Centralized Visibility

Real World Scenarios and Future-Proofing Against LLMjacking Threats

Upwind Protects Runtimes Against LLMjacking

Frequently Asked Questions

What distinguishes LLMjacking from other cloud security threats?

Which industries and AI use cases are most targeted by LLMjacking attacks?

How quickly can organizations typically detect unauthorized LLM usage?

What immediate steps should security teams take after detecting a potential LLMjacking incident?

LLMjacking: What is it? And Why is it a Concern?

Further Reading

Top GenAI Security Risks

What is Container Orchestration?

What Is Alert Fatigue in Cybersecurity?