AI-BOM, AI-Inventory, AI-NHI: A Practitioner’s Field Guide
Key Takeaways
- Closing the AI visibility gap takes three capabilities working together: AI-Inventory (every AI asset you have), AI-BOM (every component inside each one), and AI-NHI (every identity feeding them).
- AI-Inventory is the catalog. It answers what AI is running, where it lives, and what it's exposed to.
- AI-BOM is the bill of materials. It answers what’s inside each AI workload, all the way down to the wrapper, the framework, and the model itself.
- AI-NHI is the identity map. It answers who each AI workload runs as and what it can reach.
- Used together, these three answer the only question that matters: what’s actually running, and who’s responsible for it?
Closing the gap, in practice
The visibility gap post made the strategic case: cloud security tools were built for static infrastructure, AI workloads are runtime workloads, and the snapshot model can’t see them. The three capabilities that actually close it are AI-Inventory, AI-BOM, and AI-NHI, and I’ll walk through each one in order: what it is, what good looks like, and how to use it in day-to-day work.
A quick note on structure. AI-Inventory, AI-BOM, and AI-NHI are not three different products. They’re three views of the same runtime fabric, each answering a different question. You don’t pick one. You use all three, often in the same investigation.
A lot of the runtime AI security work my team has been doing this year, including the LLM prompt detection research we presented with Nvidia at RSAC, depends on this visibility layer as the foundation. So this is the post I wish someone had handed me before I started.
Let’s start with the one most teams reach for first.
AI-Inventory: every AI asset, with context
What it is
Think of AI-Inventory as the catalog. Every AI workload running in your environment, continuously discovered from runtime telemetry, classified by domain (AI & Machine Learning), and enriched with context: which technologies it’s using, what data it can reach, what it’s exposed to, and what risk indicators are firing on it right now.
The “continuously discovered” part matters. A static AI inventory built from tags or manual onboarding will be wrong in no time at all. Engineers spin up new workloads, deprecate old ones, swap frameworks. The inventory has to keep up without anyone filing a ticket.
What good looks like
You should be able to answer four questions about your AI estate in under a minute:
- How many AI workloads do I have, and what kind? Total count, broken down by resource type (EC2, GKE Deployment, AKS Deployment, EKS Pod, Lambda, etc.) and by AI domain (model serving, training, agentic, embedding, RAG).
- What technologies are they running? PyTorch, TensorFlow, LangChain, NLTK, spaCy, OpenAI SDK, Anthropic SDK, Bedrock SDK, Vertex SDK, MCP servers, vector DB clients. Versioned, with vulnerability status surfaced.
- Which ones are exposed? Internet-facing, exposed via internal load balancer, exposed to the wider VPC, or genuinely isolated.
- Which ones are touching sensitive data? Linked to data stores (S3, RDS, BigQuery, Snowflake) classified as PII, PHI, regulated, or business-critical.
If your inventory tool can’t answer those four questions on demand, it’s nothing more than an asset list.
How to use it
The single most useful AI-Inventory query is the one that filters for the workloads that actually matter for risk triage. Something like:
Filter:
domain = AI & Machine Learning
exposure = internet OR exposure = internal_lb
vulnerabilities.severity ≥ critical
data_access INCLUDES sensitive
This is a group that deserves priority attention. Internet-exposed or internal-LB-exposed AI workloads, running with at least one critical CVE, that can reach sensitive data. In most environments, that filter returns single-digit results. Those single-digit results are where you start.
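If your tooling exposes the inventory as an export rather than a saved query, the same filter is a short script. A minimal sketch in Python, assuming workload records as dicts; the field names are illustrative, not any specific vendor's schema:
# Rank severities so "critical or worse" becomes an ordered comparison.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}

workloads = [
    {"name": "support-assistant", "domain": "AI & Machine Learning",
     "exposure": "internet", "max_cve_severity": "critical",
     "data_access": ["sensitive"]},
    # ... the rest of your inventory export
]

triage = [
    w for w in workloads
    if w["domain"] == "AI & Machine Learning"
    and w["exposure"] in ("internet", "internal_lb")
    and SEVERITY_RANK[w["max_cve_severity"]] >= SEVERITY_RANK["critical"]
    and "sensitive" in w["data_access"]
]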
Remember the GKE deployment from the visibility gap post? The customer support assistant, running LangChain, with read access to a customer data store, with a critical CVE in its model wrapper, reaching out to a third-party API nobody filed a vendor review for? That deployment surfaces in a query like the one above. You don’t have to know it exists to find it. You just have to know what shape of risk you’re hunting for.
A second query that earns its keep: net-new AI workloads in the last seven days. AI infrastructure changes faster than most cloud infrastructure, and a weekly review of “what AI showed up that wasn’t here last week” is a low-effort, high-signal habit.
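If your tool has no built-in "new this week" view, the check is one date comparison over first-seen timestamps. A minimal sketch, assuming each record carries a timezone-aware first_seen datetime (an illustrative field name):
from datetime import datetime, timedelta, timezone

def net_new(workloads, days=7):
    # Anything first observed inside the window is net-new.
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    return [w for w in workloads if w["first_seen"] >= cutoff]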
AI-BOM: every component, all the way down
What it is
AI-BOM is the deep bill of materials for every AI asset in your inventory. Where AI-Inventory tells you that you have a LangChain workload running on a GKE pod, AI-BOM tells you which version of LangChain, which model wrapper it’s using, which underlying framework (PyTorch, TensorFlow), which serialized model artifacts it loaded, which MCP servers it’s communicating with, and which of those components have known vulnerabilities right now.
The two parts of AI-BOM that matter most are runtime scanning and static scanning, working in parallel. Static scanning catches what’s in the image, the manifest, and the requirements file. Runtime scanning catches what actually loaded into memory, including dependencies pulled at startup, models downloaded on first call, and frameworks that get patched in-place. AI workloads do all three of those things constantly, which is why static-only SBOMs miss so much.
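You can watch that gap open inside any running Python workload. A minimal sketch, assuming a requirements.txt sits next to the process; this illustrates the blind spot, it is not a substitute for a real runtime scanner:
import sys
from pathlib import Path

def static_components(requirements_path="requirements.txt"):
    # What the build manifest declares.
    lines = Path(requirements_path).read_text().splitlines()
    return {l.split("==")[0].strip().lower()
            for l in lines if l.strip() and not l.startswith("#")}

def runtime_components():
    # Top-level modules actually loaded into this process's memory.
    # (Import names and package names don't always match; a real
    # scanner normalizes both sides.)
    return {name.split(".")[0].lower() for name in sys.modules}

# Everything that loaded at runtime but was never declared at build time:
unseen = runtime_components() - static_components()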
What good looks like
A useful AI-BOM gives you three things:
- Per-resource breakdown. Pick any AI workload and see its full component tree: model, wrapper, framework, libraries, MCP server connections, third-party API calls.
- Cross-resource search. Pick any component and see every workload using it. Critical when a new CVE drops on a framework or wrapper and you need to know your blast radius in 15 minutes, not 15 days.
- Drift detection. Compare what’s running in production now against what shipped from the build pipeline. AI workloads are notorious for picking up additional dependencies at runtime. The drift report tells you what got added between deploy and now.
The cross-resource search is the one that pays for the tool. The first time a serious LangChain CVE drops and you can answer “which of our pods are running an affected version” in two minutes instead of two days, you’ll never go back.
How to use it
The query pattern that comes up most often is the cross-resource hunt:
Filter:
component.framework = LangChain
component.version < 0.4.2
resource.environment = production
Replace LangChain with whatever framework or wrapper has a fresh advisory. The result is the precise list of pods, deployments, or instances that need patching, ordered by exposure and data sensitivity if your tool ranks them that way.
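When the advisory lands and you're working from a raw BOM export instead of a console, the same hunt is a few lines. A minimal sketch using the packaging library for correct version comparison; the records and field names are illustrative:
from packaging.version import Version  # pip install packaging

resources = [
    {"name": "rag-api", "framework": "langchain",
     "framework_version": "0.3.9", "environment": "production"},
    # ... the rest of your AI-BOM export
]

affected = [
    r for r in resources
    if r["framework"] == "langchain"
    and Version(r["framework_version"]) < Version("0.4.2")
    and r["environment"] == "production"
]
# Never compare versions as strings: "0.10.0" < "0.9.0" lexicographically.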
The npm axios supply chain incident our team wrote up this March was the same shape of problem at the dependency layer. The speed of containment came down to having a usable BOM in front of us, not the speed of patching.
The second pattern that comes up: looking at a single high-risk workload and walking its full BOM tree to understand what’s actually inside it. This is the call you make when AI-Inventory has flagged something as critical and you need to know whether the criticality is the wrapper, the framework, an MCP server it’s talking to, or a model artifact it loaded from somewhere unexpected. The answer changes the remediation path.
The third pattern, less urgent but high-value: scheduled BOM diffs. Weekly or biweekly, pull the diff between this week’s AI-BOM and last week’s. New components appearing, version changes, MCP servers added. Most of it is normal development churn, but some of it is the workload you didn’t know was being built.
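The diff itself is trivial once each week's AI-BOM is snapshotted as a component-to-version map. A minimal sketch, assuming those snapshots are plain dicts:
def bom_diff(last_week: dict, this_week: dict):
    # Each snapshot maps component name -> version string.
    added = {c: v for c, v in this_week.items() if c not in last_week}
    removed = {c: last_week[c] for c in last_week if c not in this_week}
    changed = {c: (last_week[c], v) for c, v in this_week.items()
               if c in last_week and last_week[c] != v}
    return added, removed, changed
Anything in added that nobody recognizes is the finding.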
AI-NHI: every identity feeding the system
What it is
AI-NHI is the identity layer of the AI fabric. For every AI workload AI-Inventory has cataloged and AI-BOM has decomposed, AI-NHI maps the non-human identity it runs as: the service account, the workload identity, the IAM role, the API token, plus the data scopes that identity can reach and (where it exists) the human identity that owns it.
This is the layer that gets ignored most often, and it’s the one most likely to be the actual risk surface. Models and frameworks get patched. NHIs get created and forgotten. AI workloads make the problem worse because they tend to need broad read access to multiple data stores, and the easiest way to deliver that is an over-permissioned service account that nobody reviews.
What good looks like
The NHI view should let you pivot in three directions:
- From workload to identity. Pick any AI workload, see the NHI it runs as, see every permission that NHI has, see every data store it can reach.
- From identity to workload. Pick any NHI in your environment, see every AI workload using it. Highly useful for shared service accounts, which are common and risky.
- From identity to human owner. Pick any NHI, see the human owner (where one is assigned), see when the NHI was created, see when it was last reviewed.
If your NHI report can’t pivot all three ways, you can list every identity in your environment all you want. You still don’t know what any of them are doing.
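All three pivots fall out of a single flat export of (workload, NHI, owner) bindings, indexed in both directions. A minimal sketch with illustrative names, not any particular platform's schema:
from collections import defaultdict

bindings = [
    {"workload": "support-assistant", "nhi": "sa-ml-reader", "owner": None},
    {"workload": "rag-indexer", "nhi": "sa-ml-reader", "owner": "dana"},
]

# Pivot 1: workload -> identity.
workload_to_nhi = {b["workload"]: b["nhi"] for b in bindings}

# Pivot 2: identity -> workloads. Shared service accounts show up
# here as lists with more than one entry.
nhi_to_workloads = defaultdict(list)
for b in bindings:
    nhi_to_workloads[b["nhi"]].append(b["workload"])

# Pivot 3: identity -> human owner (last binding wins in this sketch;
# None means unassigned).
nhi_to_owner = {b["nhi"]: b["owner"] for b in bindings}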
How to use it
The query pattern that surfaces the highest-density risk is the “new and over-permissioned” hunt:
Filter:
identity_type = service_account
created_after = T-90d
used_by INCLUDES domain = AI & Machine Learning
data_access INCLUDES sensitive
human_owner = unassigned
That filter returns service accounts created in the last 90 days, used by AI workloads, with access to sensitive data, and no documented human owner. In most environments this is a tractable list. It’s also the cohort most likely to have been created in a hurry to ship a model and then never cleaned up.
The second pattern: orphaned NHIs. Service accounts that AI workloads were using last month but aren’t using this month. Either the workload is gone (and the NHI should be too), or the NHI got swapped (and the old one is sitting there with permissions and no purpose). Both are findings.
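The orphan check is two set differences over "NHIs observed in use" per period. A minimal sketch with illustrative identity names:
used_last_month = {"sa-ml-reader", "sa-rag-writer", "sa-train-job"}
used_this_month = {"sa-ml-reader", "sa-rag-writer-v2"}

orphaned = used_last_month - used_this_month    # permissions with no purpose
newly_seen = used_this_month - used_last_month  # confirm these were reviewed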
The third pattern, the one to run before any AI workload goes to production: the access certification view. For the NHI that workload will run as, what does it actually need to read, and what does it currently have access to? The delta between those two is the over-permissioning you can fix before you ship.
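That delta is the same set arithmetic one level down: what the NHI needs against what it holds. A minimal sketch with hypothetical permission strings (not real IAM syntax):
needed = {"s3:GetObject/support-tickets", "bedrock:InvokeModel"}
granted = {"s3:GetObject/support-tickets", "s3:*",
           "rds:Select/customers", "bedrock:InvokeModel"}

over_permissioned = granted - needed  # fix this before the workload ships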
The three together
These three capabilities answer one question from three angles: what AI is running (AI-Inventory), what each workload is made of (AI-BOM), and who it’s running as (AI-NHI).
On a real investigation, you’d work through them in order. AI-Inventory surfaces the workload that matters: internet-exposed, critical CVE, sensitive data access. AI-BOM tells you what’s actually inside it: which framework, which wrapper, which MCP servers, which model artifacts, which versions. AI-NHI tells you what identity it’s running as, what that identity can reach, and who owns it.
In about ten minutes, you’ve gone from “we have a possible AI risk somewhere” to “this specific deployment, this specific component, this specific service account, owned by this specific person, needs this specific action.” That’s the architecture the visibility gap post called for, made operational, not just a bigger CSPM.
Every supply chain disclosure my team has published this year (Kubernetes, npm, GitHub Actions) came back to the same three questions when customers asked us how to scope it: what’s running, what’s it made of, and who’s running it as. This isn’t theoretical. It’s the conversation I’ve been having all year.
Where to start
Three things you can do in your existing environment, even before you pick up new tooling.
- Pull the AI-Inventory cohort. Filter for AI/ML workloads with internet exposure and at least one critical CVE. Whatever your current tooling lets you approximate, run that query. The result is your starting work queue.
- Pick one workload from that cohort and walk its BOM by hand. Open the manifest. Open the running container. List every framework, wrapper, model artifact, and outbound dependency. The exercise will tell you both what you have and what your current tooling isn’t showing you.
- Pull every service account created in the last 90 days that has data access to a store any AI workload reads from. Look at the human owner field. Count the empty rows. That number is the size of your unattributed NHI surface.
None of those three require a new platform. They require asking the right questions and allowing the answers to tell you exactly how much of the gap your current stack is closing, and exactly where it isn’t.
The Field Guide drops summer 2026
This is the fourth post in our AI Security launch series. Over the next several weeks, our threat research, CISO, and field teams will publish deep dives on each of the three pillars (View, Protect, Validate), culminating in AI Security in 2026: A Field Guide to View, Protect, Validate, our complete reference for the discipline.
Read more from the launch series
- Cloud Security and AI Security Stop Being Two Things, by Tomer Hadassi, COO, Upwind — https://www.upwind.io/feed/upwind-ai-security-launch-2026
- The 5 Hidden Challenges of Securing Enterprise AI in 2026, by Rinki Sethi, CSO — https://www.upwind.io/feed/ai-security-challenges-cisos-2026
- The AI Visibility Gap: Why You Can’t Secure What You Can’t See, by Jake Martens, Field CISO, with commentary by Rinki Sethi, CSO — https://www.upwind.io/feed/ai-visibility-gap-cspm-blind-spots
- AI-BOM, AI-Inventory, AI-NHI: A Practitioner’s Field Guide, by Moshe Hassan, VP Product & Research — https://www.upwind.io/feed/ai-bom-inventory-nhi-practitioner-field-guide
- Stop Prompt Injection at Runtime: Inside the Multi-Step AI Attack Chain, by Avital Harel, Security Researcher Team Lead — https://www.upwind.io/feed/prompt-injection-runtime-detection-ai-attack-chain
- Silent Data Bleed: How Unsanctioned AI Egress Drains Your Cloud, by Moshe Hassan — https://www.upwind.io/feed/silent-data-bleed-unsanctioned-ai-egress
- Why Testing AI Like Software Fails and What to Do Instead, by Rinki Sethi, CSO — https://www.upwind.io/feed/why-testing-ai-like-software-fails