
Code-Level AI Governance · Part 1

You can't secure the AI you never inventoried

By Mike Carroll

You cannot secure what you have not inventoried. Software learned that the hard way in December 2021. AI is about to learn it again.

A friend of mine, Bryan Patton, a Principal at Quest Software, dropped a word on LinkedIn that I have not been able to put down: AIBOM. AI Bill of Materials.

If you have worked in software security, you know the SBOM story. After Log4Shell tore through the internet, every buyer, every auditor, every regulator wanted the same answer: what is actually in your software? The SBOM went from nice-to-have to federal requirement in months. The logic was blunt. You cannot protect what you do not know you have.

That logic is hitting AI now, and it lands harder.

Why AI is harder than a dependency

Log4j was one library, buried a few levels deep, that turned out to be dangerous. Most teams could not answer “are we affected” because they had never inventoried their dependencies with any real granularity.

Now scale that up. Instead of one library you have multiple AI providers in your codebase, often OpenAI and Anthropic and a cloud model all at once. Model versions that change behavior without a deploy. SDKs that pull their own dependencies. API calls that route customer data to infrastructure you have never audited. And increasingly, agents that call other AI tools on their own through protocols like MCP.

The difference that matters: Log4j was running code. AI providers are data processors. They ingest your customers’ data, make decisions, and produce outputs your customers rely on. When a buyer’s security team asks “what AI are you running,” that is not a checkbox. It is a data sovereignty question, a liability question, and a “do you even know what is in your own product” question.

Most vendors cannot answer it. Not because they do not care. Because they have never built the inventory.

What an AIBOM actually covers

An SBOM lists software components. An AIBOM lists AI components, and the scope is wider than people expect:

  1. Provider inventory. Every AI service your product calls, including the ones that arrived through a dependency rather than a decision.
  2. Model registry. Which models, which versions, and what data classification rides on each. A model summarizing marketing copy is a different risk than one reading customer PII.
  3. Data flow. What goes to each provider, what comes back, and where it lands. This is where NIST AI RMF, ISO 42001, and the EU AI Act put most of their controls.
  4. Dependency chain. The orchestration layers between your code and the provider. A compromised AI dependency is a supply chain vector, the same way a compromised npm package is.
  5. Jurisdiction and export exposure. Which providers operate under which country’s data laws, and which sit on an export control list.

That last point is where the room gets quiet.
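The five categories above can be sketched as one record per AI component. This is a minimal illustration, not a standard AIBOM schema and not the scanner's output format: every field name, the `needs_review` rule, and the example values are assumptions made for the sketch.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class AIComponent:
    """One AIBOM entry. Field names are illustrative, not a standard schema."""
    provider: str                  # 1. provider inventory
    model: str                     # 2. model registry: model name
    model_version: str             # 2. model registry: pinned version, or "unpinned"
    data_classification: str       # 2. what rides on it, e.g. "public", "pii"
    data_sent: list = field(default_factory=list)      # 3. data flow, outbound
    data_received: list = field(default_factory=list)  # 3. data flow, inbound
    introduced_by: str = "direct"  # 4. dependency chain: "direct" or a library name
    jurisdiction: str = "unknown"  # 5. jurisdiction
    export_list_flag: bool = False # 5. export control exposure


def needs_review(c: AIComponent) -> bool:
    """Hypothetical rule: flag export exposure, or PII with no known jurisdiction."""
    pii_without_jurisdiction = (
        c.data_classification == "pii" and c.jurisdiction == "unknown"
    )
    return c.export_list_flag or pii_without_jurisdiction


def to_json(components) -> str:
    """Serialize the inventory so it can be attached to a questionnaire or a build."""
    return json.dumps([asdict(c) for c in components], indent=2)


# Placeholder values only; no real provider or model is implied.
entry = AIComponent(
    provider="example-provider",
    model="example-model",
    model_version="unpinned",
    data_classification="pii",
    data_sent=["support ticket text"],
    introduced_by="example-orchestration-library",
)
```

Keeping `introduced_by` on the record matters: it is the field that separates a provider someone chose from one that arrived through a dependency.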

The part security teams stop and reread

Orchestration libraries like LiteLLM and LangChain make it trivial to add a provider and just as easy to lose track of one. I have scanned codebases where the team believed they ran two AI providers and the code said otherwise. You cannot govern what you cannot see.
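To make the “code said otherwise” point concrete, here is a minimal sketch that collects provider prefixes from `model=` arguments. It assumes the orchestration layer takes a `provider/model` routing string, the convention LiteLLM uses. The function name and the sample source are hypothetical, and this is not how the scanner itself is implemented.

```python
import ast


def providers_from_model_strings(source: str) -> set:
    """Collect provider prefixes from string literals passed as model=... keywords."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            is_literal = isinstance(kw.value, ast.Constant) and isinstance(
                kw.value.value, str
            )
            # Only "provider/model" literals carry a provider prefix.
            if kw.arg == "model" and is_literal and "/" in kw.value.value:
                found.add(kw.value.value.split("/", 1)[0])
    return found


# Hypothetical application code; it is parsed, never executed.
SAMPLE = '''
import litellm

litellm.completion(model="openai/gpt-4o", messages=[])
litellm.completion(model="anthropic/claude-3-5-sonnet-20241022", messages=[])
litellm.completion(model="deepseek/deepseek-chat", messages=[])
litellm.completion(model=fallback_model, messages=[])
'''

print(sorted(providers_from_model_strings(SAMPLE)))
```

The last call in the sample shows the limit of the approach: a model name held in a variable, a config file, or an environment variable is invisible to a literal scan, which is why a source scan alone is a floor, not a complete inventory.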

The sharper version of the problem is jurisdiction. Several Chinese AI providers sit on the US Bureau of Industry and Security Entity List, where use can carry export-license requirements. Others operate under China’s AI regulations, which compel government access to training data and API logs, with no GDPR- or CCPA-compatible data processing agreement available. None of that surfaces in a SOC 2 audit, because SOC 2 was never built to ask what AI is in your code.

So I built the screen for it. The open-source AIBOM scanner I wrote flags providers by jurisdiction and checks them against the Entity List, because “we did not know it was in there” is not a defense a procurement team accepts.
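A jurisdiction screen of this kind can be sketched as a lookup against reference data. The sketch below is an assumption about the general shape, not the scanner's actual logic. The provider names are placeholders, and the table is invented for illustration: real entries would have to come from the Entity List itself and from each provider's own terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderProfile:
    jurisdiction: str         # country or region code
    on_restricted_list: bool  # True if the provider appears on an export control list


# Hypothetical reference data; placeholder names, not real providers.
PROFILES = {
    "provider-a": ProviderProfile("US", False),
    "provider-b": ProviderProfile("EU", False),
    "provider-c": ProviderProfile("CN", True),
}


def screen(providers, profiles=PROFILES, approved=frozenset({"US", "EU"})):
    """Return (provider, reason) pairs for every provider that needs a human look."""
    findings = []
    for name in sorted(providers):
        profile = profiles.get(name)
        if profile is None:
            # An unknown provider is a finding, not a pass.
            findings.append((name, "unknown provider: no jurisdiction on file"))
        elif profile.on_restricted_list:
            findings.append((name, "listed entity: export-control review required"))
        elif profile.jurisdiction not in approved:
            findings.append((name, f"jurisdiction {profile.jurisdiction} not approved"))
    return findings


for name, reason in screen({"provider-a", "provider-c", "provider-x"}):
    print(f"{name}: {reason}")
```

The design choice worth copying is the first branch: a provider with no profile fails closed and is reported, because an inventory gap is exactly the condition the screen exists to catch.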

Why a static SBOM does not cover this

SBOMs work because software components hold still. You install a package, it has a version, it stays until you update it.

AI does not hold still. Models update without a code change, so your dependency tree looks identical while behavior shifts under you. Routing is dynamic, chosen at runtime by config or cost. Data sensitivity changes per call, the same endpoint handling marketing copy one minute and customer records the next. A snapshot from last quarter is already fiction.
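One way to catch the “models update without a code change” case is to flag model identifiers that are not pinned to a dated snapshot. The sketch below assumes a provider marks pinned snapshots with a trailing date stamp. That is a naming heuristic, not a guarantee, and conventions vary by provider. The model identifiers are made up for the example.

```python
import re

# Heuristic assumption: a trailing YYYY-MM-DD or YYYYMMDD marks a pinned snapshot.
_DATE_SUFFIX = re.compile(r"(?:\d{4}-\d{2}-\d{2}|\d{8})$")


def is_pinned(model_id: str) -> bool:
    """True if the identifier ends in a date stamp."""
    return bool(_DATE_SUFFIX.search(model_id))


def floating_models(model_ids):
    """Return identifiers that can change behavior without a deploy."""
    return [m for m in model_ids if not is_pinned(m)]


# Hypothetical identifiers, not real model names.
in_use = [
    "example-model-2024-08-06",
    "example-model",
    "other-model-20241022",
    "other-model-latest",
]
print(floating_models(in_use))
```

A floating alias is not wrong in itself, but it belongs in the inventory as “unpinned” so that a behavior change shows up as a known risk instead of a surprise.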

That is why I did not start with a policy document. I started with code.

The tool

The scanner is open source, zero-dependency Python, published on PyPI with a GitHub Action and SARIF output for code scanning. It detects 61 patterns across 30-plus providers, runs 34 risk rules, and maps findings to 48 controls across NIST AI RMF, ISO 42001, and the EU AI Act. It does not just grep for “import openai”. It traces orchestration layers and dependency manifests to surface every provider that could be invoked, whether anyone meant to invoke it or not.
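The gap between a grep and an inventory can be illustrated with a small sketch that reads both imports and a requirements file. This is not the scanner's code. The two lookup tables are tiny illustrative stand-ins, the function names are hypothetical, and a real scan would cover far more patterns than this.

```python
import ast
import re

# Illustrative tables only; a real inventory needs far more entries.
KNOWN_SDKS = {"openai": "OpenAI", "anthropic": "Anthropic"}
ORCHESTRATORS = {"litellm", "langchain"}  # can route to many providers at runtime


def imported_modules(source: str) -> set:
    """Top-level module names imported by a Python source string."""
    mods = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            mods.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            mods.add(node.module.split(".")[0])
    return mods


def manifest_packages(requirements_text: str) -> set:
    """Package names from requirements.txt-style text, ignoring versions and extras."""
    pkgs = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments, and option lines such as -r or -e
        name = re.split(r"[<>=!~\[;@\s]", line, maxsplit=1)[0]
        pkgs.add(name.lower().replace("_", "-"))
    return pkgs


def inventory(source: str, requirements_text: str) -> dict:
    """Merge both views: what the code imports and what the manifest installs."""
    seen = imported_modules(source) | manifest_packages(requirements_text)
    return {
        "providers": sorted(KNOWN_SDKS[m] for m in seen if m in KNOWN_SDKS),
        "orchestrators": sorted(m for m in seen if m in ORCHESTRATORS),
    }


code = "import openai\nfrom langchain.llms import Example\n"
reqs = "anthropic>=0.20\nlitellm[proxy]==1.0  # router\n"
print(inventory(code, reqs))
```

In this example the source imports one provider SDK, yet the merged view reports two providers and two orchestration layers, each of which can reach providers that never appear as an import.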

Inventory comes before governance. Every time.

So here is the only question worth asking before your next enterprise questionnaire lands: if a buyer asked you today to list every AI provider in your product, with models, versions, and jurisdictions, could you answer with evidence, or with a promise to get back to them?

Part 1 of a three-part series. Part 2, “Your SOC 2 dashboard is green and still cannot answer the AI question,” looks at why the compliance stack you already own does not close this gap.


AI governance · AIBOM · SBOM · supply chain security · Chinese AI · AI inventory