Secured the Access. Trust Is a Separate Question.

A few weeks ago I argued that the right response to AI-powered offense is the same thing it's always been: follow the data, apply foundational controls, and stop looking for a clever AI-specific answer to what is fundamentally a hygiene problem. Access control. Least privilege. System inventory. Configuration hardening.

That holds. None of it changed.

But there's a layer worth naming, because it's easy to skip when you've been heads-down on the access layer. We ask who can reach the tools. We don't always ask whether the tools themselves can be trusted.

Those are different questions.

We secured the pipes. We secured who can turn the valves. The next question is whether we trust what's flowing through them.

The Model Is an Attack Surface

When you deploy a model, you're deploying software built from data. Some of that data was collected and curated by people you don't know, from sources you haven't audited. Pre-training happens at a scale where systematic contamination, even subtle contamination, is difficult to detect before deployment. Fine-tuning is faster and more targeted, which makes it a more attractive poisoning window.

Poisoned models don't announce themselves. They perform normally across the vast majority of inputs. They behave unexpectedly on specific ones: particular keywords, sequences, contexts the attacker anticipated and you didn't.

The foundational control here is provenance. Know where your model came from. Audit fine-tuning datasets the same way you audit third-party code. Log inference behavior the same way you log privileged user activity. Anomaly detection on model outputs is the audit trail you didn't know you needed.
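
One way to make provenance concrete is to record the digest of a model artifact when you first review it, then refuse to load anything that doesn't match. What follows is a minimal sketch, not a full provenance system. The function names `sha256_of_file` and `verify_artifact` are assumptions made for this example, and it presumes you stored a trusted SHA-256 digest somewhere out of band.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large weight files don't fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_sha256):
    """Return True only if the file matches the digest recorded at review time.

    `expected_sha256` is assumed to come from a source you trust
    (for example, a reviewed manifest), not from the download itself.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A check like this only proves the file is the one you reviewed. It says nothing about whether the reviewed file was clean, which is why the dataset audit and the inference logging still matter.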

The Supply Chain Got Longer and Faster

AI accelerates development velocity. It also accelerates the path from compromised dependency to production.

The libraries and packages your AI-assisted pipeline touches are getting written, reviewed, and shipped faster than before. A developer who trusts their coding assistant's package suggestions is extending trust to a chain they can't fully see. That chain includes packages with names designed to be confused with legitimate ones, repositories with recent contributors who aren't who they say they are, and open-source projects that changed behavior in a minor version nobody closely reviewed.
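
Confusable names are the easiest part of that chain to check mechanically. As a rough sketch, you can compare a suggested package name against the list your team has already approved and flag near misses. The helper name `lookalike_of`, the `APPROVED` list, and the 0.8 similarity cutoff are all assumptions for illustration; real typosquat detection uses more signals than string similarity.

```python
import difflib

# Hypothetical allowlist of packages your team has already reviewed.
APPROVED = ["requests", "numpy", "pandas"]


def lookalike_of(name, known_packages=APPROVED, cutoff=0.8):
    """Return the approved package that `name` resembles, or None.

    An exact match is not a lookalike, so it also returns None.
    """
    normalized = name.lower().replace("_", "-")
    if normalized in known_packages:
        return None
    matches = difflib.get_close_matches(
        normalized, known_packages, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None
```

A non-None result is a prompt for a human to look, not a verdict. Plenty of legitimate packages have similar names.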

The foundational control is the one that predates AI: verify the source, pin your dependencies, run automated analysis on third-party code before it reaches production. AI didn't create this risk. It increased the velocity at which unreviewed dependencies move from "suggested by a tool" to "running in your environment."
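
Pinning is the easiest of those to enforce automatically. The sketch below assumes a pip-style requirements file and flags any entry that isn't pinned to an exact version with `==`. The function name `unpinned_requirements` is made up for this example, and the parsing is deliberately simple: it skips option lines such as `-r` or `--hash` and does not handle every requirement syntax pip accepts.

```python
import re

# A name, optional extras, then an exact "==" version.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned to one exact version."""
    problems = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blanks and option lines (-r, --hash, --index-url, ...).
        if not line or line.startswith("-"):
            continue
        # Wildcards such as "==4.*" still float, so treat them as unpinned.
        if not PINNED.match(line) or "*" in line:
            problems.append(line)
    return problems
```

Run in CI, a non-empty result can fail the build, so a dependency "suggested by a tool" gets an exact version and a review before it ships.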

The Developer's Own Environment Is a Target

This one is easy to overlook. It deserves more attention than it usually gets.

Skills, hooks, and MCP servers are now part of the everyday developer toolchain. They're also instruction-injection and code-execution paths that get installed, trusted, and run with the privileges of the developer's own session. A malicious MCP server doesn't need to breach your perimeter. It needs to be installed by someone who trusted the source. A compromised skill or hook does what it claims, plus something else. We've seen malicious npm packages and know to expect them. Expect the same from skills and hooks.

Prompt injection used to be treated as a model risk. Now it's a toolchain risk too. An attacker who can influence what an AI coding assistant reads, through a compromised plugin or a poisoned context hook, can influence the code that gets written and the commands that get run. On the developer's machine. Inside their session. With their credentials.

The foundational control is the same one you apply to browser extensions and IDE plugins: treat them as code with privileged access, because that's exactly what they are. Audit what's installed. Review what each component can reach. Least privilege applies to AI toolchain components, not just human accounts.
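
"Audit what's installed" can start as a simple diff between what is configured and what was approved. The sketch below assumes your assistant reads its MCP servers from a JSON config with a top-level `mcpServers` map of server name to `command` and `args`. That layout is common but not universal, so check your own tool. The function name `unapproved_servers` and the allowlist shape are assumptions for this example.

```python
import json


def unapproved_servers(config, allowlist):
    """Return (name, launch_command) pairs that don't match the allowlist.

    `config` is the parsed JSON config (assumed to have an "mcpServers" map).
    `allowlist` maps each approved server name to its exact launch command,
    written as a list: [command, arg1, arg2, ...].
    """
    findings = []
    for name, spec in config.get("mcpServers", {}).items():
        launch = [spec.get("command", "")] + list(spec.get("args", []))
        # Flag unknown servers and known servers whose command has changed.
        if allowlist.get(name) != launch:
            findings.append((name, launch))
    return findings
```

Comparing the full launch command, not just the name, matters here: a server that keeps its approved name but starts running a different binary is exactly the case you want flagged.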


We secured access to the pipes. That was the right first step.

The next question isn't a new kind of question. It's whether we trust what's in them: the models, the dependencies, the tools running inside the developer's own session. Different surfaces. Same framework.

Trust is earned through verification. You already know how to verify. Apply it here.
