How to keep AI-assisted coding secure

When engineering leaders ask us about AI security, the question is usually framed around a dramatic scenario: a model leaking proprietary code, or an agent going rogue.

Those are worth thinking about. But they are not where most teams actually get hurt.

In our experience, the realistic risk is quieter. AI does not invent new categories of mistake. It lets engineers reach ordinary mistakes faster and accept them with more confidence: a secret committed to a repository, an insecure suggestion merged because it looked plausible, a tool given broad access for a narrow task.

GitHub wrote on April 1, 2025 that more than 39 million secrets were leaked across GitHub in 2024. That number predates the heaviest wave of AI-assisted coding. It is a useful baseline: developer convenience and security discipline already did not move together. Adding faster code generation does not automatically fix that, and can quietly widen the gap.

So the goal of this article is not fear. It is a practical control model you can actually hold engineering teams to.

The three security questions that matter

Before any tooling discussion, we want three questions answered for each approved AI workflow.

What data can this workflow expose, and to where?
What kind of unsafe output could a reviewer wave through?
How much access does the tool or agent actually have, versus how much the task needs?

If a team cannot answer those three, the workflow is not secured. It is tolerated.

Risk 1: secrets and data exposure

The first risk is the most boring and the most common.

AI coding tools work better with context, so engineers feed them more: files, logs, configuration, sometimes whole repositories. Each of those is a path for a secret or sensitive value to leave a controlled boundary, either into a third-party model or into a generated artifact that later gets committed.

The controls here are not new. They are the discipline we already knew we needed:

secret scanning on commit and in CI, enforced rather than advisory
clear rules about which repositories and data classes may be shared with which tools
short-lived credentials over long-lived static secrets wherever possible
a named rule for what must never be pasted into a prompt or shared context

The point is not that AI created secret leakage. It is that AI increases the number of moments where a secret can move, so the existing controls have to be real rather than aspirational.
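
As a rough illustration of what "enforced rather than advisory" can mean, here is a minimal sketch of a scan that fails instead of warning. The three patterns and the function names `find_secrets` and `enforce` are assumptions made for this example; a real deployment would rely on a dedicated scanner with a much larger rule set.

```python
import re

# Illustrative patterns only (assumed for this sketch, not a complete rule set).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "generic_assignment": re.compile(
        r"(?i)\b(?:password|secret|api_key|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


def enforce(text: str) -> None:
    """Raise instead of warning, so the check blocks rather than advises."""
    findings = find_secrets(text)
    if findings:
        raise ValueError(f"possible secrets found: {findings}")
```

The same check could plausibly run in two places: on text about to be committed, and on text about to be pasted into a prompt or shared context.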

Risk 2: insecure code suggestions accepted too easily

The second risk is acceptance.

A model can produce code that runs, passes a quick test, and still carries an insecure pattern: an injection-prone query, weak input handling, an outdated cryptographic choice, or a dependency that should not be there.

The danger is not that the suggestion exists. It is that AI output often reads as more authoritative than a half-finished human draft, which makes a reviewer more likely to accept it without the same scrutiny.

This is where `a human reviews it` quietly fails. If the reviewer is checking whether the code works, not whether the code is safe, then the security review never actually happened.
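
To make the "runs, passes a quick test, and is still unsafe" pattern concrete, here is a small sketch using Python's standard `sqlite3` module. The `users` table and both function names are hypothetical and exist only for this example.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Injection-prone: the input is spliced into the SQL text itself.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the value is passed separately from the SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "nobody' OR '1'='1"
print(find_user_unsafe(conn, "alice"))       # [(1, 'alice')], the quick test passes
print(len(find_user_unsafe(conn, payload)))  # 2, every row is returned
print(len(find_user_safe(conn, payload)))    # 0, the payload is treated as a plain name
```

Both functions return the same result for ordinary input, which is why a reviewer who only asks "does it work" would not see the difference.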

Risk 3: prompt injection and untrusted context

The third risk is the one most specific to AI systems, and the one teams understand least.

The OWASP Top 10 for Large Language Model Applications lists prompt injection as its first entry. In an engineering context, this matters as soon as your AI workflow reads untrusted content: an issue description, a web page, a dependency's documentation, a file from an external source, or the output of another tool.

If a model can be steered by text it was only supposed to read, then an agent with the ability to act (open a pull request, run a command, call an API) can be steered into doing something the engineer never intended.

You do not need to be building a frontier product for this to apply. You only need an agent that both reads untrusted input and has permissions.
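
One way to design against that combination is to track whether untrusted text has entered an agent's context and to require a human for privileged actions once it has. The sketch below is an assumption-laden illustration: `AgentSession`, the action names, and the `trusted` flag are invented for this example and do not correspond to any particular agent framework.

```python
from dataclasses import dataclass, field

# Hypothetical action names, chosen only for this sketch.
READ_ONLY_ACTIONS = {"read_file", "search_code", "propose_patch"}
PRIVILEGED_ACTIONS = {"open_pull_request", "run_command", "call_api"}


@dataclass
class AgentSession:
    """Tracks whether untrusted text has entered the agent's context."""

    context: list[str] = field(default_factory=list)
    tainted: bool = False

    def add_context(self, text: str, trusted: bool) -> None:
        self.context.append(text)
        if not trusted:
            # Issue text, web pages, external files, and tool output land here.
            self.tainted = True

    def authorize(self, action: str, human_approved: bool = False) -> bool:
        if action in READ_ONLY_ACTIONS:
            return True
        if action in PRIVILEGED_ACTIONS:
            # Once untrusted text is in context, privileged actions need a human.
            return human_approved or not self.tainted
        return False  # unknown actions are denied by default
```

The sketch does not try to detect malicious text. It assumes detection is unreliable and narrows what a steered agent is able to do.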

Mapping the risk to the control

We find it clearer to keep the model on one page.

Risk	What goes wrong	Primary control
Secret and data exposure	Sensitive values leave a controlled boundary	Enforced secret scanning, data-sharing rules, short-lived credentials
Insecure suggestions	Unsafe code accepted because it looks authoritative	A security-specific review standard, not just `does it work`
Prompt injection	Untrusted input steers a model or agent	Treat all read content as untrusted; constrain agent permissions
Over-broad access	A tool can do more than the task requires	Least privilege for tools, agents, and tokens

None of these controls are exotic. The work is making them explicit and enforced, not assumed.

Why `a human reviews it` is not a security control by itself

We push back hard on this phrase, because it is doing too much work in most rollouts.

Human review is only a security control if the reviewer knows what they are validating. For security, that means the review standard has to name the things AI makes easier to miss:

Does this change introduce or expose a secret?
Does this code follow our input-handling and injection-prevention rules?
Did an agent touch anything outside the intended scope?
Was any untrusted content allowed to influence a privileged action?

If those questions are not on the reviewer's checklist, the team has speed without a control. That is the state we most want engineering leaders to avoid, because it feels safe while it is not.

This is the security version of a point we make about approved workflows generally: human oversight only counts when the reviewer can name what they are checking.
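
If a team wanted the four questions above to be enforced rather than remembered, one option is a merge gate that refuses changes whose review record leaves any of them unanswered. This is only a sketch: the dictionary keys, `missing_checks`, and `can_merge` are hypothetical names, and how the answers are collected (a pull request template, a form, a bot) is left open.

```python
# The four review questions from the list above, keyed by hypothetical short names.
REQUIRED_CHECKS = {
    "secrets": "Does this change introduce or expose a secret?",
    "injection": "Does this code follow our input-handling and injection-prevention rules?",
    "scope": "Did an agent touch anything outside the intended scope?",
    "untrusted_input": "Was any untrusted content allowed to influence a privileged action?",
}


def missing_checks(review: dict[str, str]) -> list[str]:
    """Return the checklist questions the reviewer has not explicitly answered."""
    return [
        question
        for key, question in REQUIRED_CHECKS.items()
        if not review.get(key, "").strip()
    ]


def can_merge(review: dict[str, str]) -> bool:
    """A change is mergeable only when every named check has an answer."""
    return not missing_checks(review)
```

A gate like this cannot judge whether an answer is correct. It only guarantees that each question was put to the reviewer.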

Least privilege for AI tools and agents

The strongest single habit we recommend is least privilege, applied to tools and agents the same way you would apply it to a service account.

Concretely:

give an AI tool access to the repositories and data the workflow needs, not the whole estate
scope tokens narrowly and rotate them
separate read-and-suggest workflows from act-and-execute workflows, and review the second class far more strictly
require human confirmation before an agent performs an irreversible or outward-facing action

An agent that can only read and propose is a manageable risk. An agent that can read untrusted input and also act with broad permissions is the combination worth designing against.

This aligns with the direction of regulation as well. The EU AI Act, Regulation (EU) 2024/1689, treats human oversight of high-risk AI systems as a real obligation in Article 14, and NIST's AI Risk Management Framework frames security and resilience as core properties of trustworthy AI systems rather than optional extras.
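
The least-privilege habits listed in this section could be expressed as a per-workflow policy that is checked before every agent action. The following is a minimal sketch under stated assumptions: `WorkflowPolicy`, `check`, and the repository and action names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowPolicy:
    """Hypothetical policy: what one AI workflow may touch and do."""

    name: str
    allowed_repos: frozenset[str]
    allowed_actions: frozenset[str]
    needs_confirmation: frozenset[str] = frozenset()


def check(policy: WorkflowPolicy, repo: str, action: str, confirmed: bool = False) -> str:
    if repo not in policy.allowed_repos:
        return "deny: repository out of scope"
    if action not in policy.allowed_actions:
        return "deny: action not granted"
    if action in policy.needs_confirmation and not confirmed:
        return "hold: human confirmation required"
    return "allow"


# Two classes of workflow, kept separate as the list above recommends.
suggest = WorkflowPolicy(
    name="read-and-suggest",
    allowed_repos=frozenset({"payments-api"}),
    allowed_actions=frozenset({"read_file", "propose_patch"}),
)
execute = WorkflowPolicy(
    name="act-and-execute",
    allowed_repos=frozenset({"payments-api"}),
    allowed_actions=frozenset({"read_file", "open_pull_request", "deploy"}),
    needs_confirmation=frozenset({"deploy"}),
)

print(check(suggest, "payments-api", "propose_patch"))      # allow
print(check(suggest, "billing-db", "read_file"))            # deny: repository out of scope
print(check(suggest, "payments-api", "open_pull_request"))  # deny: action not granted
print(check(execute, "payments-api", "deploy"))             # hold: human confirmation required
```

Keeping the policy as data rather than scattered conditionals also makes the permission scope of each workflow something that can be listed and reviewed.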

What we would measure

You cannot manage AI security from a single dashboard, but a small set of signals tells you whether the controls are real.

Signal	What it tells you
Secret-scanning coverage and catch rate	Whether the most common failure is actually contained
Security findings in AI-assisted changes	Whether reviewers are applying the security standard
Agent permission scope per workflow	Whether least privilege is enforced or aspirational
Untrusted-input pathways into agents	Where prompt injection risk actually lives

Silence is not a signal of safety here. If you are catching nothing, the most likely explanation is that no one is looking, not that nothing is happening.
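
A small sketch of how the first signal might be computed without mistaking silence for safety. The definitions here are assumptions made for this example: coverage is taken as scanned repositories over all repositories, and catch rate as secrets caught before merge over all secrets known to have existed.

```python
def secret_scanning_signal(
    repos_total: int,
    repos_scanned: int,
    caught_before_merge: int,
    found_after_merge: int,
) -> dict:
    """Summarize secret-scanning coverage and catch rate (assumed definitions)."""
    coverage = repos_scanned / repos_total if repos_total else 0.0
    known = caught_before_merge + found_after_merge
    catch_rate = caught_before_merge / known if known else None

    if known == 0:
        # Nothing found anywhere: this is silence, not evidence of safety.
        status = "no findings: verify that scanning is actually running"
    elif coverage < 1.0:
        status = "partial coverage: findings understate the real exposure"
    else:
        status = "measured"
    return {"coverage": coverage, "catch_rate": catch_rate, "status": status}
```

Returning `None` for the catch rate when nothing has been found keeps an empty result from being reported as a perfect score.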

Our view

The healthiest way to think about AI security in engineering is not as a special new domain. It is as a stress test of controls you should already have.

AI raises the stakes in three ordinary places: it moves more data, it makes unsafe suggestions easier to accept, and it introduces untrusted input as a steering risk. The teams that handle this well are not the ones with the most sophisticated tooling. They are the ones who made their controls explicit, scoped agent access to the task, and rewrote the review standard to name what AI makes easy to miss.

That is unglamorous work. It is also what lets a leader say, honestly, that the speed they gained did not quietly move the organization into a weaker security state.

Sources

GitHub, GitHub found 39M secret leaks in 2024. Here's what we're doing to help, April 1, 2025
OWASP, OWASP Top 10 for Large Language Model Applications, accessed 2026-06-10
NIST, Artificial Intelligence Risk Management Framework (AI RMF 1.0), accessed 2026-06-10
EUR-Lex, Regulation (EU) 2024/1689, Article 14

Frequently asked questions

Why does AI-assisted coding increase the risk of secret leaks even when developers follow existing guidelines?

AI tools perform better with more context, so engineers naturally feed them more: files, logs, configuration, sometimes entire repositories. Every additional input is a new path for a secret to leave a controlled boundary, either into a third-party model or into a generated artifact that later gets committed. GitHub reported 39 million secret leaks across GitHub in 2024, before the heaviest wave of AI-assisted coding even arrived. AI multiplies the number of moments where a secret can move, so secret scanning must be enforced rather than advisory.

How does AI make insecure code suggestions harder to catch during code review?

AI output typically reads as more authoritative than a half-finished human draft, making reviewers more likely to accept it without the same level of scrutiny. A suggestion can run, pass a quick test, and still carry an injection-prone query, weak input handling, or an outdated cryptographic choice. If the reviewer is checking whether the code works rather than whether it is safe, the security review never actually happens. Review standards need to explicitly name the things AI makes easy to miss (secrets, injection risks, scope creep), not just functional correctness.

What is prompt injection and when does it become a real threat in an engineering context?

Prompt injection is when a model is steered by text it was only supposed to read. OWASP lists it as the top risk for LLM applications. In an engineering workflow, the threat becomes real as soon as an agent reads untrusted content such as an issue description, a web page, a dependency's documentation, or external file output. If that same agent also has permissions to open pull requests, run commands, or call APIs, it can be manipulated into performing actions the engineer never intended. The dangerous combination is an agent that both reads untrusted input and can act with broad permissions.

Why is 'a human reviews it' not a sufficient security control for AI-generated code?

Human review only functions as a security control when the reviewer knows specifically what they are validating. For AI-assisted workflows, that means the checklist must explicitly ask: does this change introduce or expose a secret, does the code follow injection-prevention rules, did the agent act outside intended scope, and was untrusted content allowed to influence a privileged action. Without those named checks, a team has speed but not control, a state that feels safe while it is not. Least privilege for agents and scoped tokens are the structural safeguards that reduce reliance on review catching everything.