Summary
- A free-roaming agent fails on security before it fails on accuracy - OWASP lists prompt injection as the number-one LLM risk and “excessive agency” at number six. An agent that can read anything and act on anything is the textbook setup for both.
- The fix is structural, not a better prompt - bind the agent to a defined workflow that scopes which tools it touches, at which step, with a human sign-off where the stakes are high. The workflow becomes the permission layer.
- The market is already correcting - Gartner expects over 40% of agentic AI projects to be canceled by the end of 2027, drawn from a poll of more than 3,400 organizations. Unclear value and weak risk controls are the named causes.
- Want to scope AI to one process safely? Map the workflow first, then connect the agent to those exact steps. Start with one workflow in Tallyfy
An IT lead on r/sysadmin laid out the kind of story that makes security people wince. First the company hit the “just connect the LLM to everything” phase, and it went about how you’d expect: an assistant cheerfully surfaced sensitive legal documents to people who had no business reading them. Now leadership has moved on to the next shiny thing. They want AI agents that can trigger real workflows across the business. The poster’s question was simple, and a little desperate. How do you get ahead of the agent risk before it turns into an incident?
My answer is short, and then I’ll defend it. Don’t let the agent roam. Bind it to a defined workflow that says exactly what it can touch, exactly when, and exactly which steps need a human to sign off first.
An agent with the run of your systems is a postmortem waiting to happen. An agent scoped to one process, with named steps and approval gates, is closer to a fast employee who can’t go off-script.
The timing is what makes this land so hard. Every leadership team has seen the same agent demo, the one where a model books travel, answers email, and looks like the future. Almost none have watched what that agent does in week six, when an edge case it never met walks in the door and it improvises against your production systems. A demo runs on rails someone laid for it. A deployment runs on your live data, with real attackers and real edge cases, and that gap is exactly where the risk lives. So the question the r/sysadmin poster asked, how do I get ahead of this, is the right one to be asking, and it’s far cheaper to answer before the rollout than after the cleanup.
Workflow Automation Software Made Easy & Simple
What breaks when you just connect the LLM
The legal-docs leak wasn’t a freak event. It’s the predictable result of wiring a model into systems it can read freely. Security researcher Simon Willison calls the dangerous pattern the lethal trifecta: give an AI system access to your private data, expose it to untrusted content, and let it communicate externally, and “an attacker can easily trick it into accessing your private data and sending it to that attacker.” Most “connect the LLM to everything” projects hand the model all three on day one. Nobody decides to build an exfiltration path. It just falls out of the convenience.
Walk the leak through that frame and it’s almost tidy. The assistant could read the document store, which is the private data. It ingested whatever users pasted, which is the untrusted content. And it answered freely in a shared channel, which is the external communication. Three boxes ticked, no attacker even required, just a careless question and an over-eager model. The lesson isn’t that the team was sloppy. It’s that “connect the LLM to everything” is the trifecta as a default setting, and you have to design your way back out of it on purpose.
Prompt injection is the mechanism, and it’s not exotic. A poisoned email, a booby-trapped PDF, a comment buried in a shared doc: any of them can carry instructions the model treats as commands. The thing is, a model can’t reliably tell your instruction from an attacker’s, because to the model it’s all just text, basically. That’s why “we’ll write a stricter system prompt” is a tough sell as a defense. You’re trying to out-argue every attacker who will ever touch your data, and you only have to lose once.
A free-roaming agent fails the OWASP test
Run a roaming agent against the standard security checklist and it fails on the first page. OWASP’s Top 10 for LLM Applications ranks prompt injection as LLM01, the single most likely thing to go wrong, and lists “excessive agency” as LLM06: a system granted too much autonomy and too many permissions, free to act beyond what anyone intended. A free-roaming agent is excessive agency by design. You built the vulnerability into the architecture and then asked the prompt to please not exploit it.
OWASP splits excessive agency into three flavors worth knowing by name: too many permissions, too much autonomy, and too much functionality. A roaming agent tends to collect all three at once. It can reach systems it never needs, it acts with no checkpoint, and it’s wired to tools far beyond its actual job. Tightening the prompt touches none of that. A prompt is a request. Permissions are a fact. The day an injected instruction overrides your careful wording, and that day arrives, the only thing between the model and your data is what you really let it reach.
So the instinct to fix this with smaller silos or cleverer wording misses where the problem lives. The danger isn’t that the model is dumb. It’s that it’s capable and unconstrained at the same time.
The market noticed. Gartner expects more than 40% of agentic AI projects to be canceled by the end of 2027, pulling from a poll of over 3,400 organizations, and the cancellations trace back to unclear value and weak risk controls, not to the agents being incapable. Capable is the easy part.
Capable with no fence around it is the incident.
Workflow is the permission layer
So where does the fence come from? You already own the answer, and it’s the most boring tool in the building: a defined workflow. A workflow is a sequence of steps, each with an owner, an input, and a rule for what happens next. Once the work is shaped that way, you can scope the agent to a single step, hand it only the tools that step needs, and require a human to approve before the process moves to anything irreversible. The workflow stops being just documentation. It becomes the thing that decides what the AI is allowed to do.
This maps cleanly onto how security people already think. The NIST AI Risk Management Framework organizes the whole problem into Govern, Map, Measure, and Manage, and the oldest principle in the book still applies: least privilege. Give the agent the narrowest access that lets it finish its one step, and nothing more. A bounded process is how you do that without writing a security essay for every task. We built Tallyfy’s automation rules around exactly this idea, because a process that already names every step and owner is a process you can safely point an agent at.
Picture each step as a door with its own key. The intake step can read the form and nothing else. The approval step is the only place a human releases a payment. Your agent gets the key to one room, used at one moment, and the process logs every time a door opens. That’s least privilege expressed as a process instead of a config file, and it’s far easier for a non-engineer to reason about. When an auditor asks what the AI can do, you don’t hand them a permissions matrix nobody can read. You hand them the workflow, because the workflow is the answer.
So what does a bounded agent actually look like?
Picture an invoice-exception agent, the kind leadership keeps asking for. The free-roaming version gets read and write access to the finance system and a cheerful “go save us time.” The bounded version lives inside an accounts-payable workflow and does precisely one thing: at the review step, it reads each invoice, flags the line that looks off, and writes a note. It cannot send a payment. It cannot touch a vendor record. The next step is a human who approves or rejects, and only then does the process continue.
Same model, same intelligence, radically different blast radius.
The same shape works anywhere the pressure to “add an agent” shows up. Take customer support. The roaming version gets your whole helpdesk and a prayer. The bounded version reads an incoming ticket at the triage step, suggests a category and a draft reply, and stops. A human edits and sends. The agent never closes a ticket on its own, never issues a refund, never touches a record it wasn’t handed. You get the speed of a model on the reading-and-drafting part, and zero new ways for a rough afternoon to become a breach.
That’s also how modern AI should connect to a process in the first place: through a Model Context Protocol server that exposes only the steps and data the agent is cleared for, rather than a raw key to the whole system. The agent acts inside the tracked workflow, every action lands in the audit trail, and the approval gate is a real step, not a hopeful instruction. This is the practical shape of AI as workflow infrastructure: the agent supplies judgement on one bounded task, the process supplies the guardrails. An AI agent without a workflow has nothing to stand on, so it improvises, and improvising is exactly what you don’t want near payments or client data.
Start narrow, then widen on evidence
A mistake we made early on was assuming the safe rollout was the slow one. It’s the opposite. The fastest way to get an agent into production without a 2am phone call is to start it read-only. Let it read and suggest at one step, with a human approving every move, and watch what it actually does for a couple of weeks. You’ll learn more from ten real runs than from a month of threat-modeling in a doc, because real inputs surface the edge cases your imagination skips.
The ladder is simple enough to sketch on a napkin. Stage one, the agent only reads and suggests, and a person does every action. Stage two, it can complete one low-stakes step on its own, with the result logged and reversible. By stage three, and only after weeks of clean runs, it earns a second step. You never jump straight to autonomy, and you never grant write access to anything irreversible without a human gate in front of it. Each rung earns the next. If a step misbehaves, you drop it back a rung instead of tearing the whole system out, which is what happens to the teams that handed over the keys on day one.
What nobody warned us about is how much this calms the room. Once leadership can see the agent confined to one workflow, with an approval gate they control and an audit trail they can read, the “is this safe?” anxiety drops away, because the real answer is yes, by construction. Point an AI at a vague, undefined job and you’ve bought a liability with a friendly chat interface. Point it at a single defined step inside a process you already trust and it inherits that process’s safety instead of inventing its own.
That’s the quiet payoff nobody puts in the agent pitch. A bounded agent is easy to approve, because there’s something concrete to approve: this step, this access, this gate. A roaming one asks for trust you have no real basis to give, and the smart people in the room can feel it, which is why those projects stall in committee while the scoped ones ship.
So before you green-light an agent that can trigger workflows, do the unglamorous part first. Map the workflow. Name every step and owner. Mark the irreversible steps and put a human on them. Then connect the agent to the two or three steps where a model earns its keep, the reading, the classifying, the drafting, and leave the rest alone. The agent was never the hard part. The fence around it is the work, and it’s work you can finish this quarter instead of explaining to a regulator next year.