When Your AI Agent Gets It Wrong

Everyone is deploying AI agents. Almost nobody has decided who owns the call when one gets it wrong. The accountability gap and how to close it.

It's 2am. A patient sends a WhatsApp message describing chest tightness and difficulty breathing 48 hours after an IV therapy session. The AI agent reads the message, classifies it as mild post-treatment discomfort, and sends back a reassurance template with hydration tips.

The care team never sees it.

This is the scenario we designed around when we built the agentic patient support system at Reviv Italy. Not because it's inevitable. Because it's possible, and possible in healthcare means you plan for it before you ship, not after. The agent needed to know exactly which signals should trigger an immediate escalation to the on-call clinician, which ones it could handle autonomously and what happened when it couldn't tell the difference. Designing that logic took longer than building the agent. Writing it down, testing it and getting the medical team to sign off on it took longer still.

Most companies are skipping this step. And right now, more companies than ever are deploying agents.

Gartner projects that 40% of enterprise applications will include task-specific agents by the end of 2026. IDC puts year-over-year AI spending growth at 31.9% through 2029. Half of enterprises are already running agentic systems in some form. The market has decided agents are happening. The question that hasn't been answered at most of those companies is simpler and harder: when the agent gets it wrong, who owns the consequence?

That's not a philosophical question. It's an operational one. And until you answer it, you don't have an agent deployment. You have a liability waiting to surface.

The shift most teams are missing

The way most product and engineering teams talk about agents treats them as an extension of automation. You had a human doing X. Now an agent does X faster. The human moves on to higher-value work. Productivity goes up. Done.

That framing is wrong in one specific and consequential way. When a human does X, there's a person accountable for the output. If the output is wrong, you know who to go to, who has the context, who makes the correction. The accountability is baked into the organizational structure.

When an agent does X, the accountability doesn't automatically transfer. It evaporates, unless someone actively names where it goes.

This is what I mean by accountability reassignment. Deploying an agent isn't just automating a task. It's moving a decision from a person, who has implicit ownership, to a system, which has none. The system doesn't feel the consequence of being wrong. It doesn't course-correct based on a bad outcome. It runs the same logic next time unless someone changes it. For that to happen, someone needs to notice the mistake, understand why it happened and have both the access and the authority to fix it.

Most agent deployments don't define that person. They define the agent.

Three ways this goes wrong

The agent gets blamed when it's the design that's wrong. An agent behaves exactly as programmed and produces a bad outcome. The team investigates, discovers the agent "made an error" and treats it as a model failure. They fine-tune the prompt, tweak the workflow, redeploy. Nobody asks whether the agent should have been making that decision at all. The design problem stays invisible because the post-mortem focused on the output rather than the decision boundary.

The escalation path exists on paper but not in practice. The team builds a fallback: when confidence is below a threshold, the agent hands off to a human. The threshold gets set once, at deployment. Six months later, the volume has doubled, the human queue is backlogged and the agent is quietly handling cases it used to hand off, because someone lowered the threshold to reduce the queue. The change looks like an optimization in the dashboard and like a governance failure in the incident report.

Nobody owns the agent's decisions in the org chart. The ML engineer owns the model. The product manager owns the feature. The operations team owns the workflow it runs in. The legal team owns compliance. When the agent produces a bad outcome, the response is a meeting with all four functions, three weeks of Slack threads and a RACI exercise. The agent keeps running in the meantime.

All three of these are symptoms of the same root problem: the team designed the agent's behavior without designing the accountability structure around it.

The three questions every agent deployment needs answered before it ships

These aren't questions to put in a design doc and forget. They're questions with named owners, written decisions and a review cadence. They get updated when the agent's scope changes.

1. What decisions is this agent making, and which ones should it never make?

This sounds obvious. It isn't. Most agent specs describe what the agent does, not where its authority ends. "Handles patient inquiries" is a task description. The actual decision boundary looks like this: the agent can answer questions about treatment protocols, appointment logistics and post-care guidance. It cannot make clinical assessments, modify care plans or classify symptoms as emergencies. Any message containing a specific set of symptom keywords routes immediately to the on-call clinician with no agent response.

That boundary needs to be written, reviewed by everyone whose domain the agent operates in and signed off before deployment. It also needs to be tested. Not just for the cases the agent handles well. For the edge cases where it would get it wrong.
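A minimal sketch of what a written boundary can look like in Python, with the escalation check running before the agent is allowed to respond. Everything named here is hypothetical: the keyword list, the topic labels, the `route_message` function and the assumption that an upstream step supplies a topic label. It is an illustration of the idea, not the logic we shipped, and a real keyword list would come from the medical team.

```python
from enum import Enum
from typing import Optional


class Route(Enum):
    ESCALATE = "escalate_to_on_call_clinician"  # no agent response at all
    AGENT = "agent_may_respond"
    HUMAN_REVIEW = "hand_off_to_human_queue"


# Hypothetical examples only; the real list is a clinical decision.
ESCALATION_KEYWORDS = (
    "chest tightness",
    "chest pain",
    "difficulty breathing",
    "shortness of breath",
)

# Hypothetical labels for the topics the agent is allowed to handle.
ALLOWED_TOPICS = {"treatment_protocol", "appointment_logistics", "post_care_guidance"}


def route_message(text: str, topic: Optional[str]) -> Route:
    """Decide who handles a message. Escalation is checked first, always."""
    normalized = " ".join(text.lower().split())  # lowercase, collapse whitespace
    if any(keyword in normalized for keyword in ESCALATION_KEYWORDS):
        return Route.ESCALATE
    if topic in ALLOWED_TOPICS:
        return Route.AGENT
    # Anything the boundary does not explicitly allow goes to a person.
    return Route.HUMAN_REVIEW


# The edge case from the opening: labelled as post-care, but must escalate.
msg = "Chest tightness and difficulty breathing since my IV session"
print(route_message(msg, "post_care_guidance"))
```

The ordering is the design choice: the escalation check does not depend on how the message was classified, so a wrong topic label cannot route an emergency to a reassurance template.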

2. When the agent makes a wrong call, who finds out and how fast?

This is the monitoring question, and most teams answer it with a dashboard. Dashboards don't page anyone at 2am. Define the failure modes. For each one, define the alert: who gets it, through what channel, within what time window. For high-stakes agents, the answer to "who finds out" should be a named person with a backup, not a team.

The monitoring architecture should be designed around the severity of the failure, not the convenience of the tooling. A wrong product recommendation in an ecommerce flow is a different risk profile from a wrong clinical classification in a healthcare agent. Design the alert structure accordingly. The alert for a critical failure should be harder to miss than the thing it's alerting about.
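One way to make "who gets it, through what channel, within what time window" concrete is a small alert table keyed by failure mode. The sketch below is an assumption about shape, not a real paging integration: the failure-mode names, channels, time windows and the `recipient` function are all hypothetical, and the role labels stand in for what should be named individuals.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AlertRule:
    severity: str
    primary: str             # in practice a named person, not a role or team
    backup: str              # paged if the primary does not acknowledge in time
    channel: str
    ack_within_minutes: int


# Hypothetical failure modes and values, for illustration only.
ALERT_RULES = {
    "missed_escalation": AlertRule(
        "critical", "on-call clinician", "medical director", "phone_page", 5
    ),
    "wrong_logistics_answer": AlertRule(
        "low", "support lead", "product owner", "daily_digest", 1440
    ),
}


def recipient(failure_mode: str, minutes_unacknowledged: int) -> Tuple[str, str]:
    """Return (person, channel) for an alert that has waited this long."""
    rule = ALERT_RULES.get(failure_mode)
    if rule is None:
        # An undefined failure mode is treated as critical rather than dropped.
        rule = ALERT_RULES["missed_escalation"]
    if minutes_unacknowledged < rule.ack_within_minutes:
        return rule.primary, rule.channel
    return rule.backup, rule.channel


print(recipient("missed_escalation", 0))
print(recipient("missed_escalation", 7))
```

Severity drives the channel and the window here, which is the point of the paragraph above: a critical failure pages a phone within minutes, while a low-risk one can wait for a digest.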

3. Who can change the agent's behavior, and under what conditions?

This is the hardest question because it cuts across functions. The ML engineer who fine-tunes the model and the product manager who adjusts the workflow are both changing the agent's behavior. If they do it independently, without a shared decision record, the agent's behavior drifts in ways nobody intended and nobody can explain.

Define the change process before you deploy. Not a heavyweight approval chain for every prompt tweak. A lightweight record: what changed, why, who approved it, what was the expected effect and what does the monitoring show. This is the operating document for the agent, the same way a product operating document is the thing that survives when the Slack thread doesn't.
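The lightweight record can be as small as one structured entry per change. A sketch, assuming an append-only log of JSON lines; the `ChangeRecord` class, its field names and the example values are hypothetical and simply mirror the five questions above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ChangeRecord:
    changed_on: str                          # ISO date, e.g. "2025-01-15"
    what_changed: str
    why: str
    approved_by: str                         # a named person
    expected_effect: str
    observed_effect: Optional[str] = None    # filled in once monitoring has data

    def __post_init__(self) -> None:
        # A record without an approver or a reason is not a decision record.
        for name in ("what_changed", "why", "approved_by", "expected_effect"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")


def to_log_line(record: ChangeRecord) -> str:
    """Serialize one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


record = ChangeRecord(
    changed_on="2025-01-15",
    what_changed="Lowered hand-off confidence threshold",
    why="Human queue backlog",
    approved_by="Named owner (hypothetical)",
    expected_effect="Fewer hand-offs, no change in missed escalations",
)
print(to_log_line(record))
```

The validation is deliberately the only logic: the record refuses to exist without a reason, an approver and an expected effect, which is what lets someone later compare the expectation with what the monitoring shows.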

Why pilots don't scale

The pattern I've seen across companies of different sizes, in healthcare, fintech and consumer products, is that agent pilots succeed and production deployments struggle. The pilot worked because someone on the team was watching it obsessively. They noticed the edge cases, caught the wrong outputs before they reached users and iterated fast because they were close to the work. The accountability structure was one person with context and urgency.

Then the pilot got handed off to production, spread across a larger user base and put on the roadmap as "done." The obsessive watcher moved to the next pilot. The agent kept running. The accountability structure evaporated.

Scaling an agent isn't a model problem. It's an organizational design problem. The question isn't whether the agent can handle the volume. It's whether the accountability structure around it can.

Bain's research on decision effectiveness shows that the combination of decision speed and quality is the single strongest predictor of financial performance across industries. A bad agent decision made at high speed and scale is a faster way to produce a bad outcome than a bad human decision made more slowly. Speed is only an advantage when the decision quality is there. And decision quality, for an agent, depends entirely on whether someone designed the right boundaries, monitors against them and owns the consequence when they're breached.

What to do with an agent that's already in production

If you have agents running and the accountability structure is unclear, start with the decision audit. For each agent, write down what decisions it's making today, not what the spec says it should make. Observe it. The gap between the spec and the reality is where the risk lives.
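If the agent's actions are logged with some decision label, the first pass of that audit can be a simple comparison. This sketch assumes such a log exists; the decision labels and the `audit_decisions` function are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def audit_decisions(
    spec_allowed: Set[str], observed_log: Iterable[str]
) -> Tuple[Dict[str, int], List[str]]:
    """Compare what the spec allows with what the agent was observed doing.

    Returns (decisions made outside the spec with counts,
             decisions in the spec that were never observed).
    """
    counts = Counter(observed_log)
    outside_spec = {d: n for d, n in counts.items() if d not in spec_allowed}
    never_observed = sorted(spec_allowed - set(counts))
    return outside_spec, never_observed


# Hypothetical spec and log.
spec = {"answer_protocol_question", "reschedule_appointment", "send_post_care_guidance"}
log = [
    "answer_protocol_question",
    "classify_symptom",
    "classify_symptom",
    "send_post_care_guidance",
]

outside, unused = audit_decisions(spec, log)
print(outside)  # {'classify_symptom': 2}
print(unused)   # ['reschedule_appointment']
```

The first result is the risk list: decisions the agent is making that nobody signed off on. The second is a hint that the spec describes an agent that no longer exists.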

Then name an owner. One person, not a function. That person's job is not to monitor the dashboard. It's to be the one who gets the call when something goes wrong, who has the authority to change the agent's behavior and who reports out on the agent's performance the same way a product manager reports on a product's metrics.

Then write the decision boundary down. Not to constrain the agent. To protect the deployment. An agent with a clear, documented boundary is defensible when something goes wrong. An agent without one is not.

The companies getting ROI from agents aren't the ones with the most sophisticated models. They're the ones that treated agent deployment the way they'd treat hiring a person into a critical role: with a clear job description, a reporting structure and a way to know when the job isn't being done right.

The agent is only as trustworthy as the accountability structure around it.

If you're shipping agents into production and the accountability picture isn't clear yet, get in touch.


This is the fourth in a weekly series on what I see in the market and hear from operators across the companies I've worked with. Next week: why most sprint reviews are a ceremony without a consequence, and what a real accountability loop looks like.