Exception Handling and Human-in-the-Loop: Making AI Agents Resilient

Q: What is exception handling in an AI agent?

Exception handling in an AI agent is the set of mechanisms that let it detect, contain, and recover from failures without derailing the whole task. It covers environmental failures, like a tool or API timing out, which are handled with retries, fallbacks, and circuit breakers, and the agent's own mistakes, like a bad plan or invalid output, which are caught with output validation and self-critique. The goal is for the agent to degrade gracefully instead of failing silently or acting on a mistake.

Q: What is human-in-the-loop (HITL) in AI agents?

Human-in-the-loop is a pattern where an AI agent routes certain decisions to a person before acting, rather than running fully autonomously. A human belongs in the loop in three situations: when the cost of an error is high or irreversible, when the model's confidence in its answer is low, and when the user's intent is ambiguous. The most common implementation is an approval gate, where the agent pauses before a high-risk action and resumes only after a human confirms it.

Q: Where should you put a human in the loop for an AI agent?

Place the human exactly where the checkpoint earns its cost, not in every step. The three high-value spots are before irreversible or expensive actions such as sending money or deleting data, on cases where the agent reports low confidence so it escalates instead of guessing, and on ambiguous requests where a clarifying question beats a confident wrong answer. Gating low-stakes, reversible steps the agent handles reliably just adds friction without reducing risk.

Part of Agentic Design Patterns: The Complete Guide to Building Intelligent AI Systems

Based on Agentic Design Patterns by Antonio Gulli (Springer). All book royalties go to Save the Children.

Space & Story Team · June 15, 2026 · 10 min read

Key Takeaway

Resilient AI agents need two things. Exception handling recovers from failures on its own, through retries with backoff, fallbacks, and circuit breakers for tool and API errors. Human-in-the-loop checkpoints bring a person in exactly where the cost of error is high, model confidence is low, or intent is ambiguous.

Why This Matters for Enterprise AI

An agent that works in the demo and falls over in production has one thing in common with every other piece of software ever shipped: the happy path was the easy part. The hard part is what happens when a tool call times out, an API returns a 500, the model misreads its own output, or the request is ambiguous enough that any answer is a guess. A demo agent assumes none of that happens. A production agent assumes all of it will.

Exception handling and human-in-the-loop are the two patterns that turn a brittle agent into a resilient one. Exception handling is how the agent recovers from failure on its own. Human-in-the-loop, usually shortened to HITL, is how it knows when not to try at all and to hand the decision to a person.

Get both right and the agent degrades gracefully instead of failing silently or, worse, acting confidently on a mistake. The two patterns are the operational backbone underneath the boundaries you set with AI guardrails and safety. Guardrails decide what the agent is allowed to do; exception handling decides what happens when doing it goes wrong.

What Is Exception Handling in Agentic Systems?

Exception handling in an agentic system is the set of mechanisms that let an agent detect, contain, and recover from failures without derailing the whole task. Antonio Gulli, in Agentic Design Patterns, treats it as a first-class reliability concern rather than an afterthought, because agents fail in more ways than ordinary software does. They depend on external tools that go down, on models that occasionally hallucinate, and on inputs that arrive in shapes nobody anticipated.

Figure: An agent's task flow meets a broken step and reroutes through a retry loop and a fallback path, with one branch pausing at a human approval checkpoint. A resilient agent has two responses to trouble: recover on its own through retries and fallbacks, or stop at a checkpoint and hand the decision to a human.

There are two broad failure classes, and they need different defenses. The first is environmental: a tool or API the agent calls fails, times out, or returns garbage. The agent did nothing wrong; the world did. The second is internal: the agent itself produces a bad plan, calls the wrong tool, or generates output that does not match what it intended. The first class you handle with retries, fallbacks, and circuit breakers. The second you handle by giving the agent a way to check its own work, which is where self-correction and reflection come in.

Handling Tool and API Failures

External calls are where agents break most often, because they are the part of the system you do not control. Three patterns, borrowed straight from distributed-systems engineering, do most of the work.

Retries with exponential backoff. When a call fails on something transient, such as a rate limit, a brief network blip, or a momentary 503, the simplest fix is to try again. The trick is not to hammer the failing service. Each retry waits longer than the last, often with a little randomness added so a fleet of agents does not retry in lockstep. A typical schedule waits one second, then two, then four, then gives up. The randomness, called jitter, is what stops a thundering herd from taking down a service that was about to recover.

Fallbacks. Some failures will not clear no matter how many times you retry. A fallback gives the agent a second option: a cheaper model when the primary is overloaded, a cached result when the live API is down, a default answer when a non-critical enrichment call fails. A fallback keeps the task moving in a degraded but useful state rather than letting it collapse because one dependency was unavailable.

Circuit breakers. If a service has failed the last twenty calls in a row, the twenty-first is not going to land either, and retrying it just wastes time and budget. A circuit breaker watches the failure rate. Once it crosses a threshold, the breaker "opens" and fails fast for a cooling-off period instead of letting every call hang for the full timeout. After a while it lets one test call through; if that succeeds, it closes again. This is the difference between one slow dependency and a cascading outage that takes the whole agent down with it.
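
The breaker described above can be sketched in a few lines. This is a minimal, single-threaded illustration, not a library API: the names CircuitBreaker and CircuitOpenError, the default threshold of five, and the thirty-second cooldown are all assumptions made for the example, and it counts consecutive failures as a simple stand-in for a failure rate.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped (hypothetical name)."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed default
        self.cooldown_seconds = cooldown_seconds    # assumed default
        self.clock = clock                          # injectable so tests need not sleep
        self.consecutive_failures = 0
        self.opened_at = None                       # None means the breaker is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_seconds:
                # Still cooling off: fail fast instead of waiting on a timeout.
                raise CircuitOpenError("circuit open, failing fast")
            # Cooldown elapsed: fall through and let one test call run.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.consecutive_failures += 1
            test_call_failed = self.opened_at is not None
            if test_call_failed or self.consecutive_failures >= self.failure_threshold:
                self.opened_at = self.clock()       # open, or re-open after a failed test call
            raise
        # Any success closes the breaker and resets the count.
        self.consecutive_failures = 0
        self.opened_at = None
        return result
```

A caller would typically catch CircuitOpenError and go straight to its fallback, so the open breaker costs microseconds rather than a full timeout. A version shared across threads or processes would need locking or shared state, which this sketch leaves out.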

Code Example (Abbreviated)

Here is the shape of a tool call that retries a transient failure with exponential backoff, then falls back to a cheaper model when retries are exhausted. The detail is trimmed, but the control flow is the point.

# Abbreviated: illustrative retry + fallback, not production code
import random
import time


class TransientError(Exception):
    """Placeholder for retryable failures such as timeouts or rate limits."""


def call_with_resilience(primary, fallback, args, max_retries=3):
    for attempt in range(max_retries):
        try:
            return primary(args)
        except TransientError:
            if attempt == max_retries - 1:
                break  # retries exhausted, drop to fallback
            backoff = (2 ** attempt) + random.uniform(0, 1)  # 1s, 2s, ... plus jitter
            time.sleep(backoff)
    # Primary failed every attempt: degrade, don't crash
    return fallback(args)

The same logic lives inside production frameworks. LangGraph lets you attach retry policies to nodes and route to a fallback branch when one fails, so the resilience is part of the graph rather than something you bolt on by hand.

Recovering From the Agent's Own Mistakes

Environmental failures are loud. The call throws, you catch it, you react. The agent's own mistakes are quieter and more dangerous, because a confidently wrong answer does not raise an exception. It just flows downstream looking like a correct one.

The defense is to make the agent inspect its own output before trusting it. A few patterns recur:

Output validation. Before the agent acts on a result, check it against a schema or a set of rules. If the model was asked for JSON and produced prose, that is a caught error, not a runtime crash three steps later.
Self-critique. Have the agent review its own draft against the original goal and flag gaps before continuing. This is the reflection pattern doing double duty as an error-detection layer.
Grounding checks. For anything fact-based, verify the claim against a retrieved source rather than the model's memory. An answer that cannot be grounded is a candidate for escalation, not for shipping.
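
Output validation is the easiest of these to show in code. The sketch below uses only the standard library; the function name validate_tool_output, the OutputValidationError exception, and the field-to-type mapping are assumptions made for illustration, and a real system might use a schema library instead.

```python
import json


class OutputValidationError(Exception):
    """The model's output did not match the expected shape (hypothetical name)."""


def validate_tool_output(raw_text, required_fields):
    """Parse model output as JSON and check required fields and their types.

    required_fields maps each field name to the Python type it must have.
    Returns the parsed dict, or raises OutputValidationError.
    """
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        # The model was asked for JSON and produced something else, e.g. prose.
        raise OutputValidationError(f"not valid JSON: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise OutputValidationError("expected a JSON object")
    for name, expected_type in required_fields.items():
        if name not in data:
            raise OutputValidationError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise OutputValidationError(f"field {name!r} has the wrong type")
    return data
```

Because the failure surfaces as an exception at the point of validation, it can feed the same retry, fallback, or escalation paths as an environmental failure instead of flowing downstream unnoticed.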

Enterprise reality: A coding agent that writes a database migration can run that migration against a throwaway copy first, read the result, and only propose the change to a human if it applied cleanly. If the migration fails, the agent has caught its own mistake before it touched production data. That loop (act in a sandbox, observe, then decide whether to proceed) is worth more than any amount of prompt tuning, because it turns "the model is usually right" into "the model's work is verified before it counts."
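
As a rough sketch of that sandbox loop, the example below assumes the database is SQLite so it can run with the standard library alone; the function name migration_applies_cleanly is hypothetical. The idea carries over to other databases, but the copy step would look different.

```python
import sqlite3


def migration_applies_cleanly(source_conn, migration_sql):
    """Apply a candidate migration to a throwaway in-memory copy.

    Returns (True, message) if it applied, or (False, error text) if it failed.
    The source database is only read, never modified.
    """
    scratch = sqlite3.connect(":memory:")
    try:
        source_conn.backup(scratch)           # copy schema and data into the sandbox
        scratch.executescript(migration_sql)  # act: run the candidate migration
        return True, "applied cleanly"        # observe: no error raised
    except sqlite3.Error as exc:
        return False, str(exc)                # observe: capture what went wrong
    finally:
        scratch.close()                       # discard the sandbox either way
```

An agent would only surface the migration for human approval when the first element is True, and would otherwise feed the error text back into its own revision step.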

Human-in-the-Loop: Knowing When to Stop

Some decisions an agent should not make alone, no matter how good the model is. Human-in-the-loop is the pattern of routing those decisions to a person before the agent acts.

The design question is not whether to involve a human but where. A human in every loop rebuilds a manual process with extra steps; a human nowhere leaves an unsupervised system making irreversible decisions. The art is placing the checkpoint exactly where it earns its cost.

Three signals tell you a human belongs in the loop:

High cost of error. When a mistake is expensive or irreversible, such as sending money, deleting records, emailing a customer, or merging code to production, the action crosses an approval gate. The agent does all the work up to the decision, then waits for a human to confirm. This is the most common and most important HITL pattern.
Low model confidence. When the agent's own confidence is low (an ambiguous classification, a retrieval that came back thin, a self-critique that flagged a gap), it should escalate rather than guess. Confidence-based escalation keeps the human's attention on the 5% of cases that need it instead of the 95% the agent handles fine.
Ambiguous intent. When the request itself is unclear, the right move is not a confident guess but a clarifying question. An agent that asks "did you mean the May invoice or the March one?" beats an agent that picks one and acts.
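
The three signals can be read as a small routing decision. The sketch below is one possible arrangement, not a prescribed one: the Route and Step names, the 0.85 threshold, and the order of the checks (clarify first, then gate, then escalate) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    PROCEED = "proceed"
    ASK_USER = "ask_clarifying_question"
    APPROVAL_GATE = "require_approval"
    ESCALATE = "escalate_to_human"


@dataclass
class Step:
    irreversible: bool          # would a mistake be expensive or impossible to undo?
    confidence: float           # 0.0 to 1.0, as reported by the agent
    intent_is_ambiguous: bool   # could the request reasonably mean two things?


def route_step(step, min_confidence=0.85):
    """Decide whether a step runs on its own or goes to a person."""
    if step.intent_is_ambiguous:
        return Route.ASK_USER        # a clarifying question beats a confident guess
    if step.irreversible:
        return Route.APPROVAL_GATE   # high cost of error: wait for confirmation
    if step.confidence < min_confidence:
        return Route.ESCALATE        # low confidence: hand off rather than guess
    return Route.PROCEED             # cheap, reversible, confident: no human needed
```

Checking ambiguity first reflects the idea that there is little point asking for approval of an action built on a misread request; other orderings are defensible.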

The pattern that ties these together is the approval gate: the agent pauses before a high-risk action, surfaces what it is about to do and why, and resumes only on a human's go-ahead. Done well, the human sees a tight summary of the proposed action and the agent's reasoning, and approves or rejects in one click. This is also where HITL connects to multi-agent systems: in a team of agents, the approval gate is often a dedicated supervisor or a human reviewer sitting at the one step where the stakes justify the latency.

An Approval Gate, Abbreviated

The mechanics are simpler than they sound. The agent assembles the risky action, classifies its risk, and only executes directly when the action is safe. Anything above the line waits for a human.

# Abbreviated: approval gate before an irreversible action.
# run() and request_human_approval() stand in for your own executor and review queue.
CONFIDENCE_THRESHOLD = 0.85  # illustrative cutoff, tune per deployment

def execute_action(action, agent_confidence):
    if action.is_reversible and agent_confidence > CONFIDENCE_THRESHOLD:
        return run(action)                  # low stakes: proceed
    # High cost of error or low confidence: pause for a human
    decision = request_human_approval(
        summary=action.describe(),
        reasoning=action.rationale,
    )
    return run(action) if decision.approved else action.cancel()

Note what the gate is not: it is not a blanket "ask before everything" rule. The reversible, high-confidence path runs untouched. The human is spent only where the cost of error or the lack of confidence justifies the interruption.

Human Feedback as a Recovery and Improvement Path

A human in the loop is not only a brake. The approval or correction a person gives is the highest-quality signal the system will ever get, and throwing it away is a waste. When a reviewer rejects an action or edits the agent's draft, that decision can do two jobs. In the moment, it is a recovery: the bad action never happens, and the corrected version proceeds. Over time, the log of those corrections becomes training data, evaluation cases, and few-shot examples that make the next version of the agent need the human less often.

This is the loop that closes the system. The agent acts, a human corrects the cases that need it, and those corrections feed back into how the agent behaves next time. Capturing them is a job for your observability layer, which is why exception handling and HITL sit right next to monitoring and evaluation: you cannot improve from human feedback you never recorded, and you cannot tell whether your retries and fallbacks are working without the metrics to see them fire.
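
Recording those corrections does not need much machinery to start with. The sketch below appends each human decision to a JSON Lines file and reads the corrected cases back as few-shot pairs; the ReviewRecord fields, the function names, and the file format are assumptions for illustration, and a production system would more likely write to its observability store.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ReviewRecord:
    task: str            # what the agent was asked to do
    proposed: str        # what the agent wanted to do
    approved: bool       # the reviewer's decision
    corrected: str = ""  # the reviewer's edited version, if any


def append_review(path, record):
    """Append one human decision to a JSON Lines log."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")


def few_shot_examples(path, limit=5):
    """Return the most recent corrections as (task, corrected answer) pairs."""
    with open(path, encoding="utf-8") as f:
        records = [json.loads(line) for line in f if line.strip()]
    corrected = [r for r in records if r["corrected"]]
    return [(r["task"], r["corrected"]) for r in corrected[-limit:]]
```

The same log can double as an evaluation set: replay each task against a new agent version and count how often its proposal now matches what the reviewer approved or corrected.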

When to Reach for Each Pattern

These patterns are not free. Retries add latency, circuit breakers add state, and every approval gate trades autonomy for safety. Spend them where they pay off.

Reach for retries and backoff on any external call that can fail transiently, which is nearly all of them. This one is close to mandatory.
Add fallbacks when a degraded answer beats no answer, and skip them when a wrong answer is worse than an error the caller can see.
Wrap a circuit breaker around any dependency an agent calls often enough that a cascading failure is a real risk, and skip the overhead for a one-off call.
Gate with approval checkpoints before irreversible, high-cost, or compliance-sensitive actions, and resist the urge to gate low-stakes steps the agent handles reliably on its own.
Lean on confidence-based escalation when you can get an honest confidence signal out of the agent, and fall back to output validation when you cannot.

The honest test for any of these is whether the failure it guards against is one you can afford to absorb. If a step is cheap, reversible, and reliable, wrapping it in three layers of protection just adds latency and code. If it is expensive, irreversible, or flaky, the protection is the difference between a resilient agent and an incident report.

Key Takeaways

Exception handling lets an agent recover from failure on its own, while human-in-the-loop lets it recognize when it should not try, and a resilient agent needs both.
Handle external tool and API failures with retries and exponential backoff, fallbacks to a degraded-but-useful path, and circuit breakers that fail fast instead of cascading.
Catch the agent's own mistakes with output validation, self-critique, and grounding checks, so a confidently wrong answer gets stopped before it flows downstream.
Put a human in the loop where the cost of error is high, the model's confidence is low, or the intent is ambiguous, and nowhere else. The approval gate is the workhorse pattern.
Human corrections are both an immediate recovery and a long-term improvement signal. Record them, learn from them, and let the agent need the human less over time.
