What an AI 'agent' actually is, and where it breaks.

The word is on every product page, and most of those pages mean something different by it. Here is the distinction that actually matters, in plain English, along with the failure modes the brochures leave out.

AI for Operations · 7 min read · June 2026

The short version

What makes something an agent?

An agent is given a goal rather than a single question, and it is allowed to take actions in the world, check the result, and decide what to do next, in a loop, until the goal is met or it gives up.

How is that different from a chatbot?

A chatbot answers and stops. An agent acts, observes what happened, and acts again. The difference is not the model; it is the loop and the permission to do things.

Where does it break?

In the loop. Small errors compound across steps, and an agent with no reliable way to check reality will pursue a confident wrong plan to the end.

In full

"Agent" has become the word a vendor reaches for when "chatbot" sounds too modest. It is now attached to almost anything that uses a language model, which makes it nearly useless as a description. That is a shame, because underneath the marketing there is a real and useful distinction, and understanding it tells you exactly when an agent is the right tool and when it is a liability dressed up as innovation.

The actual definition

Strip away the noise and an agent is defined by two things. First, it is given a goal rather than a single instruction: "reconcile this account" rather than "what is 2 plus 2". Second, it is allowed to act, observe the result of that action, and choose its next step on the basis of what it saw. It runs in a loop, deciding as it goes, until it judges the goal met or runs out of room to try.

A chatbot answers a question. An agent is handed an objective and a set of tools, and left to work out the steps itself.

That loop is the whole idea. A plain language model takes text in and gives text out, once. Wrap it in a loop, give it a way to take real actions, and let it use the result of one step to decide the next, and you have an agent. Everything else, the personas, the names, the diagrams, is decoration on those two properties: a goal, and a loop with the power to act.
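To make the loop concrete, here is a minimal sketch in Python. Everything in it is a hypothetical stand-in: `run_agent`, `decide`, `act` and `goal_met` are names invented for illustration, and the "model" is a toy function rather than a real language model. It only shows the shape described above: choose an action, take it, observe the result, and repeat until the goal is met or the step budget runs out.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (action, observation) pairs


def run_agent(
    goal: str,
    decide: Callable[[str, History], str],     # stand-in for the model choosing a step
    act: Callable[[str], str],                 # stand-in for taking a real action
    goal_met: Callable[[str, History], bool],  # stand-in for checking the goal
    max_steps: int = 10,                       # the "room to try"
) -> Tuple[bool, History]:
    history: History = []
    for _ in range(max_steps):
        if goal_met(goal, history):
            return True, history
        action = decide(goal, history)         # decide using what was seen so far
        observation = act(action)              # act, then observe the result
        history.append((action, observation))
    return goal_met(goal, history), history


# Toy stand-ins so the sketch runs end to end (all hypothetical).
counter = {"value": 0}


def toy_decide(goal: str, history: History) -> str:
    return "increment"


def toy_act(action: str) -> str:
    if action == "increment":
        counter["value"] += 1
    return f"counter is now {counter['value']}"


def toy_goal_met(goal: str, history: History) -> bool:
    return counter["value"] >= 3


done, steps = run_agent("raise the counter to 3", toy_decide, toy_act, toy_goal_met)
print(done, len(steps))  # True 3
```

Note that the step budget (`max_steps`) is the only thing that stops this loop if the goal check never passes, which is why the quality of that check matters so much.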

The thing that makes it powerful: tools

An agent is only as capable as the actions it can take, and those actions are usually called tools. A tool is anything the model is allowed to invoke: search a database, send an email, call an API, run a query, book a slot. On its own a language model can only produce words. Give it tools and it can do things, and the words become a means of deciding which thing to do next. An open standard, the Model Context Protocol (MCP), has emerged to make this plumbing consistent, so that tools can be offered to a model in a predictable way rather than wired up by hand each time.

This is also where the risk enters. A model that can only talk can only be wrong on the page. A model that can act can be wrong in your database, in a customer's inbox, or in a payment system. The capability and the danger are the same property, viewed from two sides.
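A small sketch of what "a tool is anything the model is allowed to invoke" can look like in code. This is not the Model Context Protocol and not any real library's API; `ToolRegistry`, `lookup_balance` and the account data are made up for illustration. The point is that the registry is both the capability and the limit: a registered tool can be called, and anything else is refused.

```python
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical allowlist of actions an agent may take."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn

    def invoke(self, name: str, **kwargs: str) -> str:
        # Anything not explicitly registered cannot be called at all.
        if name not in self._tools:
            raise PermissionError(f"tool not permitted: {name}")
        return self._tools[name](**kwargs)


def lookup_balance(account_id: str) -> str:
    # Fake read-only data standing in for a real system.
    fake_ledger = {"ACC-1": "120.00"}
    return fake_ledger.get(account_id, "unknown account")


registry = ToolRegistry()
registry.register("lookup_balance", lookup_balance)

print(registry.invoke("lookup_balance", account_id="ACC-1"))  # 120.00

try:
    registry.invoke("send_payment", amount="120.00")
except PermissionError as exc:
    print(exc)  # tool not permitted: send_payment
```

Here the agent can read a balance but has no way to move money, however confident it is that it should.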

Where agents break

The failure modes are specific, and they all live in the loop.

Errors compound. If each step is right ninety-five times in a hundred, and the steps fail independently of one another, a ten-step chain is right only about three-fifths of the time. Reliability does not add across steps; it multiplies, and it multiplies downward.
No ground truth, no brakes. An agent that cannot check its work against something real will pursue a confident wrong plan all the way to the end, because nothing in the loop tells it to stop.
Goals drift. Over a long run the agent can quietly redefine the task into something easier, and report success against the version it invented rather than the one you set.
The blast radius is the tool set. Whatever an agent is allowed to touch, it can touch when it is wrong. Permissions are not a detail; they are the size of the worst case.
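The compounding arithmetic in the first of these failure modes can be checked in a few lines. This is a simplified model that assumes every step has the same success rate and that steps fail independently; real chains may do better or worse. The helper name `chain_success` is invented for this example.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Chance that every step succeeds, assuming independent steps."""
    return per_step ** steps


for steps in (1, 5, 10, 20):
    print(steps, round(chain_success(0.95, steps), 3))
# 1 0.95
# 5 0.774
# 10 0.599
# 20 0.358
```

Under the same assumptions, doubling the chain from ten steps to twenty takes the success rate from roughly three in five to roughly one in three.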

So when is an agent the right call?

An agent earns its place when a task genuinely needs several steps, when the right next step depends on what the last one returned, and when there is a reliable way to check the result. Reconciling records against a source of truth, triaging a queue against clear rules, gathering and cross-checking information: these suit the loop, because each step can be verified before the next is taken. A task that is really one step does not need an agent; it needs a single well-formed request, and calling it an agent adds cost and failure surface for nothing.

The honest test is unglamorous. If you cannot describe how the agent checks its own work at each step, you do not have an agent you can trust in production. You have a confident loop with the keys to something that matters, and that is precisely the configuration that goes wrong quietly.
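One way to picture "checks its own work at each step" is a loop that verifies every result against a source of truth before moving on, and halts at the first mismatch. The sketch below is illustrative only: `run_with_checks`, the invoice IDs and the amounts are all hypothetical, and `toy_propose` stands in for whatever the agent would actually produce.

```python
from typing import Callable, Dict, List, Optional, Tuple


def run_with_checks(
    items: List[str],
    propose: Callable[[str], str],       # stand-in for the agent's step
    verify: Callable[[str, str], bool],  # check against something real
) -> Tuple[Dict[str, str], Optional[str]]:
    accepted: Dict[str, str] = {}
    for item in items:
        result = propose(item)
        if not verify(item, result):
            # Stop here instead of carrying a wrong result into later steps.
            return accepted, item
        accepted[item] = result
    return accepted, None


# Hypothetical source of truth for the toy run.
SOURCE_OF_TRUTH = {"INV-1": "100.00", "INV-2": "250.00", "INV-3": "75.00"}


def toy_propose(invoice_id: str) -> str:
    # The second value is deliberately wrong, to show the halt.
    guesses = {"INV-1": "100.00", "INV-2": "205.00", "INV-3": "75.00"}
    return guesses[invoice_id]


def toy_verify(invoice_id: str, amount: str) -> bool:
    return SOURCE_OF_TRUTH.get(invoice_id) == amount


accepted, halted_at = run_with_checks(
    ["INV-1", "INV-2", "INV-3"], toy_propose, toy_verify
)
print(accepted, halted_at)  # {'INV-1': '100.00'} INV-2
```

The run stops at the second invoice and never touches the third. If you cannot write the equivalent of `toy_verify` for your own task, that is the warning sign the test above describes.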

The takeaway

An agent is not a clever chatbot. It is a model given a goal, a set of tools, and permission to act, observe, and decide again in a loop. That loop is what makes it useful for multi-step work, and it is also where it breaks: errors compound, goals drift, and an agent with no way to check reality will finish a wrong plan with full confidence. Use one where the steps genuinely depend on each other and each can be verified. Everywhere else, the word is just marketing.

The Fourths · Engineering for regulated industries
