# An Agent Is A Loop

> **Author**: Tuomo Penttinen
> **Status**: Published
> **Published**: 2026-05-10
> **Venue**: hexetiq.com/articles
> **Canonical URL**: https://www.hexetiq.com/articles/how-agent-loops-work/
> **OG Image**: /og-image.png
> **Tags**: ai-first, agentic-coding, agent-mechanics

An agent becomes understandable once you stop looking for a special kind of model and look at the loop: session, model response, tool call, tool result, next turn.

The loop gives the agent a concrete shape. The model reads the current session and returns one new message. The harness runs any requested tool calls and appends the results. Then the model is called again with a session that now contains more evidence than before.

The harness is the app driving the loop, like Claude Code or Codex CLI. It defines the tools the model can call, and the rules they run under.

Planning, subagents, memory, approvals, MCP servers, editor integrations, and terminal commands all elaborate on the same basic loop.

**An agent is not a different kind of model. It is a model invocation inside an execution loop.**

## The session is the state

A model call is not a persistent process that remembers the last turn by itself. Each call receives input and returns output. The state lives in the **session** the harness sends back on every turn.

That session carries the working set: system rules, project instructions, the user's task, prior messages, tool definitions, tool calls, and tool results. If a piece of information is not in that working set, the model does not have it for the next decision.
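
As a rough sketch, that working set might look like this in harness code. The field names are illustrative, not any specific SDK's:

```python
# Illustrative sketch: the field names are made up, not any specific SDK's.
session = [
    {"role": "system", "content": "You are a coding agent. Follow the repo rules."},
    {"role": "user", "content": "Fix the failing test in tests/test_parser.py."},
    # ...assistant messages, tool calls, and tool results accumulate here...
]
# On every turn, the model sees exactly this list and nothing else.
```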

This is why context management matters in agent work. A stale project instruction can steer the whole run in the wrong direction. A missing error message can make the model retry a fix blindly. A long terminal log can bury the one line that mattered.

The session is not storage. It is the active working memory of the loop.

## Messages move the work

An agent session is an ordered message log. Four roles do most of the work.

- The `system` message carries product rules, policy, protocol instructions, and loaded project guidance.
- The `user` message carries the task, corrections, approvals, and mid-flight steering.
- The `assistant` message is produced by the model. It may contain text for the developer and structured requests to use tools.
- The `tool` message is produced by the harness after executing one of those requests.

Every turn appends one new assistant message. If that assistant message asks for tools, the harness appends tool result messages. Then the whole log is sent back to the model.

**The model writes messages. The harness owns the log.**

That distinction is not cosmetic. If the log is assembled badly, the model sees the wrong task. If tool results are not linked back to the right tool call, the model cannot reliably connect request and evidence. If project instructions are not loaded consistently, the agent behaves like a different worker in every repository.
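
A minimal sketch of that ownership, assuming a hypothetical `execute` function and an invented message shape:

```python
def run_tool_calls(messages, tool_calls, execute):
    """Append one tool result per requested call, linked by the call's id.

    `execute` is a stand-in for whatever actually runs the tool; the
    message shape is illustrative, not any specific API's.
    """
    for call in tool_calls:
        result = execute(call)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],  # the link the model relies on
            "content": result,
        })
```

If that `tool_call_id` link is dropped, the model is left matching requests to evidence by position and guesswork.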

## Tools are descriptions

A tool is a name, a description, and an input shape.

For a file-reading tool, the model might see something like this:

```json
{
  "name": "read_file",
  "description": "Read a file from the project. Returns full contents.",
  "input_schema": {
    "type": "object",
    "properties": {
      "path": { "type": "string", "description": "Absolute path." }
    },
    "required": ["path"]
  }
}
```

The model does not see the implementation. It sees the interface. It chooses the tool by name and emits inputs that match the schema. The harness owns what happens after that.
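
For the `read_file` definition above, the model's side of the exchange might look like this. The shape and the id scheme are illustrative; real APIs differ in field names:

```python
# What the model emits: a tool name plus inputs that match the schema.
tool_call = {
    "id": "call_42",                          # assigned so results can link back
    "name": "read_file",
    "input": {"path": "/repo/src/parser.py"},
}
# The harness validates the input, runs its own implementation,
# and appends the output as a tool message carrying the same id.
```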

This is why tool descriptions deserve engineering attention. If the description is vague, the model has vague instructions. If the schema accepts impossible combinations, the model can request impossible actions. If two tools overlap without a clear distinction, the model has to guess which one is meant for the situation.

The tool description and schema are the interface. Treat them like an API.
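
For instance, here is the difference between a description the model can act on and one it has to guess at. Both are invented for illustration:

```python
# Vague: forces the model to guess scope, limits, and when to use it.
bad = {"name": "search", "description": "Search for things."}

# Precise: states scope, limits, and when to prefer another tool.
good = {
    "name": "search_code",
    "description": (
        "Full-text search over tracked files in the current repository. "
        "Returns at most 50 matching lines. To read a whole file, "
        "use read_file instead."
    ),
}
```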

## The loop is small

Once the session and tools exist, the loop is simple.

```python
messages = [system, user_task]

while True:
    # the model sees the whole session on every call
    response = model.invoke(messages, tools=tool_defs)
    messages.append(response)

    if response.stop_reason == "end_turn":
        break  # no tool calls accompany end_turn

    # run each requested tool and feed the evidence back into the log
    for call in response.tool_calls:
        result = harness.execute(call)
        messages.append(tool_result(call.id, result))
```

That sketch hides product detail, not the core idea. Real harnesses stream tokens, preserve transcripts, batch tool calls, ask for approval, enforce sandbox rules, compact context, dispatch subagents, and recover from failures. The shape still holds.

The model receives the current session. It returns the next assistant message. If that message asks for tools, the harness executes them and appends results. The next model call sees the results and chooses what to do next.

The agent loop reduces to three operations:

1. Bundle the session.
2. Ask the model what should happen next.
3. Run the requested tool and add the result back.

The third feeds the first. That feedback is the difference between a chat answer and delegated work.

## The harness is the boundary

The model does not touch your filesystem. It does not run shell commands. It does not call your internal APIs. It emits text and structured tool calls.

The **harness** decides what those tool calls can reach. It reads files, applies edits, runs commands, calls APIs, enforces approval rules, limits the sandbox, records the transcript, and feeds results back into the session.

This is the trust boundary. If the harness does not expose a delete tool, the agent cannot delete through that interface. If the harness requires approval before shell commands, the agent can request a command, but it cannot run it silently. If the harness clips terminal output, the model sees the clipped result, not the full log.
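
A toy version of that boundary makes the rules explicit. Everything here is hypothetical: the tool names, the limits, and the approval hook.

```python
ALLOWED_TOOLS = {"read_file", "run_tests"}   # no delete verb exposed at all
MAX_OUTPUT_CHARS = 20_000                    # clip noisy terminal logs

def run_tool(call):
    # Stand-in for the real implementations (file reads, shell, APIs).
    return f"ran {call['name']} with {call['input']}"

def execute(call, ask_approval):
    # A verb the harness never exposed simply does not exist here.
    if call["name"] not in ALLOWED_TOOLS:
        return "error: unknown tool"
    # Risky verbs wait for a human decision before anything runs.
    if call["name"] == "run_tests" and not ask_approval(call):
        return "denied: user rejected the command"
    output = run_tool(call)
    return output[:MAX_OUTPUT_CHARS]         # the model sees only this
```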

This also changes the question teams should ask when evaluating an agent. "How smart is the model?" is not enough. The operational question is: what verbs has the harness exposed, under which rules, with which evidence returned to the next turn?

Tools are what the agent can do. The harness decides what the agent does.

## Stop conditions are part of the design

A clean demo ends with `end_turn`: the model has no more tool calls and reports the task done. Real work needs more exits than that.

The harness may stop because the session no longer fits in the context window. It may stop because the run hit a maximum number of tool calls, a wall-clock limit, or a cost budget. The user may interrupt. The harness may detect a loop when the same failing action repeats without new evidence. A tool may return a fatal state, such as missing credentials or a read-only filesystem.
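
Sketched as a guard around the same loop, with illustrative limits and an invented loop-detection rule:

```python
import time

MAX_TOOL_CALLS = 50      # cost budget, expressed as a call count
MAX_SECONDS = 600        # wall-clock limit

def should_stop(response, calls_made, started, recent_results):
    """Return a reason to exit the loop, or None to keep going."""
    if response.stop_reason == "end_turn":
        return "done"
    if calls_made >= MAX_TOOL_CALLS:
        return "hit the tool-call budget"
    if time.monotonic() - started > MAX_SECONDS:
        return "hit the wall-clock limit"
    # Crude loop detection: three identical results in a row means the
    # agent is retrying the same action without new evidence.
    if len(recent_results) >= 3 and len(set(recent_results[-3:])) == 1:
        return "repeating a failing action"
    return None
```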

Those exits are not implementation details. They are part of what makes the loop safe to delegate into.

Without a stop condition, an agent can retry a failing test until someone kills the process. Without a token or output budget, one noisy command can consume the active context. Without a human interrupt, the developer cannot steer when the task has drifted.

Termination is not failure handling bolted on later. It is a harness responsibility, not a model property.

## What this changes

The loop makes agents less mysterious and more inspectable.

You can read the run end to end: the session contents, the tool definitions, the assistant messages and tool results, the approval points, and the stop condition that fired.

The mechanical model gives teams something to inspect before they argue about tool brands.

If a tool cannot show the message log, it is harder to debug. If it cannot explain its approval model, it is harder to trust. If it cannot preserve enough context, it will lose the task. If it cannot stop cleanly, it should not be allowed near risky work.

The [first article in this series](https://www.hexetiq.com/articles/five-eras-of-ai-assisted-coding/) framed the shift from suggestions to delegated work.

Once the loop is visible, the next question is the session itself: the agent can only work with the material the session carries.

## References

- OpenAI, [Function calling](https://developers.openai.com/api/docs/guides/function-calling)
- Anthropic, [Tool use with Claude](https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview)
- OpenAI, [Codex CLI](https://developers.openai.com/codex/cli)
- OpenAI, [Codex best practices](https://developers.openai.com/codex/learn/best-practices)
- Anthropic, [How Claude remembers your project](https://code.claude.com/docs/en/memory)
