Explainer

What is an AI Agent?

No magic, no mystery. Just a loop, some tools, and a language model doing its thing.

Sources: Anthropic · Ethan Mollick · Simon Willison · Thorsten Ball · Andrew Ng · Fly.io · Geoffrey Huntley · hoeem

The Big Picture

How we interact with AI has changed

→

2022-2024

🧮

Calculator

You push buttons (prompts), it gives answers

→

2024-2025

🧑‍💼

Intern

You give it a task, it asks clarifying questions

2025+

🏢

Manager

You delegate a job, it figures out the steps and reports back

"We are at the point where we need to think of AI as something we manage, not just something we use." - Ethan Mollick, Wharton

The Simple Truth

An agent is just three things

LLM

+

Loop

+

Tools

Or as Ethan Mollick puts it: "an AI that is given a goal and can pursue that goal autonomously." A working agent can be built in less than 400 lines of code.

"Agents are typically just LLMs using tools based on environmental feedback in a loop." - Anthropic, Building Effective Agents

How It Works

The agent loop

The loop keeps running until the model decides it has nothing left to do. That's the whole trick.

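def run_agent(messages, tools):
  while True:
    response = call_llm(messages, tools)
    messages.append(response)     # keep the model's own turn in the history

    if response.is_final:
      return response             # done!

    for tool_call in response.tool_calls:
      result = execute(tool_call)
      messages.append(result)     # feed the result back, then loop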
Key Insight

The model doesn't actually call anything

This is where most people get confused. The model can't run code. It just outputs text describing what it wants to do. The runtime does the rest.

1

Model outputs a request

"I'd like to call get_weather with city: Berkeley"

2

Runtime executes it for real

Your code makes the actual API call, reads the file, runs the query

3

Result goes back to the model

The model reasons over it and decides: done, or call another tool?

Think of it like a "wink." You tell the model: "wink if you want me to raise my arm." When it winks (requests a tool), your code does the actual arm-raising. - Thorsten Ball
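The three steps above can be sketched in a few lines. This assumes the model's request arrives as a JSON string with `name` and `input` fields; that shape, the `TOOL_REGISTRY` name, and the canned `get_weather` result are all illustrative.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather API."""
    return f"Sunny in {city}"


# The runtime's lookup table: tool name -> the function that really runs.
TOOL_REGISTRY = {"get_weather": get_weather}


def handle_model_output(raw: str) -> str:
    """Parse the model's request and run the matching function."""
    request = json.loads(raw)
    name = request["name"]
    if name not in TOOL_REGISTRY:
        # Report the problem back to the model instead of crashing.
        return f"error: unknown tool {name!r}"
    return TOOL_REGISTRY[name](**request["input"])


# The model only produced this text; the code above does the actual work.
model_output = '{"name": "get_weather", "input": {"city": "Berkeley"}}'
print(handle_model_output(model_output))
```

Returning an error string for an unknown tool gives the model a chance to correct itself on the next turn.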

Building Blocks

Tools are just descriptions

You don't program when to use a tool. You describe what tools exist, and the model decides:

{
  "name": "get_weather",
  "description": "Get current weather for a city",
  "input_schema": {
    "type": "object",
    "properties": {
      "city": { "type": "string" },
      "units": { "type": "string" }
    }
  }
}
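Because a tool description is just data, the runtime can also use it to sanity-check what the model asks for before running anything. A rough sketch, with the schema written out in full JSON Schema form; the `required` list and the `validate_args` helper are assumptions added for illustration.

```python
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "units": {"type": "string"},
        },
        "required": ["city"],  # assumption: only the city is mandatory
    },
}

# Covers only the schema types this sketch needs.
PYTHON_TYPES = {"string": str, "boolean": bool}


def validate_args(tool: dict, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the call looks valid."""
    schema = tool["input_schema"]
    problems = [f"missing required field: {key}"
                for key in schema.get("required", []) if key not in args]
    for key, value in args.items():
        spec = schema["properties"].get(key)
        if spec is None:
            problems.append(f"unexpected field: {key}")
        elif not isinstance(value, PYTHON_TYPES[spec["type"]]):
            problems.append(f"{key} should be of type {spec['type']}")
    return problems


print(validate_args(WEATHER_TOOL, {"city": "Berkeley"}))
print(validate_args(WEATHER_TOOL, {"units": 5}))
```

A production system would more likely hand this job to a full JSON Schema validator; the point is only that the description doubles as a contract.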

Coding agents like Claude Code, Cursor, and Copilot rest on roughly the same five core tools: read files, list directories, run Bash commands, edit files, and search code.

"Think about how much effort goes into human-computer interfaces (HCI), and plan to invest just as much effort in creating good agent-computer interfaces (ACI)." - Anthropic

Anthropic's Framework

Workflows vs. Agents

📋 Workflows

Predefined code paths
LLM follows a set sequence
Predictable, repeatable
E.g. "Summarize this, then email it"
Best for: routine, well-defined tasks

🤖 Agents

Dynamic decision-making
LLM chooses its own path
Adaptive, self-correcting
E.g. "Figure out why sales dropped"
Best for: open-ended, complex tasks
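The difference is easiest to see side by side. In this sketch every function is a stub: `summarize`, `send_email`, and `scripted_model` stand in for real LLM calls and services, and the tool outputs are made-up strings.

```python
def summarize(text: str) -> str:
    """Stub for an LLM summarization call (hypothetical)."""
    return text.split(".")[0] + "."


def send_email(body: str) -> str:
    """Stub for an email-sending step (hypothetical)."""
    return f"sent: {body}"


def workflow(report: str) -> str:
    # Workflow: the code fixes the order of steps; the LLM fills in one of them.
    return send_email(summarize(report))


def agent(goal: str, choose_next_step, tools: dict, max_steps: int = 5) -> list[str]:
    # Agent: the model (here `choose_next_step`) picks each step from what it has seen.
    observations = []
    for _ in range(max_steps):
        step = choose_next_step(goal, observations)
        if step is None:  # the model decides it is done
            break
        observations.append(tools[step]())
    return observations


def scripted_model(goal, observations):
    """Stand-in policy: look at sales first, then at pricing, then stop."""
    plan = ["query_sales", "query_pricing"]
    return plan[len(observations)] if len(observations) < len(plan) else None


tools = {  # made-up tool outputs, for illustration only
    "query_sales": lambda: "sales fell 12% in March",
    "query_pricing": lambda: "price rose 20% in March",
}
print(workflow("Sales dropped. Details follow."))
print(agent("Figure out why sales dropped", scripted_model, tools))
```

In the workflow, changing the sequence means editing code. In the agent, the sequence is whatever the model chooses at run time.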

"The most successful implementations weren't using complex frameworks. They were building with simple, composable patterns." - Anthropic

The Hard Part

Context is everything

LLMs have zero memory between calls. They're stateless. Every request replays the full conversation from scratch.
Context windows have hard limits: up to ~1M tokens for recent Claude, Gemini, and GPT-4.1 models, ~128k for GPT-4o. That's the most the model can "see" at once.
More context doesn't mean better. Stuffing the window actually hurts performance. Less is more.

Context engineering (deciding what goes in the window) is the real skill. Not prompt magic. Not model size. What you include and exclude. - Thomas Ptacek, Fly.io
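One simple form of context engineering is deciding which past messages get replayed on each call. The sketch below keeps the system prompt plus the newest messages that fit a token budget. The 4-characters-per-token estimate and the function names are assumptions; a real system would use the model's own tokenizer and probably smarter selection.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: about 4 characters per token (not a real tokenizer)."""
    return max(1, len(text) // 4)


def build_context(system: str, history: list[str], budget: int) -> list[str]:
    """Keep the system prompt plus the most recent messages that fit in `budget`."""
    remaining = budget - estimate_tokens(system)
    kept = []
    for message in reversed(history):   # walk from newest to oldest
        cost = estimate_tokens(message)
        if cost > remaining:
            break
        kept.append(message)
        remaining -= cost
    return [system] + kept[::-1]        # restore chronological order


history = ["a" * 40, "b" * 40, "c" * 40]  # three messages of about 10 tokens each
context = build_context("s" * 20, history, budget=25)
print(len(context))  # system prompt plus the two newest messages
```

Since the model is stateless, whatever `build_context` drops is simply gone from the model's point of view on that call.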

Andrew Ng's Framework

Four agentic patterns

🪞

Reflection

The AI critiques its own output and iterates until it's good enough

🔧

Tool Use

Connect to APIs, databases, files, and external services

🗺️

Planning

Break a complex task into steps and execute them in sequence

👥

Multi-Agent

Multiple specialized agents coordinate and hand off work
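Reflection is the easiest of the four to sketch. Here `generate` and `critique` are scripted stand-ins for two LLM calls, and the `max_rounds` cap is an assumption added to keep the loop bounded.

```python
def reflect(task, generate, critique, max_rounds=3):
    """Draft, critique, and revise until the critic has no feedback left."""
    draft = generate(task, feedback=None)
    for _ in range(max_rounds):
        feedback = critique(task, draft)
        if feedback is None:  # the critic is satisfied
            break
        draft = generate(task, feedback=feedback)
    return draft


# Scripted stand-ins for two LLM calls (hypothetical behaviour).
def generate(task, feedback):
    return "def add(a, b): return a + b" if feedback else "def add(a, b): return a - b"


def critique(task, draft):
    return None if "a + b" in draft else "add() subtracts instead of adding"


print(reflect("write add()", generate, critique))
```

The same model can play both roles with different prompts; separating them into two functions just makes the loop easier to read.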

Model Choice Matters

Not all LLMs are agentic

🧠 The Thinkers

Examples: OpenAI o3, Claude Opus 4, Gemini 2.5 Pro
Deep thinking, complex analysis
Prefer long, thoughtful responses
Less inclined to use tools
Best for: research, math, strategy

⚡ The Doers

Examples: Claude Haiku, o4-mini, Gemini Flash, MiniMax
Biased toward taking action
Quick, cheap, iterative tool calls
Comfortable with trial and error
Best for: coding loops, automation, workflows

Thinkers sit and reason. Doers grab tools and get to work. The best agent systems use both — a Thinker to plan, Doers to execute.
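A plan-then-execute split might look like the sketch below. Both functions are stubs standing in for model calls, and the step format (`"action: subject"`) is an assumption made for illustration.

```python
def thinker_plan(goal: str) -> list[str]:
    """Stand-in for one call to a reasoning model that returns a step list."""
    return [f"research: {goal}", f"draft: {goal}", f"check: {goal}"]


def doer_execute(step: str) -> str:
    """Stand-in for a fast model that carries out a single step with tools."""
    action, _, subject = step.partition(": ")
    return f"{action} done for '{subject}'"


def run(goal: str) -> list[str]:
    plan = thinker_plan(goal)                     # one expensive planning call
    return [doer_execute(step) for step in plan]  # many cheap execution calls


print(run("release notes"))
```

The appeal of the split is cost: the slow, expensive model is called once, and the cheap one does the repetitive work.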

Proof It Works

Agents in production, right now

Klarna

AI agent handles checkout support, refunds, billing

2.3M conversations/mo handled by AI. $40M annual savings.

Wells Fargo

"Fargo" agent handles transactions, disputes, account changes

245M autonomous interactions. Zero PII leaks.

Shopify

CEO mandated AI for all teams; coding agents ship features

CEO commits: 94 (2024) → 833 (2025). AI-first hiring policy.

Allianz Australia

7-agent system processes food spoilage insurance claims

Claims: days → under 1 day. 80% faster. 10% better fraud detection.

StrongDM's rule: if you're not spending $1,000/engineer/day on AI tokens, your factory has room to improve. The leverage is real.

Real-World Case Study

The Software Factory

StrongDM, a security company, built a 3-person team where AI agents write, test, and ship production software. Their rules:

"Code must not be written by humans."

"Code must not be reviewed by humans."

1

Coding agents build the features

Working from specs, not prompts. Full autonomy over implementation.

2

Testing agents simulate real customers

Separate agents find bugs the coding agents can't see or game.

3

Humans review the final product, not the code

Each engineer spends ~$1,000/day on AI tokens. Still cheaper than hiring.

Covered by Simon Willison and Ethan Mollick as proof that agents can now compound correctness rather than compound errors.

What Changes For You

Your role is shifting

→

Before

⌨️

Prompting

Carefully craft the perfect prompt, copy-paste the output

→

Now

📋

Managing

Delegate tasks, review results, course-correct when needed

Overseeing

Set goals, define guardrails, let agents figure out the how

"AI is most useful when it just does stuff. Not when it tells you what to do." - Ethan Mollick, Wharton

So What?

Why this matters

An AI that does things changes everything. As Mollick puts it: "It just does stuff."
Workflow beats model size. A smaller model in an agentic loop can outperform a bigger model used once.
Start simple. Anthropic's #1 advice: single LLM calls first, add complexity only when needed.
Accessibility. A working agent takes ~300 lines of code and ~30 minutes.

"Get on this bike and push the pedals." - Thomas Ptacek, Fly.io

Go Deeper

Resources

That's it.

An agent is an LLM with tools and a loop. The secret everyone wants explained is embarrassingly simple.
