Coding agents¶

What’s an agent?¶

The definition of the term “agent” in AI has been hotly debated since at least the 1990s. For this workshop, we’ll use my preferred simple definition:

An LLM agent runs tools in a loop to achieve a goal.

The agent harness is the software that sends a prompt to a Large Language Model, checks if the model wants to use a tool, executes that tool, feeds the result back to the model, and repeats. That’s the loop.

A tool can be almost anything - a calculator, a web browser, a database query, an entire programming language.

The goal is whatever you tell the agent to achieve through prompting.

What’s a coding agent?¶

LLMs have been surprisingly good at writing code since GPT-3 in 2022. This makes sense when you consider that the grammatical rules of Python or JavaScript are trivial compared to human languages like Chinese and English.

They’re also notoriously prone to making mistakes. A mistake in code is bad, it means the code won’t run or, if it does, won’t do the thing you want it to do.

A coding agent solves this by being able to execute the code it writes. It can write some code, run it, see an error, fix the error, run it again, and inspect the output - all in a loop, iterating until it works.

This characteristic is transformational. It upgrades LLMs from weird fuzzy text prediction machines to systems that can almost-reliably write code and solve problems.

Anthropic and OpenAI spent most of 2025 focusing on improving the coding ability of their models. In November 2025 the release of Claude Opus 4.5 and GPT-5.1 represented a significant leap forward in coding ability, especially when the models are run within Claude Code or OpenAI’s Codex agents.

ChatGPT and Claude as coding agents¶

You don’t actually need Claude Code or Codex to explore the power of coding agents. The ChatGPT and Claude consumer apps both have built-in code interpreter features that can write and execute code during a conversation, in Python and other languages.

You can upload data files such as CSVs or Excel spreadsheets and ask questions about them, and the chatbot will write and then execute code to analyze the data and show you the results.

ChatGPT can also download data files directly from web search into its sandboxed container, so you can ask it to find data for you.

Let’s try that now. This should even work with the free tier of both services.

The tools we’ll use¶

In this workshop we’ll primarily use two terminal-based coding agents:

Claude Code from Anthropic - run it by typing claude in your terminal
Codex from OpenAI - run it by typing codex in your terminal

I’ll be using these via GitHub Codespaces. You can run them on your own machine if you prefer.