Exploring data with agents

So far we’ve been asking specific questions and getting specific answers. But one of the most powerful things about coding agents is that you can point them at a dataset and say “find me something interesting.”

This is surprisingly effective. Agents can scan through columns, run statistical summaries, spot outliers, look for correlations, and surface patterns that you might not think to look for - all in a few minutes.

FEC campaign finance data

For this exercise we’ll use Federal Election Commission campaign finance data, hosted in a Datasette Cloud instance at:

https://fec.datasette.cloud/

An invite code and API token will be provided at the workshop.

Setting up dclient

We’ll use dclient to let our coding agents run SQL queries against the remote database.

In Codespaces, dclient is already installed. If you’re working locally, install it with:

uv tool install --prerelease allow dclient

Then configure it to connect to the FEC database:

dclient alias add fec https://fec.datasette.cloud/
dclient auth add fec
# paste the token provided at the workshop
dclient default instance fec
dclient default database fec fec

Once configured, you can run SQL queries against the remote database like this:

dclient 'select sql from sqlite_master'
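
To get a feel for what that query returns before running it against the remote database, here is a minimal local sketch using Python's built-in `sqlite3` module. The `committees` and `contributions` tables are hypothetical stand-ins, not the real FEC schema; the only part taken from the workshop is the `select sql from sqlite_master` query itself.

```python
import sqlite3

# Hypothetical in-memory database. These tables are illustrative only
# and are NOT the real FEC schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE committees (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE contributions (committee_id TEXT, amount REAL)")

# Same query as the dclient example: one row per schema object, holding
# the SQL that created it. Indexes that SQLite creates automatically
# (here, for the TEXT primary key) have a NULL sql value, so skip those.
rows = conn.execute("select sql from sqlite_master").fetchall()
schema = [sql for (sql,) in rows if sql is not None]

for statement in schema:
    print(statement)
```

Each printed statement is a `CREATE TABLE` definition, which is why this query is a quick way for an agent (or you) to learn a database's tables and columns in one step.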

Using Showboat

Showboat is a tool that lets agents build Markdown documents that record their process. It includes commands for adding notes and images, plus a way to run a command and capture both the command and its output. This makes it useful for documenting how an agent explored a complex dataset.

Tell your agent:

Run `uvx showboat --help` to learn how Showboat works.

Open-ended exploration

Now tell your agent to explore:

Run `dclient --help` to learn how dclient works.

Then use `dclient` and `showboat` to explore the fec database, making notes as you progress and capturing your dclient commands using `showboat exec` as you go.

Add commentary with `showboat note`.

Let it run. Watch what it does - it should start by figuring out the schema, running summary queries, and then digging deeper into anything that looks unusual.

Push for more

Try pointing the agent at something specific and see what it can uncover:

Dig into C00639591 and see what you can figure out about it

(That’s Alexandria Ocasio-Cortez for Congress, but don’t tell the agent that. See if it figures it out.)

You can also steer it toward specific angles:

Which candidates have the most unusual fundraising patterns?

What are the biggest individual donations? Who are the most prolific donors?

Evaluating the results

Not everything an agent finds will be genuinely interesting - and some findings might be wrong. As the agent surfaces patterns, ask yourself:

Is the query correct? Run it yourself in Datasette to verify.
Is the interpretation right? The agent might misunderstand what a column means.
Is it actually surprising? Some “findings” are just obvious facts restated.
Could it be a data artifact? A spike in donations might just reflect a reporting deadline.
Would this lead to a story? The best findings raise follow-up questions.

This is the same critical eye you’d apply to any data analysis.
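
As one illustration of the "Is the query correct?" check, the sketch below shows a common way an aggregate can go wrong: a join against a table with several rows per committee silently repeats each contribution and inflates the total. Everything here is an assumption made for the example: the `contributions` and `filings` tables, their columns, the committee ID `C1`, and the amounts are hypothetical and do not come from the FEC database.

```python
import sqlite3

# Hypothetical toy data, NOT the real FEC schema or real figures.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contributions (committee_id TEXT, amount REAL);
CREATE TABLE filings (committee_id TEXT, report_type TEXT);
INSERT INTO contributions VALUES ('C1', 100.0), ('C1', 250.0);
INSERT INTO filings VALUES ('C1', 'Q1'), ('C1', 'Q2'), ('C1', 'Q3');
""")

# Suspect query: each contribution row is matched with every filing row
# for the same committee, so every amount is counted three times.
naive_total = conn.execute("""
    SELECT SUM(c.amount)
    FROM contributions c
    JOIN filings f ON f.committee_id = c.committee_id
""").fetchone()[0]

# Independent check: sum the contributions table on its own.
direct_total = conn.execute(
    "SELECT SUM(amount) FROM contributions"
).fetchone()[0]

print("joined total:", naive_total)
print("direct total:", direct_total)
if naive_total != direct_total:
    print("Totals disagree: the join is probably duplicating rows.")
```

Recomputing a headline number a second, simpler way is a cheap habit: if the two totals disagree, look at the query before trusting the finding.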