Zo Blog

Substrate, Agents, Workflow

Principles for building with AI

by Ben

Everything about creation is changing… but this isn’t some new thing caused by AI. Technology has been lowering barriers to creation and distribution for centuries. AI coding is just like the typewriter, or the high-level programming language, or the bedroom recording studio… New tools enable new creators to reach new heights.

But creating something great requires more than just tools:

  • Taste: You have good instincts, prioritize correctly, etc.
  • Skill: Your iterations reliably improve, and rarely regress.
  • Experience: You have a lot of ideas and navigate them well.
  • Energy: You iterate quickly and don't get tired or overwhelmed.

I've been honing these skills for two decades, creating music and building software. And I know from experience that the tool – the instrument, the software, the hardware – always matters far less than the medium and the message.

It boils down to training, and then harnessing, all the taste, skill, experience, and energy of your ancestors, to create the best messages you possibly can, using whatever tools you have at your disposal.

Interestingly, this formula is also what agentic AI today boils down to (next token prediction with tool calling)…

Coding with AI in 2025

Everything is changing so quickly that I fully expect to be trying different tools every month for the foreseeable future, until the dust finally settles on AI coding, which could take years. And it will take far longer than that for the dust to settle on programming... These are the tools I'm using today:

Beyond the specific tools, a big part of using AI effectively is understanding how the underlying system works. How coding agents work, in a nutshell:

The AI coding system

An LLM does most of the driving in my day-to-day coding. Early on, I spent a lot of time tuning my prompts to adjust the style and behavior of my daily driver. Prompt engineering is more art than science – but anyone who's spent real time with it knows that good prompts elicit noticeably different behavior.

I use two high-level prompts – one for overall behavior (my "system prompt"), and an additional prompt for writing plans.

Prompting is just a strange bag of tricks. Here's my favorite (a friend found this on Reddit, and it's strangely effective):

根回し - Nemawashi Phase - Deep Understanding

Purpose:
  • Create space (間, ma) for understanding to emerge
  • Lay careful groundwork for all that follows
  • Achieve complete understanding (grokking) of the true need
  • Unpack complexity (desenrascar) without rushing to solutions

Expected Behaviors:
  • Show determination (sisu) in questioning assumptions
  • Practice careful attention to context (taarof)
  • Hold space for ambiguity until clarity emerges
  • Work to achieve intuitive grasp (aperçu) of core issues

Core Questions:
  • What do we mean by key terms?
  • What explicit and implicit needs exist?
  • Who are the stakeholders?
  • What defines success?
  • What constraints exist?
  • What contextual factors matter?

Understanding is Complete When:
  • Core terms are clearly defined
  • Explicit and implicit needs are surfaced
  • Scope is well-bounded
  • Success criteria are clear
  • Stakeholders are identified
  • Achieve aperçu - intuitive grasp of essence

Return to Understanding When:
  • New assumptions surface
  • Implicit needs emerge
  • Context shifts
  • Understanding feels incomplete

Substrate, Agents, Workflow

In the future, when the dust settles on working with AI, we'll find that it boils down to:

  • The substrate, with a healthy immune system
  • The agents, drivers for planning and doing
  • The workflow, always deeply personal

This essay is about coding with AI in 2025. But I believe this framework is eternal, and applies to any creative pursuit in collaboration with AI.

With Zo Computer, we're building A General Interface for working with AI.

The AI coding iteration loop

The substrate, with a healthy immune system

Today, people generally report more success coding with AI on greenfield projects. This isn't just because greenfield projects are smaller. It's because projects that grow with AI from the outset naturally build a stronger immune system for receiving AI code.

The immune system for AI contributions

Part of this AI immune system is old: comments, tests, compilers, linters, code review, observability, and continuous automation (with a blend of humans and machines in the loop).

Just as mature codebases build stronger automated defenses as they deploy more frequently and add more contributors, healthy AI-first codebases must build strong automated defenses because the volume of output per individual contributor is unusually high.

The rest of the AI immune system is brand new. To build a healthy AI-first codebase in 2025, you’ll need:

  • Rules that teach the AI to compile directions to code properly:
    • Cursor rules, CLAUDE.md, and the like are our new style guides and internal docs. Just like humans, AI agents only loosely follow these things. But they’re still important.
  • An organization that teaches humans to direct AI properly:
    • Excelling with AI requires a mindset shift. It's not about being an individual contributor. It's not even about being a manager, or an architect, or a designer. It’s about being an organization. The great organizations of history have always been incredibly well-oiled machines: cyborg hive-minds, the perfect union of automation and precisely directed human activity.
    • The great organizations of the future will realize there’s a cyborg-hive-mind within each of us, waiting to be born.
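Concretely, a rules file is just a short markdown document the agent reads before it acts. A minimal sketch of what one might contain – the conventions and commands below are illustrative placeholders, not from any real project:

```markdown
# CLAUDE.md (illustrative sketch)

## Build & test
- Run the test suite (e.g. `make test` – placeholder) before declaring a task done.

## Style
- Prefer small, well-typed functions; match the naming in surrounding code.
- No new dependencies without asking first.

## Process
- Write a plan before any multi-file change; implement in small, reviewable phases.
```

Like a human style guide, it works best when it's short enough that the agent actually reads all of it.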

Coding with AI is super early – which means you should actively seek out improvements to your setup, and keep a pulse on your velocity and code quality. For reference, on a new but maturing codebase, I averaged 7.5k lines of agent edits a day in Cursor last month. That's with me shifting a lot of usage to Claude Code – my numbers were twice as high the month before.

My Cursor usage

The agents, drivers for planning and doing

Style becomes increasingly important as agents become more autonomous. It’s already happening, but soon aesthetic personality will become the primary factor when choosing an agent.

Today, agent choice is about optimizing for two distinct modes of work:

  • Planning (what, why, how)

    • For this agent, I’m evaluating intelligence and depth. I’m looking at benchmarks like EnigmaEval and MultiChallenge. Here, the LLM matters more than the agent harness (I like Cursor, but any old harness will do). o3 is clearly the best model in my book: terse in writing style, exhaustive in context-seeking, and methodical in reasoning.
Planning with AI
  • Doing (compiling the plan into code)

    • For this, I’m evaluating speed and style. I like to route to different agents depending on the task:
      • Claude Code is my primary driver because it feels the most autonomous – capable of following long plans all the way to the end. But the coding style of Sonnet can be messy and overeager, and the lack of an editor can feel limiting. To deal with the lack of an editor, I review code being generated in GitHub Desktop, sometimes take a closer look or add commentary in Zed, and then do a final review in Graphite (I like their pull request UX, but I don't believe in AI code review*).
      • I often switch back to Cursor and o3 for more complex implementation tasks – where I want it to feel more like pairing, and less like delegation.

*because false-positive AI fatigue is the new alert fatigue.

Here's a snippet from my planning prompt:

When asked to write a plan, examine the existing plan file for any existing material (it may contain an outline or notes from the user, or a previous revision of the plan). Use the existing material as the foundation for your proposal.

  • Describe the specific code changes required concisely, with minimal surrounding prose.
  • Design the code changes so that they can be implemented incrementally. Do not break up the changes too much: there should generally be two or three phases of work that logically go together, and can be stacked on each other.
  • Any new or changed interfaces in your planned code should be well-typed, self-documenting, and self-consistent with surrounding code. If there are existing naming conventions, follow them. Method, parameter, and field names should be terse-but-descriptive, idiomatic to the language, and consistent with naming conventions in surrounding code.
  • If abstractions need to be adjusted for a clean, self-consistent end result, include the refactoring required as an independent phase before the implementation that uses these abstractions.
  • In each phase, look out for complex logic that has been added, and describe how it can be unit tested (describe the specific unit tests), searching for existing tests that can be updated, or describing new test files as needed. The unit tests should be grouped with the relevant phases to help us incrementally check our work.

A typical plan

The workflow, always deeply personal

You are not your tools. But you are the way you use your tools.

The great creators of history have always accomplished more than their peers using the same tools. Some people were just born to shred.

But many people learn to shred. You can learn to shred with AI, and you’ll eventually develop your own distinct style.

As you learn to build with AI, you’ll find the bottleneck is still your mortal human limits:

  • Limited time
  • Limited mental capacity

But with AI, you can expand your time. Here’s an easy way: work on two tasks in parallel. You can just use a single git branch, or try git worktree if you’re feeling fancy. It’s like braiding: just switch between two parallel chat threads until both tasks are done. If you select related tasks, you’ll be able to review both of them in a batch, and save additional context-switching overhead. Once you feel comfortable with two tasks, try braiding three…
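The worktree variant of this braiding setup can be sketched with plain git commands. The repo and branch names below are illustrative; this creates a throwaway repository just to show the mechanics:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q -b main repo && cd repo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"
# One checkout per parallel task, all sharing a single object store:
git worktree add -q ../task-a -b task-a
git worktree add -q ../task-b -b task-b
git worktree list   # shows the main checkout plus both task worktrees
```

Each worktree is an independent working directory on its own branch, so two agent threads can edit simultaneously without stepping on each other's uncommitted changes; `git worktree remove` cleans up once a task lands.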

The multi-agent AI coding workflow

Some say coding with AI leads to skill atrophy. This may be true: I’m noticeably worse now at typing code by hand. But I’ve also experienced something new and surprising: working with AI has expanded my mental capacity in strange and exciting ways.

Keep going. Collaborative creation with AI is going to be really great – I can already feel it.