Agent Orchestration Platform

Draft. Site under construction. Case study describes a real shipped project, written generically.

What the business needed

A B2B platform in a regulated industry needed software to automate the highest-volume parts of its core operational workflow — work that was rule-based at the surface but full of judgment underneath. Every input was different. Regulatory requirements applied unevenly. The cost of getting it wrong was real. A team of 20+ operators was spending most of the day on the routine parts of the job, leaving the parts that genuinely required their judgment underserved.

The brief was an orchestration platform: software where the operators themselves define and run AI agents that handle the routine work, surface the cases that need a human, and integrate with the live business systems already in place. Not a chatbot bolted onto a sidebar. A workflow engine the operators own.

What I built

A multi-agent orchestration system. Operators define agents in the UI — tools, prompts, datasets, model selection. The agents run as long-lived workflows over a job queue, calling tools, reading and writing live business data, and pausing for human review when the work crosses a defined threshold. Multiple model providers sit behind a single interface; runs can be replayed, branched, audited.
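To make the "pause for human review" behaviour concrete, here is a minimal sketch of an agent definition and a review gate. It is an illustration under stated assumptions, not the platform's actual schema: every name here (AgentDefinition, reviewThreshold, StepResult, advance, approve) is hypothetical, and the numeric risk score stands in for whatever threshold the real system applies.

```typescript
// Hypothetical shapes: the real platform's schema is not published.
type ModelChoice = { provider: string; model: string };

interface AgentDefinition {
  name: string;
  prompt: string;
  tools: string[];         // names of tools the agent may call
  datasets: string[];      // names of datasets the agent may read
  model: ModelChoice;
  reviewThreshold: number; // assumed: steps scoring at or above this pause for a human
}

interface StepResult {
  output: string;
  risk: number; // assumed 0..1 score attached to each step
}

// A run is always in exactly one of these states.
type RunState =
  | { status: "running"; completed: StepResult[] }
  | { status: "awaiting_review"; completed: StepResult[]; pending: StepResult }
  | { status: "done"; completed: StepResult[] };

// Feed the next step result into the run; pass null when the agent has finished.
function advance(agent: AgentDefinition, state: RunState, next: StepResult | null): RunState {
  if (state.status !== "running") return state; // paused or finished runs do not move
  if (next === null) return { status: "done", completed: state.completed };
  if (next.risk >= agent.reviewThreshold) {
    return { status: "awaiting_review", completed: state.completed, pending: next };
  }
  return { status: "running", completed: [...state.completed, next] };
}

// A human approves the pending step, and the run resumes.
function approve(state: RunState): RunState {
  if (state.status !== "awaiting_review") return state;
  return { status: "running", completed: [...state.completed, state.pending] };
}

// Usage: a low-risk step passes through, a high-risk step parks the run.
const intakeAgent: AgentDefinition = {
  name: "intake",
  prompt: "Summarise the incoming request.",
  tools: ["lookupRecord"],
  datasets: ["policies"],
  model: { provider: "example-provider", model: "example-model" },
  reviewThreshold: 0.7,
};
const started: RunState = { status: "running", completed: [] };
const afterRoutine = advance(intakeAgent, started, { output: "summary", risk: 0.2 });
const paused = advance(intakeAgent, afterRoutine, { output: "write to record", risk: 0.9 });
```

Keeping the run state as a plain value makes replay and audit simpler to reason about: each transition is a function of the previous state and one new step.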

The platform is not a demo. It is in daily production use. The 20+ operators who depend on it actively shape it: they create new agents, tune existing ones, and own their workflows. The platform's job is to be reliable enough that an operator with no engineering background can build something that holds up under real volume.

Solo build, end-to-end: backend, queue, agent runtime, model integration, frontend, observability. Operating it in production is part of the engagement.

Stack / Approach

TypeScript end to end. NestJS for the API surface. BullMQ on Redis for durable agent execution. Cloud Run for compute. Multiple AI model providers behind a single agent runtime so the right model can be picked per task without rewiring the platform. Observability built in from day one — every agent run is inspectable; nothing happens silently.
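One way to picture "multiple providers behind a single agent runtime" is a small routing layer in front of interchangeable adapters. The sketch below is an assumption about the shape of such a layer, not the platform's code: ModelProvider, TaskKind, AgentRuntime, and the stub adapters are all hypothetical, and the stubs stand in for real provider SDK calls.

```typescript
// Hypothetical interface: every provider adapter exposes the same call.
interface ModelProvider {
  readonly id: string;
  complete(prompt: string): Promise<string>;
}

// Assumed task categories, for illustration only.
type TaskKind = "extract" | "classify" | "draft";

class AgentRuntime {
  private readonly providers = new Map<string, ModelProvider>();
  private readonly routes = new Map<TaskKind, string>();
  private readonly defaultProviderId: string;

  constructor(defaultProviderId: string) {
    this.defaultProviderId = defaultProviderId;
  }

  register(provider: ModelProvider): void {
    this.providers.set(provider.id, provider);
  }

  // Point a task kind at a provider; changing this does not touch agent code.
  route(task: TaskKind, providerId: string): void {
    this.routes.set(task, providerId);
  }

  providerFor(task: TaskKind): ModelProvider {
    const id = this.routes.get(task) ?? this.defaultProviderId;
    const provider = this.providers.get(id);
    if (!provider) throw new Error(`No provider registered for id "${id}"`);
    return provider;
  }

  run(task: TaskKind, prompt: string): Promise<string> {
    return this.providerFor(task).complete(prompt);
  }
}

// Stub adapter standing in for a real provider SDK call.
const stubProvider = (id: string): ModelProvider => ({
  id,
  complete: async (prompt) => `[${id}] ${prompt}`,
});

const runtime = new AgentRuntime("fast");
runtime.register(stubProvider("fast"));
runtime.register(stubProvider("careful"));
runtime.route("draft", "careful"); // drafting goes to the "careful" stub, everything else to "fast"
```

Because agents only ever call the runtime, swapping the model behind a task is a one-line routing change rather than a rewrite.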

Trade-off worth naming: the platform is built more like infrastructure than like a product. It exposes affordances rather than hard-coded workflows, because the work it has to support is not knowable up front. The cost is a steeper first day for new operators. The benefit is that the platform survives every change to how the business actually does its job.

Outcome / What it taught me

Operating in production with real users reshapes how you think about AI software. The hard problems are not the model calls. They are durable execution, idempotent retries, sane defaults, readable logs, and a UI that lets non-engineers see what their agents are actually doing. Models change every quarter. The orchestration around them is what stays.
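As an illustration of what "idempotent retries" means in practice, the sketch below retries a step that crashes after its side effect has already happened, and uses an idempotency key so the retry does not repeat the write. It is a simplified, synchronous, in-memory example under stated assumptions: IdempotencyLedger, withRetries, and the key format are hypothetical, and a real system would persist the ledger and run asynchronously.

```typescript
// Hypothetical in-memory ledger; a production system would persist this durably.
class IdempotencyLedger {
  private readonly results = new Map<string, unknown>();

  // Run the effect at most once per key; later calls return the stored result.
  runOnce<T>(key: string, effect: () => T): T {
    if (this.results.has(key)) return this.results.get(key) as T;
    const result = effect();
    this.results.set(key, result);
    return result;
  }
}

// Re-run a step until it succeeds or the attempts are exhausted.
function withRetries<T>(attempts: number, step: () => T): T {
  if (attempts < 1) throw new RangeError("attempts must be at least 1");
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return step();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}

// Demo: the first attempt performs the write, then crashes before returning.
const ledger = new IdempotencyLedger();
const written: string[] = [];
let calls = 0;

const outcome = withRetries(3, () => {
  calls += 1;
  // Assumed key format: one key per run and step.
  const receipt = ledger.runOnce("run-42:step-1", () => {
    written.push("record updated");
    return written.length;
  });
  if (calls === 1) throw new Error("simulated crash after the write");
  return receipt;
});
```

The step body runs twice here, but the write happens once: the retry finds the key in the ledger and reuses the stored receipt instead of touching the record again.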

This is also the case study most relevant to my smaller-scale work. The patterns that make a 20-operator platform reliable are the same ones I bring to a one-person workflow problem. Most small businesses do not need a platform of that size. They need the same operational discipline, applied to their specific workflow.