Summary
- The category is a rebrand - tools sold as AI orchestration ship DAGs, retries, durable state, human-in-the-loop gates, and audit logging. Workflow engines shipped that same list by the mid-2010s, and the overlap is not a coincidence.
- What is actually new? The participant. A model now does some steps a person or a script used to do. The orchestration around those steps - sequencing, failure handling, approvals, state - was a solved discipline with mature tools.
- The old guard agrees from both directions - Camunda, a BPMN-first workflow vendor, now sells agentic orchestration, while a commenter on a March 2026 Hacker News thread put it the other way: AI agent orchestration is where the workflow engine shines.
- Buy by problem, not by label - Temporal for durable code, Airflow for data pipelines, Camunda for BPMN estates, Tallyfy for ops-owned business workflows. See where AI fits in a real process: run AI steps inside defined workflows.
Pull up the website of any tool selling “AI orchestration” and read the feature list slowly. Graphs of steps. Retries on failure. State that survives a crash. A human approval before the risky action, and logs of every run.
Now pull up the documentation any workflow engine published a decade ago and find the difference. There isn’t one that matters. The AI orchestration category is workflow orchestration with the nouns swapped, and noticing that will save you from buying the same discipline twice.
None of this means the tools are bad - some are excellent. It means the orchestration problems were solved before the orchestrated thing changed, and the fastest way to make sense of the vendor noise is to see it as one more chapter in the longer arc of AI inside business operations, not a new field. The participant is new. The choreography isn’t.
Read the feature lists side by side
In April 2025, a developer going by Beubax posted a Show HN for Grapheteria, a framework for agent orchestration, and opened with disarming self-awareness: “I know what you’re thinking, ‘Oh no, not ANOTHER agentic workflow library.’” The pitch that followed described two frustrating camps - “Code-only frameworks: Powerful but often buried under layers of abstractions” versus “UI-only builders: Great for simple flows but hit a wall when you need real customization” - and a design philosophy of “clean, composable graphs where each node and edge has a clear purpose.” Out of the box: “human-in-the-loop, step-by-step debugging, and solid logging,” plus the ability to step backward through a run and replay it. Even the thread’s lone question, from a commenter called badmonster, asked how the visual editor and the underlying code stay in sync - a tension workflow tooling has been negotiating for a decade.
Cover the word “agentic” and read that list again.
How many of those features would have surprised a workflow engineer in 2015?
Zero is the honest count.
Nodes and edges with clear purposes, the code-versus-visual-builder tension, human checkpoints, replayable runs, logging you can actually use - that’s the standard feature set of workflow tooling, and it has been for a long time. None of this picks on one project. Grapheteria is one of dozens, and its author was upfront enough to name the fatigue in his own headline. What matters is the pattern: every team that sets out to orchestrate AI agents rediscovers, one production incident at a time, the same requirements the workflow-engine community spent two decades turning into boring infrastructure.
The vocabulary mapping is nearly one-to-one. Agent “memory” is durable state. “Handoffs” between agents are task routing. “Guardrails” are validation gates, and “traces” are audit logs.
“Human-in-the-loop” is an approval step, the feature every workflow approval system was built around. New words, old plumbing - and the new words carry a markup.
Why the convergence was inevitable
Orchestrating AI agents means running long, multi-step work where any step can fail, where state has to survive restarts, where a human must approve the dangerous parts, and where someone later asks what happened on run 4,372. Swap “AI agents” for “microservices,” “ETL jobs,” or “purchase approvals” and the sentence stays true. That problem shape is exactly what workflow engines exist for, which is why the convergence runs in both directions: agent startups keep growing workflow features, and workflow vendors keep absorbing agents as a participant type. Two crowds digging from opposite ends of the same tunnel, meeting in the middle, each a little annoyed the other got there. The requirements were never AI-specific. They’re the requirements of work that matters, running over time, across systems and people who need to trust the result.
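To make "state has to survive restarts" concrete, here is a minimal sketch of the idea, not any vendor's implementation. The function name `run_durably`, the JSON record layout, and the file-based checkpoint are all assumptions made for illustration; real engines persist to a database and handle concurrency, which this does not.

```python
import json
import os
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
StepFn = Callable[[State], State]


def run_durably(steps: List[Tuple[str, StepFn]], path: str) -> dict:
    """Run steps in order, checkpointing after each one (hypothetical helper).

    If a checkpoint file already exists at `path`, resume after the last
    completed step instead of starting over.
    """
    if os.path.exists(path):
        with open(path) as f:
            record = json.load(f)
    else:
        record = {"done": 0, "state": {}, "history": []}

    for index in range(record["done"], len(steps)):
        name, fn = steps[index]
        # If fn raises, the record on disk still describes the last good step.
        record["state"] = fn(record["state"])
        record["done"] = index + 1
        record["history"].append(name)  # answers "what happened on this run?"

        # Write to a temp file and rename, so a crash mid-write
        # cannot leave a half-written checkpoint behind.
        tmp_path = path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(record, f)
        os.replace(tmp_path, path)

    return record
```

Nothing in the sketch knows whether a step is a model call, an ETL job, or a purchase approval. Calling `run_durably` again with the same path after a crash picks up at the first unfinished step, which is the whole point.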
What we didn’t see coming, a few years into the AI wave: the vendors sounding newest keep describing the oldest part of our own product. Durable state, approval gates, run history - we’ve shipped those since before “transformer” was a household word, and so has every serious workflow engine.
The view from the workflow-engine side is worth quoting. On a March 2026 Hacker News thread about Cook, a CLI for orchestrating Claude Code - submitted by staticvar - a commenter called yohamta, posting about the Dagu workflow engine, compressed the whole thesis into three sentences: “AI agent orchestration is future. That’s where workflow engine shines. I’m doing the same thing using Dagu.sh and I don’t use terminal so much anymore.”
The engine didn’t change to deserve that future. The workload arrived.
Look at retries, the least glamorous primitive on the list, to see how little actually changed. A workflow engine retries a failed step with backoff, caps the attempts, and routes to a fallback or a person once the cap hits. Now the failing step is a model call instead of an API call - a timeout, a rate limit, a malformed response. Which part of that retry policy needs reinventing? None of it.
The engine doesn’t care whether the step that failed was deterministic or probabilistic. It cares that attempt two is allowed and attempt five isn’t, and that somebody gets the case when attempts run out. The primitive transfers untouched, and so does nearly every other one on the list.
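The retry policy described above can be sketched in a few lines. This is an illustrative sketch, not the API of Temporal, Airflow, or any other engine: `RetryPolicy`, `run_with_retry`, and the `on_exhausted` callback are hypothetical names, and the default values are assumptions chosen to match the "attempt five isn't allowed" example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


@dataclass
class RetryPolicy:
    max_attempts: int = 4        # attempt five is never allowed
    base_delay_s: float = 1.0    # wait before the second attempt
    backoff_factor: float = 2.0  # multiply the wait after each failure


def run_with_retry(
    step: Callable[[], T],
    policy: RetryPolicy,
    on_exhausted: Callable[[Optional[Exception]], T],
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Retry `step` with exponential backoff, then hand the case off.

    `step` can be an API call or a model call; the policy cannot tell
    the difference and does not need to.
    """
    delay = policy.base_delay_s
    last_error: Optional[Exception] = None
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return step()
        except Exception as exc:  # timeout, rate limit, malformed response...
            last_error = exc
            if attempt < policy.max_attempts:
                sleep(delay)
                delay *= policy.backoff_factor
    # Attempts ran out: route to a fallback or a person.
    return on_exhausted(last_error)
```

The `sleep` parameter is injectable only so the policy can be exercised without real waiting. Swapping the failing step from an HTTP request to a model call changes nothing in this function.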
Meanwhile the most enterprise-coded workflow vendor of them all made the same bet from the other side. Camunda - the BPMN-first process automation company - now sells agentic orchestration as a headline capability, defining it as connecting “agents, humans, and systems into continuous end-to-end flows, with the governance and resilience that mission-critical work demands.” Their model “sequences every participant - deterministic steps, AI agents, human tasks - into a continuous end-to-end flow,” with agents “built natively into the process model - with governance, audit trail, and resilience included,” and the model layer left open: “Orchestrate GPT-4, Gemini, Claude, or your own.” A company that spent years selling process orchestration looked at AI agents and concluded they were a new kind of process participant. Basically nobody who already owned an orchestration layer concluded they needed a second one.
Which engine fits which problem?
If the discipline is one discipline, the buying question gets simpler and more honest: not “which AI orchestration platform” but “which workflow engine matches my problem, and can AI participate in it?” The slots are real and different, and pretending one tool covers all of them is how implementations die.
Temporal calls itself a Durable Execution Platform: “Durable Execution ensures that your application behaves correctly despite adverse conditions by guaranteeing that it will run to completion.” A Temporal Workflow is “your business logic, defined in code, outlining each step in your process,” and when an activity fails, the platform retries it per your configuration - their docs describe it as the ultimate autosave. That’s the slot for engineering-owned, failure-prone, long-running backend work.
Apache Airflow describes itself as “a platform created by the community to programmatically author, schedule and monitor workflows” - the default home of data pipelines and ML batch jobs, owned by data engineers.
Camunda owns the BPMN slot: if your organization models processes in BPMN and DMN - and their pitch leans on those being “vendor-portable and human-readable” - that’s a real moat for complex, regulated process estates, and it requires people fluent in BPMN to drive it.
And Tallyfy? We take the slot the other three don’t want: business workflows owned by the operations team itself. Employee onboarding, client intake, purchase approvals, contract reviews - the repeatable processes where the person who owns the outcome can’t write code and shouldn’t have to. No DAG definitions in Python, no BPMN modeling tools, no cluster to run. If you’re weighing the heavier end of that spectrum, we keep a direct comparison with Camunda that’s blunt about both directions - heavy BPMN modeling is a thing Tallyfy deliberately doesn’t do.
One discipline, four slots

| Engine | Built for | Typical work | Owned day to day by |
|---|---|---|---|
| Temporal | Durable execution in code | Long-running, failure-prone backend jobs | Software engineers |
| Apache Airflow | Authoring and scheduling pipelines | Data and ML batch workflows | Data engineers |
| Camunda | BPMN-modeled process estates | Complex, regulated enterprise processes | Process specialists and developers |
| Tallyfy | Ops-owned business workflows | Onboarding, approvals, intake - with AI steps | Operations teams, no code |
Every row already runs AI as a participant, or is racing to. None of them needed to become a different category of software to do it. That’s the tell worth keeping: when the incumbents absorb the new workload without changing shape, the workload was never a new category - it was a new step type.
Treat AI as a step type, not a second stack
Here’s what the rebrand costs you if you take it at face value: a second orchestration layer running next to the one you have. Two places where state lives. Two retry policies that disagree. And a pair of audit trails to reconcile when a regulator or a customer asks what happened, while two on-call rotations each assume the other one saw the alert. Teams that bought a separate “AI orchestration” stack on top of a perfectly good workflow engine end up writing glue between two orchestrators - a kludge that exists only because a label convinced someone the old discipline didn’t apply to the new participant.
The cheaper architecture is one orchestration layer with AI as a participant inside it.
In Tallyfy that participant model is literal. A step in a process gets done by a person, by a rule, or by an AI, and the process treats all three the same - same deadlines, same record of what happened, same real-time status anyone can check without asking around. The AI steps doing real work today are the bounded ones: read a document and extract the fields, classify and route an incoming request, draft an update for a person to approve. Agents reach those steps through our MCP server and its 100+ tools, but the orchestration - the order, the state, the approvals, the audit trail - stays with the process. We’ve watched the alternative fail in a specific way: when the model owns its own control flow, you eventually learn why loops belong to the runtime, not the model, usually in the middle of the night.
That division holds up because of what each side is good at.
Models judge; engines count.
A model can read a contract better than your intake script ever did, and it still can’t be trusted to remember what it was doing twelve steps ago - the engine carries the goal so the model doesn’t have to. Sequencing, retrying, knowing the difference between attempt three and attempt four: that’s bookkeeping, the engine’s whole job, and the reason orchestration concepts from 2010 didn’t expire when the participants got smarter.
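A rough sketch of that division of labor, under stated assumptions: the names `Step`, `Run`, and `run_process` are invented for illustration and are not Tallyfy's API, and the three handlers are stubs standing in for a model call, a rule, and a human task. The point is only the shape: the engine owns the order, the state, and the record, and treats every step kind the same way.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class Step:
    name: str
    kind: str  # "person", "rule", or "ai" - the engine does not branch on it
    handler: Callable[[State], State]


@dataclass
class Run:
    state: State = field(default_factory=dict)
    audit: List[dict] = field(default_factory=list)


def run_process(steps: List[Step], initial: State) -> Run:
    """The engine owns sequencing, state, and the audit trail."""
    run = Run(state=dict(initial))
    for seq, step in enumerate(steps, start=1):
        updates = step.handler(dict(run.state))  # handler gets a copy
        run.state.update(updates)
        run.audit.append(
            {"seq": seq, "step": step.name, "kind": step.kind, "wrote": sorted(updates)}
        )
    return run


# Stub handlers (all hypothetical). A real "ai" handler would call a model.
def extract_fields(state: State) -> State:
    return {"vendor": "Acme", "amount": 1200}


def route_request(state: State) -> State:
    return {"queue": "finance" if state["amount"] > 1000 else "auto"}


def approve(state: State) -> State:
    return {"approved": True}


invoice_process = [
    Step("Extract fields", "ai", extract_fields),
    Step("Route request", "rule", route_request),
    Step("Approve", "person", approve),
]
```

Running `run_process(invoice_process, {"doc": "invoice.pdf"})` produces one state and one audit trail covering all three participants. The model judges inside its step; the loop and the counting stay in `run_process`.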
Mind you, the agent-SDK layer underneath is its own decision with its own churn problem - we walked through LangGraph, CrewAI, and AutoGen separately - but whichever SDK engineering picks, the business process above it shouldn’t move. The whole stack works precisely when each layer can change without renegotiating the others - the fundamentals of workflow automation doing quiet work under a loud market.
One discipline, two decades of names
The biggest lesson a decade of building Tallyfy keeps re-teaching us: the boring layer is the durable one. Categories above it rebrand every few years - BPM became process mining became hyperautomation became, now, AI orchestration - and underneath, the actual work of sequencing steps, holding state, gating risk, and recording everything has barely changed shape. Remember when RPA was going to be its own discipline too? Same play, late 2010s edition: software robots doing the clicking a person used to do, sold as a new category, orchestrated - of course - by a workflow. The participant swaps; the choreography stays. People who ran workflow engines through any one of those cycles already know how to run agents, because the discipline transfers whole - the engines that ran the last cycle are quietly running this one.
So when the next pitch deck says AI orchestration platform, ask the clunky question out loud: what does this do that a workflow engine doesn’t? Sometimes there’s a real answer - smoother model integration, nicer agent debugging - and that answer describes a feature, which you should evaluate as a feature. Buying a feature is fine. Standing up a second orchestration stack to get a feature is how you reinvent the wheel and pay for two of them.
Pick the engine whose slot matches your problem. Let your engineers own the durable-code slot, your data team own the pipeline slot, and your process specialists own the BPMN slot if you have one. And if the workflows in question are the ones your operations team runs every day - onboarding, intake, approvals - define them once, in a system that team can actually read, and add AI one bounded step at a time. The orchestration was never the new part. Doing it well was always the rare part.