Vibe-coding the app is easy. Running it is the hard part.

Summary

Generating the tool is the cheap part now - describe what you want, get working code in minutes. The unglamorous part nobody puts in a demo is keeping that tool useful after the first week.
A tool with no process around it has no answers - nothing tells it when to run, who owns the output, or what “done” means, so it drifts the first time the input looks weird (one shape I have watched: a confident, wrong count pulled from a slice of the data nobody noticed was a slice).
The fix is a substrate, not a better prompt - a Hacker News commenter’s line that what AI creates “is not enterprise software” lands right here. The workflow holds the sequence, the state, the owner, and the record. The generated tool is one step inside it.
Vibe coding still wins for throwaway work - one-off scripts, personal hacks, the thing you run once and delete. Scaffolding is for tools a business leans on.

You can build the app in an afternoon now. You cannot run it in a business in an afternoon, and that gap is where most of the trouble lives.

Here’s the thing the demos skip. The hard part of a business tool was never the code. It was everything around the code: when it should fire, who gets the result, what happens to that result next, and how you know it actually worked. Vibe coding collapsed the cost of the code to near zero and left the rest exactly as expensive as it always was. So you end up with a tool that runs beautifully on the example from the demo and falls over the moment a real person feeds it something the demo never imagined.

This is a different argument from the one I made about vibe coding killing the integration marketplace. That post was about the logic between apps, and why drag-and-drop connectors are done. This one is about the app you just generated for yourself, and why it needs a defined process around it to survive contact with a real team. It’s the same question about AI and the future of work, seen from the builder’s side of the desk.

Why a generated tool drifts on its own

A process answers four questions that a generated tool, on its own, cannot: when does this run, who owns what comes out, what happens to the output next, and what counts as done. Skip those and the tool doesn’t fail loudly. It does something worse. It keeps going, confidently, on whatever assumptions got baked into it the day you described it.

That’s the real shape of the problem. People expect AI tools to crash when they’re wrong. Turns out, they mostly don’t. They answer. They answer from whatever slice of the world they can see, and they present that answer with the same calm confidence whether they saw everything or almost nothing.

Picture a small tool you’d actually vibe-code: something that watches a shared inbox and flags the urgent messages for your ops team. The demo is flawless.

Then the real questions show up. When does it run, every minute or once an hour, and who decided? When it flags something, who owns acting on it, or does the flag just sit there glowing? What happens to a message it flags wrong, and who notices? And what does done mean here, an email marked read or an email actually handled? The generated code answered none of that, because you never asked, and it can’t ask on its own. So it runs on the defaults it quietly inferred, and those defaults were never the point.
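One way to make those four questions concrete is to write them down as data before generating anything. The sketch below is a minimal illustration, not a prescribed format: `StepContract`, `unanswered`, and the example values ("every 15 minutes", "ops-on-call") are all hypothetical names invented here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StepContract:
    """The four answers a process supplies for one generated tool (hypothetical shape)."""
    schedule: Optional[str] = None                      # when does this run?
    owner: Optional[str] = None                         # who owns what comes out?
    next_step: Optional[str] = None                     # what happens to the output next?
    is_done: Optional[Callable[[dict], bool]] = None    # what counts as done?


def unanswered(contract: StepContract) -> list[str]:
    """Return the names of the questions nobody has answered yet."""
    return [name for name, value in vars(contract).items() if value is None]


# The inbox flagger as generated: nothing was decided, so it runs on defaults.
bare_tool = StepContract()

# The same tool once someone has answered the four questions (example values only).
inbox_flagger = StepContract(
    schedule="every 15 minutes",
    owner="ops-on-call",
    next_step="create a task for the owner",
    is_done=lambda message: message.get("status") == "handled",
)

print(unanswered(bare_tool))
print(unanswered(inbox_flagger))
```

Note that `is_done` here treats “handled” as done and “read” as not done. That is exactly the kind of decision the generated code cannot make for you.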

The logic getting cheap is real, and I’m not walking it back. What got cheap was writing the code. What stayed expensive is the part that decides whether the code helps or quietly wrecks a Tuesday. A workflow platform owns that second part. That’s the whole trade in mega trend two, stated without the slogan: generation is now the easy half, and the context that keeps generation dependable is the half a process has to carry.

[Figure: A vibe-coded tool needs context, credentials, and bounded edges before it is safe to depend on in a real business.]

What breaks when nobody defined done?

Let me give you the failure shape that taught me this, with the specifics filed off because the specifics don’t matter and the pattern does.

A tool was asked a simple counting question. It answered with a small, confident number. The number was wrong, badly wrong, because the tool answered from the slice of data it could reach in one pass instead of the whole set, and nothing in the setup ever defined that “all of it” meant all of it. No step said: gather everything first, then count. No owner reviewed the count against reality before it traveled. The answer looked done, so it was treated as done, and a decision got made on a number that was off by a wide margin.

The model didn’t malfunction. It did exactly what it was told, which was nothing in particular, because no process had ever pinned down what a finished answer required. That’s what “no definition of done” costs you. Not a crash. A wrong answer wearing a confident face.
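The failure shape is easy to reproduce in a few lines. This is a sketch under assumptions, not the actual tool from the story: the paginated `fetch_page` source, the page size of 100, and the 250 records are all made up for illustration. The first function counts whatever one pass returned. The second gathers everything first, then counts, and can check the result against a total someone owns.

```python
from typing import Callable, Optional

# A page is (records, next_cursor); next_cursor is None on the last page.
Page = tuple[list[dict], Optional[int]]

# Hypothetical data source: 250 records served 100 at a time.
DATA = [{"id": i} for i in range(250)]


def fetch_page(cursor: Optional[int]) -> Page:
    start = cursor or 0
    end = start + 100
    next_cursor = end if end < len(DATA) else None
    return DATA[start:end], next_cursor


def count_first_page(fetch: Callable[[Optional[int]], Page]) -> int:
    """The failure shape: a confident count from the slice one pass could reach."""
    records, _ = fetch(None)
    return len(records)


def count_all(fetch: Callable[[Optional[int]], Page],
              expected_total: Optional[int] = None) -> int:
    """Gather everything first, then count, then check against a known total."""
    records: list[dict] = []
    cursor: Optional[int] = None
    while True:
        page, cursor = fetch(cursor)
        records.extend(page)
        if cursor is None:
            break
    if expected_total is not None and len(records) != expected_total:
        # Fail loudly instead of letting a wrong number travel.
        raise ValueError(f"counted {len(records)}, expected {expected_total}")
    return len(records)


print(count_first_page(fetch_page))  # the slice
print(count_all(fetch_page))         # the whole set
```

Neither function crashes on its own. The only difference is that one of them encodes what “all of it” means, and optionally who gets to say the count is plausible.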

This is also why a single clever prompt never fixes it. A prompt shapes one call. A business runs on the next call, and the one after that, each handed to a different person or tool on a different day. The thing that has to stay constant across all of them is the definition of the work, and that doesn’t live in a prompt. It lives in a process.

A workflow is the substrate a prompt can’t be

When an a16z partner argued that the whole “we will vibe-code everything” story is oversold, the sharpest line in the Hacker News thread that chewed on it came from a commenter pushing back on the hype: AI “makes it easier to create something, but that thing is not enterprise software with support contracts and conformance to mandatory regulations and 4 hour bug turnarounds and real people on the end of the phone who understand how it works.” That’s the gap in one sentence. Generation gives you a thing. A business needs a thing with a process around it.

Notice what’s in that list: support contracts, regulations, bug turnarounds, real people who understand the system. Every one is a process concern, not a code concern. Regulated teams feel it first, because an auditor will ask who approved an output and where the record is, and “the AI did it” isn’t an answer that survives that meeting. But it’s not only them. Any team that leans on a tool eventually needs to know why it did what it did, and a bare generated script keeps no minutes.

That process is what a workflow platform supplies. The workflow owns the sequence, so the tool runs at the right point and not at random. It owns the live state, so there’s one place that knows where the work is. It owns assignment, so every output has a name attached and a wrong answer becomes someone’s task instead of nobody’s mystery. And it owns the record, so when you ask “did this actually work,” there’s an answer that isn’t a shrug. The generated tool stops being the whole system and becomes one well-defined step inside a system that was already keeping score.
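Stripped to its bones, that division of labor can be sketched in code. This is a toy illustration under assumptions, not how any particular workflow platform is implemented: `Step`, `Workflow`, `flag_urgent`, `queue_for_review`, and the owner names are hypothetical. The point is only that sequence, state, assignment, and record live outside the generated function.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class Step:
    name: str                      # position in the sequence comes from list order
    owner: str                     # every output has a name attached
    run: Callable[[dict], dict]    # the generated tool, or any other step


@dataclass
class Workflow:
    steps: list[Step]
    state: str = "not started"                        # one place that knows where the work is
    record: list[dict] = field(default_factory=list)  # what ran, who owned it, when

    def execute(self, payload: dict) -> dict:
        for step in self.steps:
            self.state = f"running: {step.name}"
            payload = step.run(payload)
            self.record.append({
                "step": step.name,
                "owner": step.owner,
                "finished_at": datetime.now(timezone.utc).isoformat(),
            })
        self.state = "done"
        return payload


def flag_urgent(payload: dict) -> dict:
    """Stand-in for the vibe-coded tool: flag messages that mention 'urgent'."""
    urgent = [m for m in payload["messages"] if "urgent" in m.lower()]
    return {**payload, "urgent": urgent}


def queue_for_review(payload: dict) -> dict:
    """Stand-in for the human step that acts on the flags."""
    return {**payload, "needs_review": len(payload["urgent"])}


workflow = Workflow(steps=[
    Step("flag urgent", owner="inbox-bot", run=flag_urgent),
    Step("review flags", owner="ops-on-call", run=queue_for_review),
])
result = workflow.execute({"messages": ["URGENT: server down", "Lunch on Friday?"]})
print(workflow.state, workflow.record)
```

`flag_urgent` knows nothing about owners or records, and it doesn’t need to. Swap in a better generated tool tomorrow and the sequence, the assignment, and the record stay where they were.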

I build workflow software for a living, so take the bias as disclosed. But the pattern I keep running into isn’t bad generated code. It’s good generated code with nothing around it. The teams that get durable value out of these tools aren’t the ones with the best prompts. They’re the ones who wrote down, before generating anything, what the tool is supposed to accomplish and how they’ll know it did.

My own blunt version of this is that technology is only a small slice of where AI value comes from, maybe a fifth of it, for what it’s worth. The rest is the process and the people. Vibe coding made the small slice nearly free. It didn’t touch the other four fifths.

There’s a worse version of this that I run into more than the tidy one. Most teams reaching for a vibe-coded tool don’t have a defined process for it to slot into. It’s greenfield: email, a few spreadsheets, and a shared understanding that lives in three people’s heads. Drop a generated tool into that mess and you haven’t added structure, you’ve automated the absence of it. The tool runs fast, and so does the chaos. The process has to exist before the tool is worth building, which is the unglamorous order nobody wants to hear.

Where vibe coding still wins

I want to be straight about when none of this applies, because a post that only sells scaffolding is its own kind of dishonest.

If you’re writing a script you’ll run once and throw away, vibe-code it and move on. A personal hack that touches no one else’s work and no real data needs none of this. A weekend project, a quick reformat, a thing you’d have done by hand in a spreadsheet anyway: ship it from a sentence and don’t give the process a second thought. The cost of scaffolding only pays for itself when other people depend on the output, when the input keeps changing, and when a quietly wrong answer would actually hurt. That’s the line. Below it, the bare generated tool is the right call. Above it, the bare tool is a liability with a friendly interface, and the maintenance bill shows up on a delay you didn’t budget for.

So before you generate the next internal tool, ask the boring questions first. When does this run? Who owns what comes out? What happens to it next? What does done mean?

If you can answer those, the generated code is the easy last step. If you can’t, no amount of prompting will save you, because the thing you’re missing was never code. Write the process down, then let AI build the steps inside it. If you want a place to put that process so it actually holds, start free and document one before you automate it.
