Amit Kothari CEO of Tallyfy · Workflow AI Expert

Healthcare AI workflows - small errors, big consequences

In brief

Insurers on HealthCare.gov denied 19% of in-network claims in 2024, per KFF, and the founders of a chart-auditing startup say a mistyped medication time can trigger an automatic denial. Healthcare is the clearest case for AI inside workflows: suggest, flag, pre-fill - and a clinician signs every entry.

Summary

  • Claim denial is routine, and paperwork feeds it - KFF found HealthCare.gov insurers denied 19% of in-network claims in 2024, ranging from 3% to 36% by insurer. Fewer than 1% of denials were appealed, and 66% of those appeals lost.
  • The errors are small and the consequences are not - the founders of WorkDone, a YC-backed startup auditing medical charts, told Hacker News that under claims rules “a minor error can trigger an automatic denial,” their examples being a mistyped medication time and a missing discharge note.
  • Where does AI belong in a chart workflow? Reading charts, flagging gaps, pre-filling standard sections - and stopping at the signature. A clinician signs every entry, because a 2014 HHS OIG report already warned about EHR features that “mask true authorship.”
  • The sign-off is a workflow step, not a policy memo - see how Tallyfy structures approval gates.

Insurers selling plans on HealthCare.gov denied 19% of in-network claims in 2024. That figure comes from KFF’s analysis of federal transparency data on ACA marketplace plans, and it gets worse the closer you look: denial rates ran from 3% at the gentlest insurer to 36% at the harshest, fewer than 1% of denied claims were ever appealed, and when patients did appeal, insurers upheld their own denial 66% of the time.

Upstream of a fair share of those denials sits a piece of paperwork with a small mistake in it.

That’s the setting for one of the more instructive AI launches I’ve read, and for this post. Healthcare is where the stakes of AI inside regulated work stop being abstract: the documentation is dense, the rules are unforgiving, and a wrong entry doesn’t just sit there - it cascades into a denied claim, an appeal a clinic doesn’t have the bandwidth to file, or worse, a treatment decision made on bad information. If you want to understand why AI belongs inside defined workflows with human sign-off, instead of roaming free, watch what happens when software touches a medical chart.

Why do small chart errors turn into big bills?

Because the system that prices the error is automated and literal, and it sits on the payer’s side.

In May 2025, the founders of WorkDone, a Y Combinator X25 company - Dmitry, Sergey, and Alex - launched on Hacker News with an AI product that audits medical documentation in real time. Their problem statement is the cleanest version of the stakes I’ve seen: “Sometimes it’s just a mistyped medication time or a missing discharge note - basic stuff - but when you’re dealing with claims and regulatory rules, a minor error can trigger an automatic denial.” By the time an overworked compliance team finds the slip, they wrote, “it’s usually too late to just fix it.” This wasn’t market research for them, either - the launch post mentions that Dmitry’s family member “faced grave consequences from a misread lab result.” The paperwork problem and the patient-safety problem are the same problem wearing different price tags.

A commenter named abelanger asked the question an operations person would ask - how often does a denied claim actually trace back to a documentation error, versus an insurer that just denies things. The reply from the founders’ account, digitaltzar, was a blunt tally: “the average is around 25%,” with 250 million claims a year denied over documentation mistakes by their count, and rehab facilities “where this ratio is above 50%.” Treat those as a vendor’s own numbers - WorkDone sells the fix, after all - but notice they don’t need to be exact to make the point, because even on KFF’s independent figures, which count all denials whatever the cause, roughly one claim in five bounces. Of the denials insurers did categorize, 13% were for excluded services, 9% for missing prior authorization or referral, and only 5% on medical necessity - and KFF notes the public data cannot even link a denial reason to the service denied. Documentation problems hide in exactly that fog.

The appeal math seals it. KFF counted about 85 million denied in-network claims in 2024; consumers appealed at least 262,982 of them, under 1%, and insurers upheld their own decision 66% of the time. An appeal is slow and uncertain, and the labor lands in nobody’s budget. Prevention is a checklist running before submission. Once you see those two options side by side, the operational conclusion falls out on its own: the cheapest place to fix a documentation error is upstream, in the seconds after it’s made, while the person who made it is still looking at the record.
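The appeal arithmetic is easy to check. Here is a minimal sketch using the rounded figures quoted above; the variable names are mine, and treating every appeal that was not upheld as overturned is my assumption, not something KFF's numbers state directly.

```python
# Figures as quoted from KFF above; 85 million is a rounded count.
denied_claims = 85_000_000
appeals_filed = 262_982   # "at least" this many, so a lower bound
upheld_share = 0.66       # share of appeals where the insurer upheld its denial

# Share of denied claims that were ever appealed.
appeal_rate = appeals_filed / denied_claims

# Assumption: any appeal that was not upheld counts as overturned.
overturned = appeals_filed * (1 - upheld_share)
overturned_share_of_denials = overturned / denied_claims

print(f"appeal rate: {appeal_rate:.2%}")
print(f"denials reversed on appeal: {overturned_share_of_denials:.2%}")
```

On these inputs the appeal rate comes out at roughly 0.3%, and only about 0.1% of all denials end up reversed through an appeal, which is the whole case for catching the error before submission.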

Mind you, none of this is new. A 2014 HHS OIG report flagged that EHR features like copy-paste “may be used to mask true authorship of the medical record,” and recommended CMS contractors lean on audit logs precisely because “audit log data distinguish EHRs from paper medical records.” A decade later, charts are still assembled under time pressure by people juggling patients, and the copy-paste habits the OIG worried about are still how a Tuesday note becomes a Thursday denial.

The chain is what matters for anyone running operations: a chart entry feeds a claim, and the claim meets an automated rule that doesn’t ask what you meant.

What WorkDone got right about the design

The product itself is a set of AI agents wired into a clinic’s EHR system, watching records as clinicians work. And the launch post is worth reading less for the product than for the design decisions around it, because the founders made three calls that generalize to any AI touching consequential records.

First, the AI suggests rather than acts. When it spots trouble - “like a missing signature or a suspicious timestamp” - it asks the responsible staff member “to double-check and correct it on the spot.” Second, the human stays the author: for a genuine mistake, “we request correction approval from the provider,” and the system stores “an audit trail for compliance.” Third, they scoped the pilot deliberately: “we are starting with read-only mode,” retrieving data without writing any. Their answer to the inevitable hallucination question follows from the design: since the tool flags possible errors and “its primary effect is to get extra human review,” a wrong flag costs staff a few minutes rather than corrupting a chart.

Funnily enough, the founders’ biggest worry wasn’t the AI doing damage. It was false positives wasting clinicians’ time - the one failure mode their design left open.

That worry is the correct one, and it tells you the design worked.

When the worst case is “a busy nurse dismisses a wrong flag,” you’ve built something a hospital can pilot without a committee losing sleep. The read-only start matters more than it looks, too: an AI that retrieves and inspects but cannot write is an AI whose entire risk story fits in one sentence, which is roughly the length a clinical-governance conversation has patience for. Earn trust at that scope first. Write access, if it ever comes, then arrives one named step at a time instead of as a leap of faith.

AI drafts, flags, and pre-fills chart fields, then a clinician signs every entry before the claim goes out

Generalize the pattern and you get three verbs that define safe AI work on regulated records: suggest, flag, pre-fill. The model drafts the discharge summary’s standard sections. It flags the medication time that contradicts the administration log. It pre-fills the fields a payer’s rules require. What the model never does is commit. Every committed entry carries a clinician’s signature, which means every committed entry passed through a moment where a qualified human looked at it and owned it.

A question we hear from clinic operations teams, in some form, every time AI comes up: can’t the model just file the routine stuff itself and save everyone the click? For low-stakes work, maybe. For a medical chart, the click is the control. Remove it and the audit trail records that software edited a record and nobody looked - which is the exact pattern the OIG was warning about back when the software was merely copy-paste.

Keep the clinician’s name on every entry

The deeper reason auto-commit fails in healthcare has nothing to do with model quality. Authorship carries structural weight here in a way that’s easy to miss from outside the industry: a chart is a legal record, a billing instrument, and a clinical handoff all at once, and the signature is what binds those three roles to a person with a license. An entry without a real author saves a click today and then fails the one question that matters when something goes wrong - who decided this?

That’s why the workflow shape, not the model benchmark, is the thing to get right. In Tallyfy terms, the AI’s contribution lives inside a step, and the step that follows is a blocking approval step with a named owner - not a notification they can ignore, a gate that holds the process until someone with the right role signs. The run history then tracks every step as it happens, so the trail the OIG wanted from EHRs exists for the workflow around the EHR too: which fields the AI pre-filled, who reviewed, what changed, when the claim left.
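To make the idea of a blocking step concrete, here is a small sketch of a gate that holds a claim until the named owner with the right role signs, and logs every attempt. This is not Tallyfy's API; `ApprovalGate`, `GateBlocked`, and the role and user names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


class GateBlocked(Exception):
    """Raised when a claim tries to pass the gate without a valid sign-off."""


@dataclass
class ApprovalGate:
    owner: str            # the named person responsible for this step
    required_role: str    # illustrative role name, e.g. "clinician"
    history: list = field(default_factory=list)  # append-only run history

    def _log(self, event: str, actor: str) -> None:
        # Record who did what, and when, for every attempt.
        self.history.append((datetime.now(timezone.utc), actor, event))

    def release(self, claim_id: str, signer: Optional[str],
                signer_role: Optional[str]) -> str:
        """Return the claim id only if the named owner, in the right role, signed."""
        if signer != self.owner or signer_role != self.required_role:
            self._log(f"blocked {claim_id}", signer or "nobody")
            raise GateBlocked(f"{claim_id} needs sign-off from {self.owner}")
        self._log(f"released {claim_id}", signer)
        return claim_id
```

The design choice worth noticing is that a blocked attempt is logged too. The history then shows not just who signed, but every time something tried to move without a signature.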

The biggest lesson we keep relearning at Tallyfy is that a gate is only as real as the process that enforces it. Write “clinician reviews AI suggestions” in a policy document and you have a sentence. Make it a workflow step that the claim literally cannot move past without a signature, and you have a control. The difference shows up the first time someone is busy, which in a clinic is always.

Policy is a hope. A blocking step is a fact.

Worth being precise about what this is and isn’t. A workflow platform doesn’t make an AI deployment HIPAA-compliant, and I’m not going to pretend otherwise - compliance hangs on agreements, access controls, training, and a dozen things beyond any one tool. What the workflow contributes is narrower and checkable: a defined sequence, a named reviewer on every consequential step, and a record that accumulates because the work ran through it. Those are the parts an auditor or a payer can actually verify. Here’s the kind of process where that structure pays off first:

Procedure Example: Medical Insurance Billing and Claims Processing

  1. Patient check-in and demographics verification
  2. Insurance Eligibility and Verification
  3. Medical Coding of Diagnosis, Procedures and Modifiers
  4. Charge Entry
  5. Claims submission via clearinghouse
  +4 more steps

Notice the template’s shape: intake, verification, coding, a review gate, submission, then denial follow-up as its own track. Now walk a claim through it with AI in the right seats. At intake, a model checks the patient demographics and insurance details against what the payer’s rules will demand, and flags the gap while the front desk still has the patient’s card in hand. At coding support, it drafts codes from the visit documentation and marks the ones it’s least sure about. The completeness check compares the chart against the claim and catches the missing discharge note that would have triggered an automatic denial three weeks later. Then the gate: a person whose name is on the step reviews the flagged items and signs, and only that signature releases the submission. Every flag the model raised, every correction the reviewer made, and the signature itself land in the run’s history without anyone writing a memo about it.
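The completeness check in that walk-through is the simplest piece to sketch. The version below is an illustration under stated assumptions: the `REQUIRED_DOCUMENTS` checklist and the chart's dictionary shape are hypothetical, and real payer requirements vary by payer and claim type. The check only reads the chart and returns flags for a human to review.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical payer checklist; real requirements differ per payer.
REQUIRED_DOCUMENTS = ("discharge_note", "referral", "medication_log")


@dataclass(frozen=True)
class Flag:
    item: str
    reason: str


def completeness_flags(chart: dict) -> List[Flag]:
    """Read-only check: report missing or empty items, never edit the chart."""
    flags = []
    for name in REQUIRED_DOCUMENTS:
        if not chart.get(name):
            flags.append(Flag(name, "missing or empty"))
    return flags


# Example: a chart with no discharge note attached.
chart = {"referral": "ref-123", "medication_log": ["08:00 dose given"]}
for flag in completeness_flags(chart):
    print(f"review needed: {flag.item} ({flag.reason})")
```

The function returns flags instead of fixing anything, so the reviewer at the gate decides what each one means before signing.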

You’ve changed the economics of the process without changing its accountability. The reviewer still signs. The payer still gets a clean claim. What changes is what reaches the reviewer: a pre-checked draft instead of a blank form and a deadline.

Where should a healthcare ops team start?

Skip the moonshot. Start with the claims and documentation workflows you already run, and ask one question of each step: is this step reading, checking, or committing?

Reading and checking steps are AI candidates today. A model reading a discharge packet against a payer’s checklist before submission does the work nobody has time for at the moment it’s cheapest to fix. Remember that 9% of categorized denials were a missing prior authorization or referral - paperwork sequencing, in other words. A model that checks whether the referral is attached before the claim leaves is about as humble as AI gets, a tireless intern with a checklist, and that’s exactly what this job calls for. We’ve written about how healthcare process management lives or dies on handoffs, and about what happened when a telehealth team rebuilt its patient workflows around defined steps - the AI version is the same conversation with higher stakes for getting the handoffs right. The committing steps stay human, each with a name attached.
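The reading, checking, or committing question can be written down as a tiny triage table. This is a sketch, not a prescription: the step names are examples drawn from this post, and the labels on them are my own reading.

```python
from enum import Enum


class StepKind(Enum):
    READING = "reading"
    CHECKING = "checking"
    COMMITTING = "committing"


def assignee(kind: StepKind) -> str:
    """Reading and checking can go to a model; committing stays with a named person."""
    return "named human" if kind is StepKind.COMMITTING else "AI candidate"


# Illustrative labels for steps mentioned in this post.
steps = {
    "read discharge packet against payer checklist": StepKind.READING,
    "check referral is attached": StepKind.CHECKING,
    "sign chart entry": StepKind.COMMITTING,
    "release claim for submission": StepKind.COMMITTING,
}

for name, kind in steps.items():
    print(f"{name}: {kind.value} -> {assignee(kind)}")
```

Labeling each step is the real work; once the labels exist, deciding who does what is a one-line rule.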

Then let the structure do the boring work. A claims process with an AI completeness check still gets denials sometimes - 3% happens even at the careful end of KFF’s range - so the workflow needs a denial branch with its own deadlines and owners instead of a pile of follow-ups living in somebody’s inbox. That’s a bit unglamorous, the kind of quiet process improvement nobody demos on stage. It’s also where the money is: an appeal filed inside the payer’s window, by a process that tracks the window, beats a brilliant model attached to no process at all.
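A denial branch that tracks the payer's window might look something like this. It is a sketch with assumed inputs: `DenialTask`, the 30-day window, and the 14-day warning threshold are hypothetical, and actual appeal windows depend on the payer and plan.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class DenialTask:
    claim_id: str
    owner: str                # every denial follow-up has a named owner
    denied_on: date
    appeal_window_days: int   # assumption: supplied per payer, not a fixed rule

    @property
    def appeal_deadline(self) -> date:
        return self.denied_on + timedelta(days=self.appeal_window_days)

    def days_left(self, today: date) -> int:
        return (self.appeal_deadline - today).days


def closing_soon(tasks: List[DenialTask], today: date,
                 warn_days: int = 14) -> List[DenialTask]:
    """Tasks whose appeal window closes within warn_days, soonest deadline first."""
    urgent = [t for t in tasks if t.days_left(today) <= warn_days]
    return sorted(urgent, key=lambda t: t.appeal_deadline)


tasks = [
    DenialTask("claim-7", "billing_lead", date(2025, 1, 1), 30),
    DenialTask("claim-9", "billing_lead", date(2025, 1, 15), 30),
]
for task in closing_soon(tasks, today=date(2025, 1, 20)):
    print(task.claim_id, "due", task.appeal_deadline.isoformat())
```

The point of the structure is that the deadline belongs to the task and the task belongs to a person, so nothing depends on someone remembering to check an inbox.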

One more plain note: the hard part of this isn’t software, ours or anyone’s. It’s getting a clinic’s actual sequence of work written down clearly enough that a step can be handed to a model in the first place - the same definition work that makes or breaks AI-built workflows everywhere else.

Healthcare just raises the price of skipping it.

A mistyped medication time is a small error. A process that lets it travel unreviewed from a busy clinician’s keyboard to an automated payer rule is the big one. Fix the second and the first stops costing you appeals.

The chart stays human. The checking gets help.

Get that division of labor right and the rest follows.

About the author

Amit is the CEO of Tallyfy. He has 25+ years of practical experience in technology, entrepreneurship, and operational efficiency. He's been hands-on with AI-first engineering and changing Tallyfy to AI-native workflow automation since Claude Code was first released. He's also an Entrepreneur in Residence at WashU's Skandalaris Center, created the OneDay (Woolf) AI curriculum for their accredited MBA and consults with clients who need help with AI via Blue Sheen. He graduated with a Computer Science degree from the University of Bath. He's originally British and lives in St. Louis, MO.
