Amit Kothari CEO of Tallyfy · Workflow AI Expert

AI in pharma - what GxP requires of your workflows

In brief

GxP rules demand that every record touching product quality or patient safety be ALCOA+ - attributable, legible, contemporaneous, original, accurate, plus complete, consistent, enduring, available. Most AI agents do not meet that bar by default. The fix maps cleanly onto workflow execution, with a qualified human signing every consequential step.

Summary

  • GxP has a data-integrity bar, and it has a name: ALCOA+ - per the UK MHRA’s 2018 guidance, that’s Attributable, Legible, Contemporaneous, Original, Accurate, plus Complete, Consistent, Enduring, Available. Every record affecting product quality or patient safety has to clear it.
  • Does a typical AI agent meet ALCOA+ out of the box? No. It rarely records which model and version acted, when, on whose authority, or whether the original entry was preserved rather than overwritten - the exact attributes the standard demands.
  • The mapping to workflow execution is almost one-to-one - attributable becomes an assignee per step, contemporaneous becomes a timestamp per transition, original becomes no destructive editing. EU Annex 11 says replacing a manual operation must bring “no increase in the overall risk of the process.”
  • AI belongs inside a validated step, with a human signature on every consequential one - see how Tallyfy structures approval gates and audit trails

Before you deploy an AI agent in a pharma operation, it has to answer a question the industry settled long before language models existed: can every action it takes be attributed, timestamped, preserved, and reconstructed? That standard has a name, ALCOA+, and it isn’t a guideline you can argue with. It’s the bar inspectors actually check against, and most AI agents fail it by default, not because they’re inaccurate, but because they don’t record themselves the way GxP demands.

That’s the catch nobody mentions in the demo. An agent that drafts a batch record or flags a deviation can be genuinely useful and still be uninspectable, because usefulness and data integrity are different properties. One is about whether the output is good. The other is about whether you can prove, months later, who did what, when, and whether anyone changed it since. Pharma learned to keep those two ideas separate a long time ago, which is exactly why the rules read the way they do.

This is why AI in regulated industries like pharma lives or dies on workflow design rather than model quality. The good news is that ALCOA+ maps onto workflow execution almost mechanically. The work is putting the AI step in the right place inside that workflow, and keeping a qualified human on every step that matters.

What does ALCOA+ actually require?

ALCOA+ is the data-integrity standard that runs through every GxP discipline - good manufacturing, laboratory, clinical, and distribution practice alike. The UK’s MHRA, in its 2018 data integrity guidance, spells out the acronym plainly: ALCOA means data must be “attributable to the person generating the data,” legible and permanent, contemporaneous, an “original record (or certified true copy),” and accurate. The “+” adds four more: complete (“the data must be whole; a complete set”), consistent (“self-consistent”), enduring (“durable; lasting throughout the data lifecycle”), and available (“readily available for review or inspection purposes”). Nine properties, and a record has to satisfy all of them to count.

Read that list as an operations person and something jumps out.

Most of these aren’t about the content of a record at all. They’re about its provenance and its lifecycle: who made it, when, on what authority, whether it’s the original or a faithful copy, whether it survived intact, whether you can find it on demand. A perfectly accurate entry that nobody can attribute to a named person, at a known time, fails ALCOA+ just as hard as a wrong one. An inspector who can’t tell who entered a result, or whether someone edited it after the fact, treats the record as unreliable no matter how correct the number turns out to be. Integrity in the GxP sense is a claim about a record’s whole history rather than only its contents, and that history is precisely what a model bolted onto a process tends to leave out. That’s the part that trips up AI deployments.

Mind you, none of this is exotic. It’s the same instinct behind a signed, dated lab notebook, written into regulation because paper notebooks were getting replaced by systems that made editing invisible. The MHRA guidance even defines data integrity itself as the degree to which data are “complete, consistent, accurate, trustworthy, reliable” across the whole lifecycle. ALCOA+ is just the checklist version of that sentence.

Map each ALCOA+ letter to a workflow step

Here’s the part that makes this tractable: a well-built workflow produces most of ALCOA+ as a side effect of running, the same way it does in any process you’ve written down properly. You don’t bolt the attributes on at the end. The execution generates them. Each step records who acted and when, each transition is captured as it happens rather than typed in later, and the history accumulates on its own without anyone maintaining a separate log by hand. Walk the mapping letter by letter and it’s almost one-to-one, which is why pharma AI gets decided at the workflow layer, long before anyone benchmarks a model. Swap the model out next quarter and the integrity properties still hold, because they were never the model’s job to begin with. They were the process’s.

Figure: a GxP AI step in a workflow yields attributable, timestamped, original records; skip the gate and ALCOA+ fails.

Attributable becomes an assignee on every step, so each action carries the name of the person or system responsible. Contemporaneous becomes a timestamp on every transition, captured when the step happens rather than typed in afterward. Original becomes append-only history, where edits add a new versioned entry instead of overwriting the old one. Accurate maps to validation rules on the fields a step captures. Complete maps to a process that won’t let you skip a required step. The “+” attributes follow from the same record: consistent, enduring, and available are properties of a single trustworthy run history rather than a pile of spreadsheets and emails.

That mapping is the whole argument for running GxP AI inside a workflow instead of beside one.
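To make the letter-by-letter mapping concrete, here is a minimal Python sketch of an append-only run history. The names (`Entry`, `RunHistory`, `missing_steps`) are hypothetical and chosen for illustration; this is not Tallyfy’s data model or API, just one way the attributes could fall out of execution.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Entry:
    """One immutable history entry for a workflow step (hypothetical shape)."""
    step: str
    actor: str             # attributable: a named person or identified system
    value: str
    version: int           # original: every edit becomes a new version
    recorded_at: datetime  # contemporaneous: stamped when the entry is made


class RunHistory:
    """Append-only history for a single workflow run."""

    def __init__(self, required_steps):
        self.required_steps = list(required_steps)
        self._entries = []

    def record(self, step, actor, value):
        if not actor:
            raise ValueError("every entry needs a named actor")
        version = 1 + sum(1 for e in self._entries if e.step == step)
        entry = Entry(step, actor, value, version, datetime.now(timezone.utc))
        self._entries.append(entry)  # append only; earlier versions are kept
        return entry

    def versions(self, step):
        """All versions recorded for a step, oldest first."""
        return [e for e in self._entries if e.step == step]

    def missing_steps(self):
        """Complete: required steps that have no entry yet."""
        done = {e.step for e in self._entries}
        return [s for s in self.required_steps if s not in done]
```

Recording the same step twice leaves both versions in the history, and `missing_steps()` is the kind of check a process could run before it lets a batch move on.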

Where AI fits is the checking and drafting parts of those steps, well short of the signing. A model can pre-fill a batch record’s standard sections, compare a result against a specification, or flag a deviation for review. The step still belongs to a named human who confirms it. The kind of structured, risk-aware step this calls for is exactly what a template like this one is built to hold:

Procedure example: AI Risk Assessment and Mitigation Procedure

  1. Inventory all AI systems in use
  2. Classify risk level per system
  3. Assess data privacy exposure
  4. Evaluate model bias potential
  5. Review vendor security practices
  (+5 more steps)

Notice the template’s shape: defined inputs, a risk assessment, mitigation steps, and a review gate. Drop an AI step into the assessment and drafting parts, keep the review gate human, and the run history records the rest. The model proposes. The qualified person disposes, with their name on it.

An AI step still needs a human signature

The reason auto-commit fails in GxP has nothing to do with how good the model is. It’s that several ALCOA+ attributes are about human accountability that a model can’t supply. Attributable means a person, or a clearly identified system acting under a person’s authority, owns the action. EU Annex 11 puts the principle bluntly: where a computerised system replaces a manual operation, there should be “no resultant decrease in product quality, process control or quality assurance,” and, the line that matters most for AI, “no increase in the overall risk of the process.” An AI step that commits a batch decision on its own, with no qualified human in the loop, increases that risk. So it fails the principle before you even argue about accuracy.

We had the emphasis backwards at first, too. The early instinct is to ask how accurate the model needs to be before you trust it with a step. That’s the wrong first question.

The right first question is where the signature goes.

EU Annex 11 describes electronic signatures that “have the same impact as hand-written signatures,” are “permanently linked to their respective record,” and carry “the time and date that they were applied.” It also asks, for critical data entered manually, for “an additional check on the accuracy of the data” by “a second operator or by validated electronic means.” Read those two requirements together and you have a job description for AI in a GxP workflow. The model can be the validated electronic check that reads the entry and flags problems. The human provides the signature that carries legal weight.

Take a deviation investigation. An out-of-spec result comes in, and the agent assembles the context a human would otherwise gather by hand: the batch record, the equipment logs, the prior deviations on that line. It drafts a root-cause hypothesis and a proposed corrective action, and none of that commits anything. A qualified investigator reads the draft, accepts or rewrites it, and signs, and that signature is what turns a hypothesis into a CAPA an agency will accept. The agent compressed the gathering. The human kept the accountability, which is the only part that was ever the bottleneck on purpose.

Annex 11 frames the whole thing as risk management rather than paperwork. It says risk management “should be applied throughout the lifecycle of the computerised system taking into account patient safety, data integrity and product quality.” An AI step is just a new lifecycle risk to assess: what happens when the model is wrong, who catches it, and what the record shows afterward. Put the human gate where the risk is highest, which is any step that commits, and you have answered the regulation’s actual question instead of arguing with it.

That division of labor is not a limitation to engineer around. It’s the design.

In Tallyfy terms, the AI’s contribution sits inside a step, and the step that follows is a blocking approval owned by a named person - a gate the batch can’t move past without a signature. The run history records each transition as it happens, which is the contemporaneous, attributable record ALCOA+ is asking for. The same shape shows up wherever AI-built workflows still need human review, and it’s the identical logic that governs AI inside healthcare workflows, where a clinician signs every chart entry for the same reason a QA lead signs every batch.
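A rough sketch of that blocking gate, in Python, might look like the following. `Draft` and `ApprovalGate` are hypothetical names, not Tallyfy objects, and the list of qualified signers is assumed to come from wherever your training and authorization records live.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Draft:
    """What the model proposed, with enough metadata to attribute it."""
    content: str
    model: str
    model_version: str


class ApprovalGate:
    """Blocks a commit until a qualified, named person signs the draft."""

    def __init__(self, draft, qualified_signers):
        self.draft = draft
        self.qualified_signers = frozenset(qualified_signers)
        self.signed_by = None
        self.signed_at = None

    def sign(self, signer):
        if signer not in self.qualified_signers:
            raise PermissionError(f"{signer!r} is not qualified to sign this step")
        if self.signed_by is not None:
            raise RuntimeError("step is already signed")
        self.signed_by = signer
        self.signed_at = datetime.now(timezone.utc)  # time the signature was applied

    def commit(self):
        if self.signed_by is None:
            raise RuntimeError("blocked: no qualified signature on this step")
        return {
            "content": self.draft.content,
            "drafted_by": f"{self.draft.model}@{self.draft.model_version}",
            "signed_by": self.signed_by,
            "signed_at": self.signed_at.isoformat(),
        }
```

The design choice worth noticing is that the model never calls `commit()` successfully on its own. The only path to a committed record runs through `sign()`, so the signature and its timestamp travel with the record rather than living in a separate log.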

Why don’t most AI deployments meet the bar?

Because the default way teams deploy a model strips out exactly the attributes ALCOA+ cares about. A chatbot bolted onto a process leaves no durable record of which model version answered, what it was asked, or what it changed. That’s a tough sell in a regulated environment, where the missing metadata is the whole point.

Run the failure modes against the checklist and the gaps are obvious. Attributable breaks when the record says “AI generated this” without naming the model, version, prompt, or the human who accepted it. Contemporaneous breaks when the log is reconstructed later from chat history rather than captured at the moment. Original breaks the instant a model regenerates an output and quietly replaces the prior one, with no versioned trail of what changed. Complete breaks when an agent skips a step it decided was unnecessary. None of those is a model-quality problem. Every one is a record-keeping problem, and guidance published in 2018 already named all of them.

The failure usually looks mundane. A team pilots a model that drafts batch-record entries, and staff accept the drafts with a click. The drafts read beautifully. But the system stores only the accepted text, with no model version, no prompt, and no sign that a reviewer clicked through in two seconds without really reading. A year and a half later an inspector asks who authored a specific entry and how it was verified, and the honest answer is that nobody can reconstruct it. The model was never the weak point. The vanished metadata was.

The pattern is consistent: capable model, broken provenance.
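One cheap way to catch this before an inspector does is to refuse to store an accepted draft that lacks its provenance. The sketch below assumes a stored record is a plain dictionary; the field names in `REQUIRED_PROVENANCE` are illustrative assumptions, not a regulatory list.

```python
# Hypothetical provenance fields for an AI-drafted entry.
REQUIRED_PROVENANCE = (
    "model",
    "model_version",
    "prompt",
    "accepted_by",
    "accepted_at",
)


def provenance_gaps(record):
    """Return the provenance fields that are missing or empty in a record."""
    return [field for field in REQUIRED_PROVENANCE if not record.get(field)]
```

A record that holds only the accepted text comes back with all five fields listed as gaps, which is exactly the state the pilot above ended up in.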

US 21 CFR Part 11 has required this for decades. It calls for “secure, computer-generated, time-stamped audit trails to independently record the date and time of operator entries and actions that create, modify, or delete electronic records,” and for validating systems to ensure “the ability to discern invalid or altered records.” An AI agent operating outside a system that produces those audit trails isn’t partially compliant. It’s invisible to the part of the regulation that matters most. That said, the fix isn’t a special AI feature. It’s running the agent inside a workflow that was already built to satisfy Part 11 for human actions.
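Part 11 does not prescribe a mechanism for discerning altered records. One common technique, shown here purely as an assumption about how a system might do it, is a hash-chained audit trail where each event carries the hash of the one before it. The function names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body):
    """Stable SHA-256 over an event body (keys sorted for determinism)."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_event(trail, actor, action, detail):
    """Append a time-stamped event that is chained to the previous one."""
    body = {
        "actor": actor,
        "action": action,  # e.g. "create", "modify", "delete"
        "detail": detail,
        "at": datetime.now(timezone.utc).isoformat(),
        "prev": trail[-1]["hash"] if trail else "",
    }
    trail.append({**body, "hash": _digest(body)})


def verify(trail):
    """Return True only if no event was altered, removed, or reordered."""
    prev = ""
    for event in trail:
        body = {k: v for k, v in event.items() if k != "hash"}
        if body["prev"] != prev or _digest(body) != event["hash"]:
            return False
        prev = event["hash"]
    return True
```

Editing any earlier event changes its hash and breaks the chain from that point on, so `verify()` fails. A production system would also need secure storage and access control, which this sketch leaves out.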

The regulation didn’t change for AI. AI just has to live inside what was already there.

Begin inside the validation boundary

Begin where you already have a validated process, not with a brand-new build. Take that process and split its steps into two kinds: the ones that produce or check data, and the ones that commit a decision. A model can take over producing and checking early, because a wrong output at one of those steps gets caught at the very next human step before it reaches a record that matters. Committing is the other kind: it carries a signature with legal and patient-safety weight, so it stays with a qualified person. That one split keeps the first deployment small, defensible, and inside the validation boundary you already maintain, instead of forcing you to open a new one. The instinct to hand over the exciting, decision-making steps first is the one to push back on.
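The split can be written down in a few lines. This Python sketch assumes each step of an existing process has already been labeled by a person as produce, check, or commit; the labels and function names are hypothetical.

```python
from enum import Enum


class StepKind(Enum):
    PRODUCE = "produce"
    CHECK = "check"
    COMMIT = "commit"


def assign_executors(steps):
    """Map each (name, kind) step to who runs it under the split."""
    plan = []
    for name, kind in steps:
        if kind is StepKind.COMMIT:
            plan.append((name, "qualified_human"))
        else:
            plan.append((name, "model_with_review"))
    return plan


def unguarded_steps(steps):
    """Names of model-eligible steps with no commit step after them."""
    unguarded = []
    for index, (name, kind) in enumerate(steps):
        if kind is StepKind.COMMIT:
            continue
        later_kinds = [k for _, k in steps[index + 1:]]
        if StepKind.COMMIT not in later_kinds:
            unguarded.append(name)
    return unguarded
```

`unguarded_steps` is the useful half: any model-eligible step it returns has no human signature downstream, which is a sign the process needs a gate added before the model goes anywhere near it.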

A misconception we keep meeting when pharma teams ask about AI is that compliance is the thing slowing the project down. It’s the opposite.

The ALCOA+ structure is what makes an AI step deployable at all.

Take a batch-record review. A model reads the completed record against the specification, flags the out-of-range result and the missing initials, and drafts the deviation summary. A qualified person reviews the flags and signs. Every flag the model raised, every correction made, and the signature itself land in the run history with a timestamp and a name, which is to say the step produced ALCOA+ data because the workflow was built to. The reviewer’s time goes to judgment instead of hunting for the gap a model can surface in a second.
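The checking half of that review does not even need a language model to illustrate. Here is a minimal sketch of the flagging logic, assuming each entry is a dictionary with `parameter`, `value`, and `initials` keys and the specification maps each parameter to a low and high limit. All names and limits are made up for the example.

```python
def review_batch_record(entries, spec):
    """Flag out-of-range values and missing initials; commits nothing."""
    flags = []
    for entry in entries:
        low, high = spec[entry["parameter"]]
        if not (low <= entry["value"] <= high):
            flags.append((entry["parameter"], "out_of_range"))
        if not entry.get("initials"):
            flags.append((entry["parameter"], "missing_initials"))
    return flags


# Hypothetical record and specification limits.
entries = [
    {"parameter": "pH", "value": 7.9, "initials": "JS"},
    {"parameter": "temp_c", "value": 22.0, "initials": ""},
]
spec = {"pH": (6.5, 7.5), "temp_c": (20.0, 25.0)}

flags = review_batch_record(entries, spec)
```

The function returns flags and nothing else. Turning those flags into a disposition is the qualified reviewer’s step, and their signature is what goes on the record.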

Or take a simpler entry point: a QC analyst reviewing chromatography data. The model checks the run against the acceptance criteria and flags the integration worth a second look. The analyst still makes the call and signs the result. You haven’t automated the judgment. You’ve automated the part where a tired human scrolls past the one anomaly that mattered, which is the failure ALCOA+ exists to catch.

Then let the structure do the unglamorous part, the kind of steady process improvement that never makes a conference keynote. Define the process. Validate it. Put the AI step where it reads and checks. Keep a qualified human on every step that commits. This is the same discipline a pharmaceutical distribution operation lives by, now with one step handled by a model, and the audit trail an inspector wants is there because the workflow captured it.

A model that drafts a flawless batch record but can’t be attributed, timestamped, or reconstructed is not an asset in pharma. It’s a finding waiting to happen.

Put it inside the workflow instead.

GxP didn’t make AI harder. It just wrote down, years ago, exactly what a trustworthy automated step has to prove.

About the author

Amit is the CEO of Tallyfy. He has 25+ years of practical experience in technology, entrepreneurship, and operational efficiency. He's been hands-on with AI-first engineering and changing Tallyfy to AI-native workflow automation since Claude Code was first released. He's also an Entrepreneur in Residence at WashU's Skandalaris Center, created the OneDay (Woolf) AI curriculum for their accredited MBA and consults with clients who need help with AI via Blue Sheen. He graduated with a Computer Science degree from the University of Bath. He's originally British and lives in St. Louis, MO.
