The Error Taxonomy: Classify First, Then Intervene
Module 5: Human Error, Failure Modes, and Recovery | Depth: Foundation | Target: ~2,500 words
Thesis: Human errors follow a taxonomy — slips, lapses, mistakes, violations — and each type has a different cause, a different fix, and a different system design implication.
The Operational Problem
A hospital pharmacy technician prepares the wrong dose of a high-alert medication. A patient receives 10 mg instead of 1 mg. The organization’s response: mandatory retraining on medication safety for all pharmacy staff, a new policy requiring double-checks, and a stern email from the chief medical officer about the importance of vigilance.
This response is almost certainly wrong. Not because retraining and double-checks are bad — but because the organization has not asked the only question that matters: what kind of error was this?
If the technician grabbed the wrong vial because the 1 mg and 10 mg vials are identically shaped and shelved adjacent to each other, that is a slip. Training will not fix it. Redesigning the shelf layout and making the two strengths visually distinct on the label will.
If the technician was interrupted during preparation, lost her place, and skipped the verification step, that is a lapse. Training will not fix it. A forcing function — a system that will not dispense until barcode verification is complete — will.
If the technician applied a pediatric dosing protocol to an adult patient because the patient’s chart was ambiguous, that is a rule-based mistake. Training might help, but better decision support at the point of prescribing — a weight-based dosing calculator embedded in the order entry system — addresses the mechanism.
If the technician skipped barcode scanning because the scanner has been broken intermittently for three weeks and workarounds have become routine, that is a violation. But it is not a disciplinary problem — it is a system design failure. The organization created the conditions under which violating the protocol became the only way to get work done.
Same outcome. Four different causes. Four different fixes. “Human error” as a diagnostic category is useless. It tells you that a person was involved in the failure. It tells you nothing about why the failure occurred or what to change. The taxonomy exists to make that distinction, and the distinction determines whether the intervention works or whether it is organizational theater.
Why Classification Matters: The Default Response Problem
Healthcare’s default response to error is training. When something goes wrong, the most common organizational reflex is to retrain the person who made the error, retrain the department, issue a new policy, or add a procedural step. This reflex is not random — it is a reasonable response to one specific error type (knowledge-based mistakes) that has been generalized to all error types regardless of mechanism.
The problem is arithmetic. Reason (1990) estimated that skill-based errors (slips and lapses) account for the largest share of human errors in complex systems, followed by rule-based mistakes, with knowledge-based mistakes — the only type that training directly addresses — constituting a relatively small proportion. The exact proportions vary by domain and study, but the pattern is consistent: the error type most responsive to training is the least common, while the most common error types require environmental and system-level interventions that training cannot provide.
When an organization defaults to training for every error, it is addressing the mechanism correctly roughly 10-20% of the time and missing the actual cause 80-90% of the time. The result: the same errors recur, the organization adds more training, and the cycle repeats. The error rate does not improve because the intervention does not match the error type. This is not a failure of effort. It is a failure of classification.
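To make the arithmetic concrete, here is a minimal sketch. The proportions are illustrative assumptions consistent with the pattern described above, not figures from a specific study:

```python
# Illustrative (hypothetical) distribution of error types, following the
# pattern Reason (1990) describes: skill-based errors dominate, and
# knowledge-based mistakes -- the only type generic training directly
# addresses -- are the least common. Proportions are assumptions for the
# arithmetic, not study data.
error_share = {
    "slips_and_lapses": 0.60,
    "rule_based_mistakes": 0.25,
    "knowledge_based_mistakes": 0.15,
}

# A training-only response matches the mechanism only for knowledge-based mistakes.
match_rate = error_share["knowledge_based_mistakes"]
print(f"mechanism matched: {match_rate:.0%}")        # 15%
print(f"actual cause missed: {1 - match_rate:.0%}")  # 85%
```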
Rasmussen’s Skills-Rules-Knowledge Framework
The taxonomic foundation comes from Jens Rasmussen’s (1983) Skills-Rules-Knowledge (SRK) framework, which classifies human behavior into three levels based on the cognitive resources required.
Skill-based behavior is automatic. It is the performance of practiced routines without conscious attention — the way an experienced nurse draws medication into a syringe, the way a pharmacist reaches for a familiar vial on a familiar shelf, the way a surgeon ties a knot. These behaviors are fast, efficient, and require minimal working memory. They are governed by stored motor programs and perceptual patterns, not by deliberate thought. The advantage is speed and cognitive economy. The vulnerability is that automatic behavior is difficult to monitor — errors occur when the routine executes in the wrong context, and the person may not notice because conscious attention was directed elsewhere.
Rule-based behavior operates on if-then logic. The person recognizes a situation as belonging to a familiar category and applies a stored rule: if the patient’s potassium is below 3.5, then replace per protocol; if the INR is above 3.0, then hold the warfarin. Rule-based behavior requires more cognitive engagement than skill-based behavior — the person must correctly identify the situation and select the appropriate rule — but it does not require analytical reasoning from first principles. Most clinical protocols, checklists, and standard operating procedures are designed to support rule-based behavior: they provide the if-then mapping so the clinician does not need to reason through the underlying pharmacology or physiology every time.
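Protocols externalize exactly this if-then mapping. A minimal sketch using the two examples above (thresholds are illustrative, not a clinical reference):

```python
def stored_rules(potassium_mmol_l: float, inr: float) -> list[str]:
    """If-then rules of the kind protocols externalize. The clinician matches
    the situation to a category and applies the stored rule, rather than
    reasoning from pharmacology each time. Thresholds mirror the examples
    in the text; this is an illustration, not a clinical reference."""
    actions = []
    if potassium_mmol_l < 3.5:
        actions.append("replace potassium per protocol")
    if inr > 3.0:
        actions.append("hold warfarin")
    return actions

print(stored_rules(potassium_mmol_l=3.2, inr=3.4))
# ['replace potassium per protocol', 'hold warfarin']
```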
Knowledge-based behavior is analytical reasoning from first principles. It is engaged when the situation does not match any familiar pattern and no stored rule applies. The clinician must construct a mental model of the problem, generate hypotheses, evaluate options, and reason through consequences. This is the slowest, most cognitively expensive, and most error-prone level of behavior — but it is the only level that can handle genuinely novel situations. A clinician encountering an unusual drug interaction not covered by existing protocols, or a program manager facing a novel regulatory requirement with no precedent, must operate at the knowledge-based level.
The SRK framework is not a typology of people — it is a typology of the cognitive mode in which a person is operating at a given moment. The same nurse operates at the skill level when drawing medication, shifts to the rule level when checking dosing against the protocol, and engages the knowledge level when the patient presents with an unexpected adverse reaction not covered by the standing orders. Errors at each level arise from different mechanisms and require different countermeasures.
Reason’s Error Taxonomy
James Reason’s (1990) Human Error built on Rasmussen’s SRK framework to produce the error taxonomy that remains the standard classification in safety science, aviation, nuclear power, and — increasingly — healthcare.
Slips: Right Intention, Wrong Execution
Slips are skill-based execution errors. The person intends the correct action but executes it incorrectly. The plan is right; the performance is wrong. Norman (1981) identified the principal sub-types of action slips, which Reason incorporated into the broader taxonomy.
Capture errors occur when a more frequently practiced routine “captures” an intended action. A nurse intending to walk to the medication room walks to the break room instead, because the break room route is the more habitual path from her current location. In medication preparation, a pharmacist reaching for furosemide 40 mg grabs famotidine 40 mg because the vials are similar in size and color and the reaching motion is identical.
Description errors occur when the internal description of the intended action is too vague to discriminate between similar objects or actions. The technician needs the 1 mg vial and grabs the 10 mg vial because the internal instruction — “get the [drug name] vial from the second shelf” — does not include sufficient distinguishing detail. The physical environment provides no additional discrimination: same shelf, same vial shape, similar label.
Mode errors occur when the person performs an action appropriate for one mode of a system while the system is in a different mode. A clinician enters orders into an EHR assuming the patient context is Patient A when the system has already switched to Patient B after an interruption. The keystrokes are correct for the intended patient; the system context is wrong.
The common thread: slips arise from automatic behavior executing in an environment that does not provide sufficient cues to catch the error. The fix is never “pay more attention” — attention is precisely what is not engaged during automatic behavior. The fix is environmental: forcing functions (Norman, 1988) that prevent the wrong action from completing (barcode verification that rejects the wrong vial), distinctive design that makes similar objects physically distinguishable (tall-man lettering, color-coded caps, different vial shapes for different concentrations), and better feedback that makes the system state visible (prominent patient-name display that changes color when the context switches).
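As a sketch of what a forcing function means in software terms, consider a dispensing step that cannot complete on a mismatch. The product codes here are hypothetical placeholders, not real identifiers:

```python
class VerificationError(Exception):
    """Raised when the scanned vial does not match the verified order."""

def dispense(order_code: str, scanned_code: str) -> str:
    """Forcing function: dispensing cannot complete unless the barcode on the
    picked vial matches the code on the verified order. The wrong action is
    blocked by design instead of being left to attention."""
    if scanned_code != order_code:
        raise VerificationError(
            f"scanned {scanned_code!r} does not match order {order_code!r}; blocked"
        )
    return f"dispensing approved for {order_code}"

print(dispense("metoprolol-1mg", "metoprolol-1mg"))   # proceeds
try:
    dispense("metoprolol-1mg", "metoprolol-10mg")     # the slip is caught here
except VerificationError as err:
    print(err)
```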
Lapses: Right Intention, Omitted Step
Lapses are skill-based memory errors. The person intends to complete a sequence of actions but omits a step, usually because of an interruption, distraction, or memory failure.
A pharmacy technician is preparing a medication and is interrupted by a phone call. After the call, she resumes preparation but skips the barcode verification step — not because she decided to skip it, but because the interruption disrupted her place in the procedural sequence, and the verification step was lost from working memory. She did not forget that verification exists; she forgot that she had not yet done it.
Lapses are ubiquitous in healthcare because healthcare workflows are constantly interrupted. Westbrook et al. (2010) found that nurses were interrupted an average of 6.7 times per hour during medication administration rounds. Each interruption is a potential lapse trigger — a point where a procedural step can be lost from working memory.
The fix for lapses is not training (the person already knows the procedure) and not motivation (the person intended to complete the procedure). The fix is environmental support for memory: checklists that externalize the procedural sequence, workflow automation that enforces step completion, physical workspace design that makes the current step visible (the incomplete preparation sits in a designated “in-progress” zone until verification is confirmed), and interruption management protocols that protect safety-critical task segments from disruption.
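A minimal sketch of what "workflow automation that enforces step completion" can look like, assuming a simplified three-step preparation sequence (the step names and the rollback-to-verification rule are illustrative):

```python
from enum import Enum, auto

class Step(Enum):
    PICK = auto()
    VERIFY = auto()   # barcode verification
    LABEL = auto()

class Preparation:
    """Externalizes the procedural sequence so the system, not working
    memory, tracks which steps are done."""
    SEQUENCE = [Step.PICK, Step.VERIFY, Step.LABEL]

    def __init__(self, patient_id: str):
        self.patient_id = patient_id
        self._next = 0  # index of the next required step

    @property
    def next_step(self) -> Step | None:
        return self.SEQUENCE[self._next] if self._next < len(self.SEQUENCE) else None

    def complete(self, step: Step) -> None:
        # Steps must complete in order; a skipped step raises instead of
        # passing silently.
        if step is not self.next_step:
            raise RuntimeError(f"Expected {self.next_step}, got {step}")
        self._next += 1

    def interrupted(self) -> None:
        # After any interruption, roll back to verification: the system,
        # not the technician's memory, decides what has been confirmed.
        verify_idx = self.SEQUENCE.index(Step.VERIFY)
        self._next = min(self._next, verify_idx)

prep = Preparation("patient-A")
prep.complete(Step.PICK)
prep.complete(Step.VERIFY)
prep.interrupted()                # phone call mid-preparation
print(prep.next_step)             # Step.VERIFY -- re-verification is forced
```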
Mistakes: Wrong Plan, Correct Execution
Mistakes are failures of judgment or reasoning. Unlike slips and lapses, the execution matches the intention — but the intention is wrong. The person correctly carries out an incorrect plan.
Rule-based mistakes occur when the person misidentifies the situation or selects the wrong rule. A resident applies a pediatric dosing guideline to an adult patient because the patient’s low weight triggered a mental category association with pediatric patients. The rule was correctly executed; the wrong rule was selected. Alternatively, the right rule is selected but misapplied — the clinician correctly identifies that the patient needs anticoagulation but applies an outdated dosing protocol that has been superseded. Rule-based mistakes arise from miscategorization, the application of strong-but-wrong rules, and the failure to update stored rules when protocols change.
The fix for rule-based mistakes: better decision support at the point of action (a dosing calculator that incorporates patient-specific parameters and flags protocol mismatches), protocol design that makes the correct rule obvious (clear inclusion/exclusion criteria, not just procedure steps), and training specifically targeted at the discrimination between commonly confused situations (when pediatric vs. adult dosing applies, when the old vs. new protocol applies). Training works here — but only when it targets the specific discrimination, not when it is generic “medication safety” training.
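A minimal sketch of the decision-support idea, assuming a hypothetical formulary flag for standard-dosed medications (the names, fields, and flag are illustrative, not a real order-entry API):

```python
# Hypothetical formulary metadata: drugs that use standard adult dosing
# regardless of weight (membership here is illustrative).
STANDARD_ADULT_DOSING = {"metoprolol", "lisinopril"}

def check_order(drug: str, patient_age_years: float, weight_based: bool) -> list[str]:
    """Flags adult orders entered in weight-based mode for drugs the
    formulary marks as standard-dosed. Targets the specific discrimination
    (pediatric weight-based vs. standard adult dosing) rather than generic
    'medication safety'."""
    warnings = []
    if patient_age_years >= 18 and weight_based and drug in STANDARD_ADULT_DOSING:
        warnings.append(
            f"{drug}: weight-based calculation applied to an adult patient, but "
            f"{drug} uses standard adult dosing for most indications. Verify."
        )
    return warnings

# A weight-based calculation for an adult metoprolol order trips the flag.
print(check_order("metoprolol", patient_age_years=67, weight_based=True))
```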
Knowledge-based mistakes occur in novel situations where no stored rule applies and the person must reason from first principles. A pharmacist encounters an unusual multi-drug interaction not covered by the knowledge base. She reasons through the pharmacology, makes a judgment call, and gets it wrong — not because she was careless or poorly trained, but because the situation exceeded the applicable rule set and her analytical reasoning under time pressure produced an incorrect conclusion.
Knowledge-based mistakes are the hardest to prevent because they occur precisely when existing knowledge is insufficient. The fixes are structural: consultation protocols that mandate expert input for situations outside the standard rule set (a clinical pharmacist phone line for unusual interactions), time-out procedures that create deliberate space for analytical reasoning before committing to an action in unfamiliar territory, and simulation training that exposes practitioners to novel scenarios in low-stakes environments so they can build mental models for situations that rules do not cover. Knowledge-based mistakes are the one error type where “more training” is a legitimate intervention — but it must be simulation-based training in novel problem-solving, not repetitive drilling on known procedures.
Violations: Deliberate Deviation from Procedure
Violations are categorically different from errors. Errors are unintentional — the person did not mean to do the wrong thing. Violations are intentional deviations from a known rule or procedure. The person knows what the rule requires and chooses not to follow it.
Routine violations arise when the rule is poorly designed, impractical, or incompatible with the actual work. The barcode scanner in the pharmacy has been malfunctioning for three weeks. It rejects valid scans 40% of the time, requiring the technician to re-scan repeatedly — adding 2-3 minutes per medication in a workflow that is already behind. Technicians begin bypassing the scanner and verifying manually (or not at all). This is a routine violation: a predictable, widespread adaptation to a system that has made compliance impractical.
Routine violations are the most important category for organizational learning because they are diagnostic of system design failure. When frontline workers routinely deviate from a procedure, the procedure is wrong — not the workers. The rule was designed without adequate understanding of the operational environment, or the operational environment has changed since the rule was written, or the tools required to follow the rule are unreliable. The fix is not enforcement; it is redesigning the rule or fixing the tools so that compliance is the path of least resistance.
Exceptional violations occur in extreme, unusual circumstances where the person judges that following the rule will produce a worse outcome than deviating from it. A physician deviates from a treatment protocol because the patient’s presentation is so atypical that the protocol clearly does not apply, and waiting for formal protocol exception approval would cause dangerous delay. Exceptional violations may be clinically appropriate — they represent the practitioner using professional judgment to override a rule that was not designed for this case.
The fix for exceptional violations is not prevention but preparation: designing protocols with explicit exception pathways, creating real-time consultation mechanisms for edge cases, and building a culture that distinguishes between thoughtful clinical judgment (which should be supported) and reckless deviation (which should not).
Just Culture: Why the Error-Violation Distinction Matters
The distinction between errors and violations has direct consequences for organizational justice — what David Marx (2001) formalized as the “just culture” framework.
The just culture model holds that errors (slips, lapses, and mistakes) are not blameworthy. The person did not intend the wrong outcome. Punishing errors suppresses reporting without reducing error rates, because errors arise from system conditions and cognitive limitations, not from choices. When organizations punish errors, they create an environment where errors are hidden rather than reported, and the information needed to fix the underlying system conditions disappears. This is the mechanism by which blame-oriented cultures become unsafe cultures — not through any lack of good intentions, but through the predictable behavioral response to punishment.
Routine violations indicate system design failure, not individual misconduct. When an entire department bypasses a safety procedure, the question is not “why are these people non-compliant?” but “what is wrong with the procedure or the tools that makes compliance impractical?” Disciplining individuals for routine violations treats the symptom while preserving the cause.
Reckless behavior — knowingly and unjustifiably creating substantial risk — is the only category that warrants disciplinary response. This is a narrow category. It does not include errors. It does not include routine violations caused by system failures. It includes the conscious choice to take a risk that no reasonable practitioner in the same circumstances would take, without regard for the potential harm. The substitution test applies here: ask, “Would three other practitioners with similar training and experience have made the same choice in the same circumstances?” If yes, the system is at fault. If no, the individual may be at fault.
This framework connects directly to error reporting culture (see HF Module 7, psychological safety). Just culture is the prerequisite for a functioning safety reporting system. If frontline workers believe that reporting an error will result in punishment, they will not report. If they believe that routine violations will be met with blame rather than system investigation, they will not disclose the workarounds that reveal where the system is failing. The organization loses the information it needs to improve. Dekker (2006) described this as the choice between “who is responsible” and “what is responsible” — and argued that safety improves only when organizations consistently choose the latter.
Healthcare Example: Four Paths to the Same Harm
A patient on a medical-surgical unit receives 10 mg of metoprolol instead of the prescribed 1 mg. The outcome is identical in all four scenarios — the patient develops symptomatic bradycardia. The cause, and therefore the fix, is different in each case.
Path 1 — Slip. The pharmacy stocks metoprolol 1 mg and 10 mg in vials of identical shape and color, shelved adjacent to each other in alphabetical order. The technician, performing a routine pick, reaches to the correct shelf location and grabs the 10 mg vial. Her hand went to the right place; the wrong concentration was in that place. The error was in the environment, not the execution. Fix: Separate the concentrations by shelf location, use vials with distinct physical characteristics (different cap color, different vial size), display the strength in prominent, high-contrast type on the label (tall-man lettering addresses look-alike drug names, not look-alike strengths), and add barcode verification at the point of pick.
Path 2 — Lapse. The technician picks the correct 1 mg vial. During preparation, she is interrupted by a colleague with an urgent question about another order. She sets down the 1 mg vial, addresses the question, returns to her workstation, and — having lost her place in the preparation sequence — picks up a 10 mg vial that was already on the counter for a different patient’s order. She skips the re-verification step because, in her memory, she already verified the vial before the interruption. She did verify — but she verified the first vial, not the one she is now holding. Fix: Designate preparation zones that physically isolate each patient’s medications, implement a forcing function that requires re-verification after any interruption, establish interruption-free zones during safety-critical preparation steps.
Path 3 — Rule-based mistake. A new pharmacist reviews the metoprolol order. The patient weighs 45 kg. The pharmacist, trained in a pediatric rotation where weight-based dosing is standard, mentally applies a weight-based calculation and concludes that 10 mg is the appropriate dose for this patient’s weight. The rule she applied — dose by weight — is a valid rule in pediatric practice but incorrect for adult metoprolol dosing, which uses standard doses independent of weight for most indications. The right rule for the wrong context. Fix: Embed weight-based vs. standard dosing logic into the order verification system so the decision support flags when a weight-based calculation is applied to a medication that uses standard adult dosing, coupled with training that specifically targets the pediatric-to-adult dosing distinction.
Path 4 — Routine violation. The pharmacy’s barcode verification system has been unreliable for three weeks — the scanner rejects valid scans approximately 40% of the time, requiring multiple re-scans and adding substantial time to each verification. Technicians have begun scanning medications in batch mode (scanning all items for all patients at once, then sorting) rather than scanning per-patient as the protocol requires, because the per-patient workflow with a malfunctioning scanner has become unworkable. In this workaround, the metoprolol 10 mg scanned successfully and was associated with the wrong patient. The technician did not choose to be unsafe; she adapted to a system that made the safe process impractical. Fix: Fix the scanner. Not next month — this week. Then examine why a three-week equipment failure in a safety-critical system was tolerated without escalation. The violation is a symptom; the organizational tolerance of broken safety infrastructure is the disease.
Warning Signs
These indicators suggest an organization is misclassifying errors and applying the wrong interventions:
- Recurring errors of the same type despite retraining — the intervention does not match the error mechanism
- Widespread workarounds that everyone knows about but no one reports — routine violations normalized because the rules are impractical
- Incident reports that consistently conclude with “staff was counseled” — blame-oriented response suppressing system-level investigation
- No distinction between error types in incident analysis — every event classified as “human error” with no finer classification applied
- Frontline staff reluctant to report near-misses — just culture is absent or performative, and reporting feels punitive
- New procedures added after every incident but old procedures never removed — procedural accretion without system redesign
- Equipment reliability problems documented in workaround reports but not in maintenance tickets — the system has learned to route around failures rather than fix them
Integration Points
OR Module 4 (Network Flow and Referral Networks). Network handoffs — the points where patients, information, or responsibility transfer from one node to another — are where slips and lapses concentrate. Every handoff is a context switch, and every context switch is a lapse opportunity: the receiving node may not have the same mental model as the sending node, information may be lost in transmission, and the procedural sequence resets in an environment where the cues supporting step completion are different. The network topology determines how many handoffs occur, and therefore how many lapse-vulnerable transitions a patient encounters. A referral network with six handoff points between primary care and specialty resolution has six opportunities for information loss — and the error taxonomy predicts that these losses will be predominantly lapses (omitted information, forgotten follow-up steps) and slips (patient A’s information merged with patient B’s during a batch handoff). Network design that minimizes unnecessary handoffs is, simultaneously, error prevention.
HF Module 7 (Organizational Behavior and Team Dynamics). The error taxonomy is operationally useful only if errors are reported, and errors are reported only in environments with psychological safety and just culture. When organizations punish errors, frontline workers suppress reporting, and the organization loses the data it needs to classify errors correctly and apply the right interventions. Module 7’s treatment of psychological safety (Edmondson, 1999) and high-reliability organization principles provides the cultural infrastructure without which the error taxonomy remains an academic exercise. The taxonomy tells you what to do with an error once you know about it; just culture determines whether you ever learn about it in the first place.
Product Owner Lens
What is the human behavior problem? Humans make errors through multiple distinct mechanisms — automatic execution failures, memory failures, reasoning failures, and deliberate deviations — but organizations treat all errors as a single category and apply a single intervention (training) that addresses only one error type. The result is recurring errors, wasted training resources, and progressive erosion of safety culture as frontline workers observe that the organization’s responses do not match the problems they experience.
What cognitive mechanism explains it? Rasmussen’s SRK framework identifies three cognitive modes — skill-based (automatic), rule-based (if-then), and knowledge-based (analytical) — each with a different error mechanism. Reason’s taxonomy maps error types to these modes: slips and lapses at the skill level, mistakes at the rule and knowledge levels, violations as a separate category driven by system design rather than cognition. Each mechanism has a different causal chain and therefore a different intervention target.
What design lever improves it? Incident reporting systems that require error-type classification (slip, lapse, rule-based mistake, knowledge-based mistake, routine violation, exceptional violation) at the point of report. Root cause analysis templates structured around the taxonomy rather than around generic “human error” categories. Intervention selection guided by error type: environmental redesign for slips, workflow automation for lapses, decision support for rule-based mistakes, consultation protocols for knowledge-based mistakes, system investigation for routine violations.
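One way to make classification non-optional is to encode the taxonomy in the report schema itself. A minimal sketch, with hypothetical field names rather than any specific incident-reporting product:

```python
from enum import Enum

class ErrorType(Enum):
    SLIP = "slip"
    LAPSE = "lapse"
    RULE_BASED_MISTAKE = "rule_based_mistake"
    KNOWLEDGE_BASED_MISTAKE = "knowledge_based_mistake"
    ROUTINE_VIOLATION = "routine_violation"
    EXCEPTIONAL_VIOLATION = "exceptional_violation"

# Intervention target suggested by the taxonomy; the report form can surface
# this the moment a type is selected.
INTERVENTION_TARGET = {
    ErrorType.SLIP: "environmental redesign / forcing functions",
    ErrorType.LAPSE: "workflow automation / checklists / interruption management",
    ErrorType.RULE_BASED_MISTAKE: "decision support + targeted discrimination training",
    ErrorType.KNOWLEDGE_BASED_MISTAKE: "consultation protocols / simulation training",
    ErrorType.ROUTINE_VIOLATION: "system investigation: fix the rule or the tools",
    ErrorType.EXCEPTIONAL_VIOLATION: "exception pathways / real-time consultation",
}

def file_report(description: str, error_type: ErrorType) -> dict:
    """Classification is a required field, not a free-text afterthought."""
    return {
        "description": description,
        "error_type": error_type.value,
        "suggested_intervention": INTERVENTION_TARGET[error_type],
    }

report = file_report("10 mg metoprolol picked for a 1 mg order", ErrorType.SLIP)
print(report["suggested_intervention"])  # environmental redesign / forcing functions
```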
What should software surface? Error-type distribution across departments and time periods — if 70% of medication errors in a unit are slips, the intervention is environmental redesign, not training. Recurrence rates by error type and intervention — if slip-related errors persist after retraining, the metric reveals the intervention-mechanism mismatch. Routine violation reports correlated with equipment maintenance status and procedure age — to identify system conditions that drive workarounds. Time-from-report-to-system-change for routine violations — the leading indicator of whether the organization acts on system design failures or merely documents them.
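A sketch of the distribution view over hypothetical incident records (the field names are assumptions):

```python
from collections import Counter

def error_type_distribution(incidents: list[dict]) -> dict[str, float]:
    """Share of each error type for a unit or time period."""
    counts = Counter(i["error_type"] for i in incidents)
    total = sum(counts.values())
    return {etype: n / total for etype, n in counts.items()}

unit_incidents = [
    {"error_type": "slip"}, {"error_type": "slip"},
    {"error_type": "slip"}, {"error_type": "lapse"},
    {"error_type": "rule_based_mistake"},
]
print(error_type_distribution(unit_incidents))
# {'slip': 0.6, 'lapse': 0.2, 'rule_based_mistake': 0.2}
# If slips dominate (here 60%), the indicated intervention is environmental
# redesign, not retraining.
```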
What metric reveals degradation earliest? The ratio of system-level interventions to individual-level interventions following incident reports. When an organization responds to 90% of incidents with retraining and 10% with system redesign, the taxonomy is not being used — regardless of what the incident report forms say. This ratio is the leading indicator of whether the error classification system is functioning or decorative. A well-functioning system should produce a ratio that approximately mirrors the error-type distribution: predominantly environmental and system interventions (for the slips, lapses, and violations that constitute the majority of errors), with individual training interventions reserved for the knowledge-based mistakes that training can actually address.
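A minimal sketch of the ratio computation, using the same hypothetical record shape (the mapping of intervention labels to system vs. individual level is the organization's own to define):

```python
from collections import Counter

# Hypothetical intervention labels classified as system-level responses;
# anything else is treated as an individual-level response here.
SYSTEM_LEVEL = {"environmental_redesign", "workflow_automation",
                "decision_support", "equipment_repair", "protocol_redesign"}

def intervention_ratio(incidents: list[dict]) -> float:
    """System-level to individual-level intervention ratio across closed
    incidents. A ratio far below what the error-type distribution predicts
    is the early warning that the taxonomy is decorative."""
    counts = Counter(
        "system" if i["intervention"] in SYSTEM_LEVEL else "individual"
        for i in incidents
    )
    return counts["system"] / max(counts["individual"], 1)

closed_incidents = [
    {"intervention": "retraining"},
    {"intervention": "retraining"},
    {"intervention": "counseling"},
    {"intervention": "environmental_redesign"},
]
print(f"system:individual = {intervention_ratio(closed_incidents):.2f}")  # 0.33
```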