Cognitive Architecture: The Information-Processing Pipeline

Human cognition is an information-processing system with a fixed architecture. Sensory input enters through perception, competes for attention, is held and manipulated in working memory, feeds into decision processes, and produces motor action. Each stage has measurable capacity, quantifiable throughput, and characteristic failure modes. These are not metaphors. They are engineering parameters — as real and as constraining as the throughput of a network switch or the capacity of a hospital bed. A system designed without reference to these parameters will fail at the human layer, regardless of how well its technical components perform.

This page maps the architecture stage by stage, identifies the capacity limits at each stage, and shows how those limits create predictable failure modes in healthcare operations. The central claim is simple but consequential: every operational system has a tightest cognitive bottleneck, and that bottleneck — not training, not motivation, not technology — determines the system’s effective performance ceiling under load.

The Information-Processing Model

The dominant model in human factors engineering treats the human operator as a multi-stage processor. Information flows through a pipeline:

Perception → Attention → Working Memory → Decision → Action

This is not a single linear queue. Stages operate in parallel to a degree, feedback loops exist between stages, and the system is capacity-limited at multiple points simultaneously. But the serial metaphor captures the essential constraint: information that fails to pass one stage never reaches the next. A monitor alarm that is not perceived cannot be attended to. Information that is not attended to does not enter working memory. Data not held in working memory cannot inform a decision.

The model descends from Broadbent’s filter theory (1958), was elaborated by Wickens in his multiple resource model, and has been refined through six decades of experimental research in cognitive engineering. It is not the only model of cognition — ecological and embodied cognition approaches offer valid alternatives — but it is the model most directly useful for system design, because it makes quantitative predictions about where human performance will degrade under specific conditions.

Each stage has three properties that matter for system design:

Capacity — the maximum amount of information the stage can process at one time.
Throughput — the rate at which information moves through the stage under sustained load.
Failure mode — the characteristic way the stage breaks down when capacity or throughput is exceeded.

Stage 1: Perception

Perception is the front end of the pipeline — the process by which sensory input is transduced into a signal the cognitive system can use. In operational settings, the relevant question is not whether a stimulus exists but whether the operator detects it.

Detection thresholds. Every sensory channel has an absolute threshold (the minimum stimulus intensity detectable in ideal conditions) and a difference threshold (the minimum detectable change in an ongoing stimulus). Weber’s Law quantifies the difference threshold: the just-noticeable difference (JND) is a constant proportion of the baseline stimulus intensity, not a fixed amount. A nurse monitoring a patient’s respiratory rate of 20 breaths per minute needs a change of roughly 2-3 breaths per minute to reliably notice a shift. At 10 breaths per minute, the same proportional change is only 1-1.5 breaths. Weber’s Law therefore means that a change of a given absolute size is harder to detect against an already-elevated baseline than against a lower one, a fact with direct implications for monitoring critically ill patients whose vitals are already deranged.
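The proportionality in Weber’s Law can be made concrete with a short sketch. The Weber fraction used here (0.125) is an assumption chosen to sit inside the 2-3 breaths at 20 breaths per minute range quoted above; it is not a measured clinical constant, and the function names are illustrative.

```python
def just_noticeable_difference(baseline: float, weber_fraction: float) -> float:
    """Weber's Law: the JND is a constant proportion of the baseline intensity."""
    if baseline < 0 or weber_fraction <= 0:
        raise ValueError("baseline must be >= 0 and weber_fraction must be > 0")
    return baseline * weber_fraction


def is_detectable(baseline: float, change: float, weber_fraction: float) -> bool:
    """True when the absolute change meets or exceeds the JND at this baseline."""
    return abs(change) >= just_noticeable_difference(baseline, weber_fraction)


# Assumed Weber fraction, back-calculated from the respiratory-rate example.
ASSUMED_WEBER_FRACTION = 0.125

for rate in (10, 20, 30):
    jnd = just_noticeable_difference(rate, ASSUMED_WEBER_FRACTION)
    print(f"baseline {rate}/min: JND is about {jnd:.2f} breaths/min")
```

Under this assumption, a shift of 2 breaths per minute clears the threshold at a baseline of 10 but not at a baseline of 20, which is the pattern the paragraph describes.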

Change blindness. Humans are remarkably poor at detecting changes in a visual scene when the change occurs during a brief interruption such as a saccade, a blink, or a scene cut. Simons and Chabris’s (1999) gorilla experiment is the popular reference point, although strictly it demonstrates inattentional blindness (discussed below); the operational reality is more mundane and more dangerous. A nurse who looks away from a monitor to document in the EHR and looks back may not detect that an SpO2 value has dropped from 94% to 89% if the transition occurred during the gaze shift. The display looks “normal” because the visual system does not compare frames the way a camera does; it reconstructs the scene each time. Change blindness is not a failure of attention. It is a structural limitation of visual perception.

Inattentional blindness. Even without a gaze interruption, a stimulus that is fully visible can go undetected if the operator’s attention is directed elsewhere. Inattentional blindness — demonstrated experimentally by Mack and Rock (1998) — means that perception is not passive reception. It requires attentional resources. During high-workload periods in an ED, monitor alarms that are acoustically present and visually displayed can be functionally invisible because the clinician’s perceptual resources are fully committed to the task at hand. The stimulus reaches the retina or cochlea but never becomes a percept.

Healthcare failure mode: The most dangerous perceptual failures in healthcare are not failures to see dramatic changes. They are failures to detect gradual deterioration — a slow decline in urine output, a subtle shift in mental status, a respiratory rate that drifts upward by 2 breaths per minute per hour. Weber’s Law predicts exactly this pattern: small proportional changes against a shifting baseline are the hardest signals to detect, and they are precisely the signals that mark early clinical deterioration.

Stage 2: Attention

Attention is the bottleneck between perception and cognition. Sensory systems deliver far more information than the cognitive system can process simultaneously. Attention is the mechanism that selects which information gets processed and which is discarded or degraded.

Wickens’ Multiple Resource Theory (MRT) provides the most operationally useful model of attention allocation. Wickens (2002, 2008) proposes that attention is not a single pool but a set of semi-independent resource channels, organized along three dimensions:

Modality: visual vs. auditory input
Processing code: spatial vs. verbal information
Processing stage: perception/cognition vs. response selection/execution

Two tasks that draw on different resource channels can be time-shared with relatively little interference. A nurse can monitor a visual display (visual-spatial-perceptual) while listening to a physician’s verbal order (auditory-verbal-cognitive) — but performance on both tasks will be somewhat degraded compared to doing either alone. Two tasks that draw on the same resource channel compete directly and produce severe performance degradation. Monitoring two visual displays simultaneously (both visual-spatial-perceptual) creates a structural conflict that no amount of training or motivation can resolve.

Consider the nurse monitoring a cardiac telemetry screen while processing a verbal medication order. The monitoring task demands visual-spatial-perceptual resources. Processing the verbal order demands auditory-verbal-cognitive resources. Because these draw on mostly separate channels, the interference is moderate, though not zero. But the moment the nurse must mentally cross-reference the verbal order against visual information on the screen (checking a dose against the displayed patient weight), both tasks compete for the same cognitive-verbal processing channel, and one task must wait or be performed poorly.
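The channel-competition logic can be sketched as a count of shared resource dimensions between two tasks. This is a deliberately crude simplification for illustration: it is not Wickens’ published conflict matrix, and the class, field names, and task labels are assumptions introduced here.

```python
from dataclasses import dataclass

DIMENSIONS = ("modality", "code", "stage")


@dataclass(frozen=True)
class TaskDemand:
    name: str
    modality: str  # "visual" or "auditory"
    code: str      # "spatial" or "verbal"
    stage: str     # "perception/cognition" or "response"


def shared_dimensions(a: TaskDemand, b: TaskDemand) -> int:
    """Count the dimensions on which two tasks draw on the same resource (0 to 3)."""
    return sum(getattr(a, d) == getattr(b, d) for d in DIMENSIONS)


telemetry = TaskDemand("watch telemetry", "visual", "spatial", "perception/cognition")
second_monitor = TaskDemand("watch second monitor", "visual", "spatial", "perception/cognition")
verbal_order = TaskDemand("hear verbal order", "auditory", "verbal", "perception/cognition")
cross_check = TaskDemand("check dose against displayed weight", "visual", "verbal", "perception/cognition")

for other in (verbal_order, cross_check, second_monitor):
    print(f"{telemetry.name} + {other.name}: {shared_dimensions(telemetry, other)} shared")
print(f"{verbal_order.name} + {cross_check.name}: {shared_dimensions(verbal_order, cross_check)} shared")
```

A higher count means more direct competition. In this sketch the monitor-plus-verbal-order pairing shares one dimension, the verbal order and the cross-check share two, and two monitors share all three.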

Selective attention and interruption. Healthcare environments are interrupt-driven. A study by Westbrook et al. (2010) found that ED physicians are interrupted an average of every 9.7 minutes — and each interruption requires an attention switch with a measurable recovery cost. Monsell’s (2003) task-switching research demonstrates that switching between tasks incurs a time penalty of 200-500 milliseconds per switch for simple tasks, and substantially more for complex cognitive tasks. The penalty is not just time. The interrupted task’s mental context is partially lost, requiring effortful reconstruction. In healthcare, this means the physician who is interrupted while formulating a differential diagnosis does not simply resume where they left off. They restart cognitive work that was already partially completed.

Healthcare failure mode: Attention failures in healthcare are not primarily about distraction in the colloquial sense. They are about structural competition for limited processing channels in environments that systematically generate more concurrent demands than the attentional system can service. The problem is architectural, not motivational.

Stage 3: Working Memory

Working memory is the workspace where information is held and manipulated during active cognitive processing. It is the most operationally consequential bottleneck in the cognitive architecture, because virtually every clinical task — medication calculation, differential diagnosis, care planning, handoff communication — requires holding multiple items in working memory simultaneously.

Capacity. Miller’s (1956) landmark paper established the “magical number seven, plus or minus two” as the capacity of short-term memory. Cowan’s (2001) reassessment, using methods that controlled for rehearsal and chunking strategies, revised the estimate downward to 4 ± 1 items. This is the number of independent chunks of information a person can hold in working memory at one time without external support. Four items. Not seven. Not ten. Four.

Chunking. The effective capacity of working memory depends on how information is organized. A string of 10 digits (2, 0, 6, 5, 5, 5, 1, 2, 1, 2) exceeds WM capacity as individual digits but fits easily as a phone number pattern (206-555-1212). Chase and Simon’s (1973) classic chess studies demonstrated that experts achieve larger effective WM capacity not through larger raw capacity but through richer chunking: they encode board positions as meaningful patterns rather than individual piece locations. In healthcare, experienced clinicians chunk patient presentations into syndrome patterns, effectively compressing information. A novice holds “heart rate 110, BP 88/60, altered mental status, lactate 4.2” as four or more independent items. An experienced intensivist chunks this as “septic shock,” one item that carries all four data points plus a treatment algorithm.
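A minimal sketch of the chunking arithmetic, using the phone-number example from the paragraph above. The capacity constant takes Cowan’s central estimate of 4; treating capacity as a hard item count is a simplifying assumption.

```python
WM_CAPACITY_ITEMS = 4  # Cowan's central estimate; assumed here to be a hard cutoff

digits = "2065551212"

# Unchunked: every digit is its own item.
as_single_digits = list(digits)

# Chunked: the familiar 3-3-4 phone number pattern.
as_phone_chunks = [digits[:3], digits[3:6], digits[6:]]


def fits_in_working_memory(items, capacity=WM_CAPACITY_ITEMS):
    """True when the number of independent items does not exceed capacity."""
    return len(items) <= capacity


print(len(as_single_digits), fits_in_working_memory(as_single_digits))
print("-".join(as_phone_chunks), fits_in_working_memory(as_phone_chunks))
```

The information content is identical in both encodings; only the number of independent items changes.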

Decay and interference. Working memory contents decay within 15-30 seconds without rehearsal (Peterson and Peterson, 1959). New information entering working memory displaces existing contents through interference. Both mechanisms are operational hazards. A nurse who receives verbal report on Patient A’s lab values (potassium 5.8, creatinine 2.1, hemoglobin 7.2) and is then immediately asked a question about Patient B’s discharge plan will lose some or all of the lab values unless they were written down or deeply encoded. The information was not forgotten through carelessness. It was overwritten by the structural limitations of the memory system.

Healthcare failure modes:

Handoff overload. SBAR-structured handoffs typically convey 8-15 discrete information items per patient. For a nurse receiving report on four patients, the total information load is 32-60 items, an order of magnitude beyond working memory capacity. Without written support, information loss is not merely possible but certain. The items most likely to be lost are those conveyed in the middle of the handoff (the serial position effect: primacy and recency items are better retained) and those that are clinically unusual (they resist chunking into familiar patterns).

Medication calculation under interruption. A weight-based dosing calculation (patient weight × dose per kg gives the dose; dose ÷ concentration gives the volume to administer) requires holding 3-4 intermediate values in working memory. An interruption during this calculation clears those intermediate values. The clinician must restart. If they believe they remember where they were but have actually lost a value, the result is a miscalculation: not a random error but a predictable consequence of WM disruption.

Holding patient context across rooms. A physician managing four ED patients simultaneously must maintain a mental model of each patient’s status, pending results, and plan. This is 4 × N items, where N is the number of active clinical data points per patient. At any given moment, only one patient’s model is fully active in working memory. The other three are in long-term memory and must be retrieved — a process that takes time, is incomplete, and is vulnerable to interference from the currently active patient.
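The medication-calculation case above can be sketched as a two-step computation whose intermediate values are written down rather than held in the head. The function and field names are hypothetical and the numbers are purely illustrative, not clinical guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DoseCalculation:
    """Every intermediate value is recorded so none has to survive an interruption."""
    weight_kg: float
    dose_mg_per_kg: float
    concentration_mg_per_ml: float
    dose_mg: float
    volume_ml: float


def weight_based_dose(weight_kg: float, dose_mg_per_kg: float,
                      concentration_mg_per_ml: float) -> DoseCalculation:
    if min(weight_kg, dose_mg_per_kg, concentration_mg_per_ml) <= 0:
        raise ValueError("all inputs must be positive")
    dose_mg = weight_kg * dose_mg_per_kg            # step 1: total dose
    volume_ml = dose_mg / concentration_mg_per_ml   # step 2: volume to draw up
    return DoseCalculation(weight_kg, dose_mg_per_kg, concentration_mg_per_ml,
                           dose_mg, volume_ml)


# Illustrative inputs only.
calc = weight_based_dose(weight_kg=70.0, dose_mg_per_kg=2.0, concentration_mg_per_ml=10.0)
print(calc)
```

The calculation carries five values in total. Doing it mentally means holding several of them at once, which is why an interruption between step 1 and step 2 is enough to corrupt the result.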

Stage 4: Decision

Once information has been perceived, attended to, and held in working memory, it must be used to select a course of action. Human decision-making operates through at least two distinct processing modes.

Dual-process theory. Kahneman (2011) popularized the distinction between System 1 (fast, automatic, pattern-based) and System 2 (slow, effortful, analytical). System 1 operates continuously and without conscious effort. It recognizes a facial expression, reads a word, detects that “something is wrong” with a patient before the clinician can articulate what. System 2 engages when System 1 encounters novelty, conflict, or explicit analytical demand. It is the system that calculates drug doses, works through a differential diagnosis systematically, or evaluates the risks of a surgical approach.

The critical operational insight is not that System 1 is fast and System 2 is slow. It is that System 2 requires working memory. Every analytical decision competes for the same limited WM capacity that is already handling information maintenance. Under cognitive load, System 2 degrades first — and the operator defaults to System 1 pattern-matching even for problems that require analysis. This is not a choice. It is an architectural constraint.

Rasmussen’s Skills-Rules-Knowledge (SRK) framework (1983) provides a finer taxonomy. Skill-based behavior is automatic and requires minimal attention (an experienced nurse drawing blood). Rule-based behavior follows learned if-then protocols and requires moderate attention (following a sepsis bundle: if lactate > 4, then administer fluid bolus). Knowledge-based behavior requires conscious analysis of an unfamiliar situation and demands full WM engagement (managing a patient with an atypical presentation that does not match any known pattern).

Performance degrades as you move up the SRK ladder: skill-based behavior is fast and reliable, rule-based behavior is slower but still robust if the rule is correct, and knowledge-based behavior is slow, effortful, and error-prone — especially under time pressure.

Recognition-Primed Decision Making (RPD). Klein’s (1998) research on expert decision-making in naturalistic settings (firefighters, military commanders, ICU nurses) showed that experts under time pressure rarely compare options analytically. Instead, they recognize the situation as similar to a previously encountered pattern, mentally simulate the first plausible course of action, and execute if the simulation does not reveal a fatal flaw. RPD is fast and usually effective — but it fails when the situation is genuinely novel (the pattern library does not contain a match), when the environment has changed in ways the expert has not updated on, or when base rates have shifted (a familiar pattern that was once common is now rare).

Healthcare failure mode: The dangerous transition in clinical decision-making is the unrecognized shift from knowledge-based reasoning to pattern-matching under load. An ED physician who has been running at high cognitive load for six hours will default to System 1 / skill-based and rule-based processing. If Patient 7 presents with chest pain, the physician’s System 1 fires the pattern match: “ACS workup.” But if this patient’s chest pain is actually caused by aortic dissection (incidence roughly 3 per 100,000 per year, easy to miss on initial presentation), the correct response requires System 2 / knowledge-based processing to recognize that the pattern does not quite fit. Under load, that System 2 engagement may never occur. The physician follows the ACS protocol, the dissection is missed, and the failure is retrospectively attributed to “clinical error” rather than to the architectural reality that the cognitive system was operating in a mode incapable of detecting the anomaly.

Stage 5: Action

The final stage converts a decision into a motor response. Action execution has its own failure modes, though these receive less attention in healthcare than the upstream cognitive stages.

Slips. Norman (1981) distinguished slips (correct intention, incorrect execution) from mistakes (incorrect intention, correct execution of the wrong plan). Slips are execution errors: selecting the wrong item from a medication list, clicking the adjacent patient’s order entry, transposing digits in a dose. They are more frequent under time pressure and fatigue and are largely independent of expertise — experienced clinicians slip as often as novices, because slips occur at the motor level, below the threshold of conscious control.

Speed-accuracy tradeoff. Fitts’ Law (1954) relates movement time to target distance and target size: smaller and more distant targets take longer to acquire, so hurried movements toward them are less accurate. This tradeoff is not a guideline but a psychophysical law. An EHR interface that requires rapid selection from a dense, small-target medication list is engineering slip errors into the workflow. The speed-accuracy tradeoff means that any system that pressures faster action will produce more action errors, a predictable and quantifiable relationship.
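Fitts’ Law in its original formulation predicts movement time as MT = a + b × log2(2D / W), where D is the distance to the target and W is its width. The sketch below applies that formula; the constants a and b are device- and user-specific and must be fitted empirically, so the values used here are placeholders, as are the pixel dimensions.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Fitts' index of difficulty in bits: log2(2D / W)."""
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    return math.log2(2 * distance / width)


def movement_time(distance: float, width: float, a: float, b: float) -> float:
    """Predicted movement time: a + b * ID, with a and b fitted empirically."""
    return a + b * index_of_difficulty(distance, width)


# Placeholder constants (seconds, seconds per bit); not measured values.
A, B = 0.10, 0.15

dense_row = movement_time(distance=80, width=10, a=A, b=B)    # small list target
roomy_row = movement_time(distance=80, width=40, a=A, b=B)    # larger target
print(f"small target: {dense_row:.2f} s, large target: {roomy_row:.2f} s")
```

With the same travel distance, quartering the target width adds two bits of difficulty. If the workflow does not allow the extra time, the cost shows up as selection errors instead.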

The Bottleneck Concept

Each stage of the cognitive architecture has a capacity limit. System design must respect the tightest bottleneck — the stage with the lowest effective throughput for the task at hand.

In most healthcare operational contexts, the tightest bottleneck is not perception (clinicians can see and hear adequately) and not motor response (clinicians can type and click fast enough). The tightest bottleneck is attention or working memory.

Attention is the bottleneck when the environment generates more concurrent demands than can be time-shared across the available resource channels — the multi-patient, multi-screen, multi-communication-channel environment of a busy ED or ICU.

Working memory is the bottleneck when the task requires holding and manipulating more items than WM capacity permits — medication calculations, multi-patient status tracking, complex handoffs.

The practical implication is direct: investing in better displays (perception) or faster input devices (action) will not improve system performance if the bottleneck is attention or working memory. This is the cognitive equivalent of adding lanes to a highway when the bottleneck is the exit ramp. Investment anywhere other than the constraint is wasted, because the constraint still sets the flow.
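The bottleneck argument reduces to taking a minimum over stages. In the sketch below the throughput figures are invented for illustration (arbitrary “items per minute”); only their ordering matters, and the serial-pipeline treatment is the same simplification the page adopts.

```python
# Hypothetical effective throughput per stage, in arbitrary items per minute.
stage_throughput = {
    "perception": 40.0,
    "attention": 6.0,
    "working_memory": 4.0,
    "decision": 8.0,
    "action": 20.0,
}


def tightest_bottleneck(throughput: dict) -> str:
    """Name of the stage with the lowest effective throughput."""
    return min(throughput, key=throughput.get)


def pipeline_throughput(throughput: dict) -> float:
    """A serial pipeline can move no faster than its slowest stage."""
    return min(throughput.values())


before = pipeline_throughput(stage_throughput)

# Invest in better displays: perception doubles, nothing else changes.
better_displays = {**stage_throughput, "perception": 80.0}
after_displays = pipeline_throughput(better_displays)

# Externalize memory instead: working memory is relieved.
externalized_wm = {**stage_throughput, "working_memory": 7.0}
after_wm_support = pipeline_throughput(externalized_wm)

print(tightest_bottleneck(stage_throughput), before, after_displays, after_wm_support)
```

Doubling perception leaves the pipeline unchanged, while relieving working memory raises throughput until attention becomes the next constraint.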

Healthcare Grounding: ED Physician During a Four-Patient Surge

Consider an emergency department physician managing four simultaneous patients during a surge period. Map the cognitive demands against the architecture:

Perception. Two monitors display telemetry for beds 4 and 7. A third screen shows the EHR tracking board. Patient in bed 5 is visible through the doorway. The perceptual load is high but manageable — these are distinct spatial channels. The perception risk is change blindness: a telemetry change that occurs while the physician’s gaze is on the EHR will likely go undetected until the next deliberate monitoring scan.

Attention. The physician is simultaneously processing visual information (monitors, EHR), auditory information (a nurse reporting on bed 4, the patient in bed 5 asking a question, the overhead page for a consultant callback), and verbal-cognitive information (formulating orders while listening to the nurse). Wickens’ MRT predicts that the visual-spatial channel is overloaded (two monitors plus EHR), the auditory-verbal channel is overloaded (nurse report plus patient question), and the cognitive-verbal channel is saturated (formulating orders while processing incoming verbal information). Something must be shed. The physician will unconsciously deprioritize one or more information streams — most likely the telemetry monitors (passive monitoring loses to active task demands) and the patient’s question (lower perceived urgency).

Working memory. The physician is holding: Patient A’s pending lab results (CBC, BMP, lactate: 3 items awaiting interpretation), Patient B’s medication allergies (penicillin, sulfa: 2 items constraining the antibiotic order being formulated), Patient C’s timeline (arrived 90 minutes ago, CT ordered 40 minutes ago and should be back: 2 items), and Patient D’s chief complaint and initial vitals (chest pain, BP 158/92, HR 104: 3 items). That is 10+ items across four patient contexts, only one of which is fully active in WM at any moment. Each patient’s context alone sits near Cowan’s 4 ± 1 limit, leaving little room for manipulation, and the aggregate load is far above it.

Decision. Patients A, B, and C present familiar patterns. The physician is operating in rule-based mode (Rasmussen): if lactate > 2, then sepsis pathway; if penicillin allergy, then azithromycin; if CT delayed, then call radiology. Patient D is different. The chest pain is atypical — pleuritic, positional, with a history of recent air travel. This requires knowledge-based reasoning to evaluate PE risk versus MSK pain. But the physician’s System 2 capacity is depleted by the concurrent demands of the other three patients. The SRK framework predicts that the physician will default to rule-based processing (apply the ACS protocol) rather than engaging the knowledge-based analysis that would flag the PE risk.

Where the architecture predicts failure. The failure point is at the intersection of attention overload and working memory saturation. The physician cannot simultaneously maintain four patient contexts in WM, process incoming information from multiple channels, and engage knowledge-based reasoning for the atypical case. The system will fail at Patient D — the case that requires the most cognitive resources but receives the least, because the other three patients’ demands have consumed the available capacity. This is not a failure of competence. It is a failure of architecture, as predictable as a server crash under excessive load.
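The working-memory tally in this case can be reproduced as a small calculation. The item labels paraphrase the case description; counting each as exactly one independent item is an assumption, since an experienced physician may chunk some of them.

```python
WM_CAPACITY_ITEMS = 4  # Cowan's central estimate

# Items the physician is holding, per patient, as listed in the case.
held_items = {
    "Patient A": ["CBC pending", "BMP pending", "lactate pending"],
    "Patient B": ["penicillin allergy", "sulfa allergy"],
    "Patient C": ["arrived 90 min ago", "CT ordered 40 min ago"],
    "Patient D": ["chest pain", "BP 158/92", "HR 104"],
}

total_items = sum(len(items) for items in held_items.values())
overload_ratio = total_items / WM_CAPACITY_ITEMS

print(f"{total_items} items held against a capacity of {WM_CAPACITY_ITEMS} "
      f"({overload_ratio:.1f}x)")
```

Even before any new information arrives, the aggregate load is two and a half times the central capacity estimate.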

The Product Owner Lens

What is the human behavior problem? Clinicians managing multiple patients in high-acuity settings exceed the processing capacity of their cognitive architecture. Information is lost, attention is misallocated, and decisions default to lower-fidelity processing modes.

What cognitive mechanism explains it? The information-processing pipeline has fixed capacity at each stage. Attention (Wickens’ MRT) and working memory (Cowan’s 4 ± 1) are the binding constraints. When demand exceeds capacity, the system sheds load — dropping monitoring tasks, losing WM contents, defaulting from knowledge-based to rule-based processing.

What design lever improves it? Externalize working memory (display patient status, pending results, and active plans in a persistent, glanceable format). Reduce attention competition (consolidate information streams, eliminate low-value interruptions). Support decision mode-matching (flag cases that require knowledge-based reasoning so clinicians allocate System 2 resources appropriately).

What should software surface? (a) A cognitive load indicator — not a subjective self-report but a proxy derived from observable system state: number of active patients, acuity levels, pending orders, recent interruption frequency. (b) Persistent patient context displays that offload WM — the equivalent of a chess clock showing each patient’s pending items, time-in-department, and next required action. (c) Anomaly flags that identify cases deviating from common patterns, prompting the clinician to engage analytical rather than pattern-matching reasoning.

What metric reveals degradation earliest? Time-to-acknowledge for new results and orders. When a clinician’s acknowledgment latency increases — labs sit unreviewed for 20 minutes instead of the usual 8 — the cognitive pipeline is saturated. This metric is measurable from EHR audit logs and precedes clinical errors by a window that may allow intervention (reassignment, backup, load shedding).
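A minimal sketch of the time-to-acknowledge metric, assuming audit-log rows can be reduced to pairs of (result posted, result acknowledged) timestamps. The 8-minute baseline comes from the example above; the doubling threshold and the function names are assumptions, and a real deployment would need per-clinician or per-unit baselines.

```python
from datetime import datetime, timedelta
from statistics import median


def ack_latencies_minutes(events):
    """Minutes between a result posting and its acknowledgment, per event."""
    return [(acked - posted).total_seconds() / 60 for posted, acked in events]


def is_saturated(latencies, baseline_minutes, ratio_threshold=2.0):
    """Flag when median latency reaches ratio_threshold times the baseline."""
    return median(latencies) >= baseline_minutes * ratio_threshold


start = datetime(2024, 1, 1, 14, 0)  # arbitrary illustrative timestamp


def make_events(minutes_to_ack):
    """Build hypothetical (posted, acknowledged) pairs five minutes apart."""
    return [(start + timedelta(minutes=5 * i),
             start + timedelta(minutes=5 * i + m))
            for i, m in enumerate(minutes_to_ack)]


usual = ack_latencies_minutes(make_events([7, 8, 9]))
surge = ack_latencies_minutes(make_events([18, 20, 22]))

print(median(usual), is_saturated(usual, baseline_minutes=8))
print(median(surge), is_saturated(surge, baseline_minutes=8))
```

The median is used rather than the mean so that a single long-delayed acknowledgment does not trigger the flag on its own.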

Warning Signs

The system relies on memory instead of displays. Any workflow that requires a clinician to hold more than 3-4 items in memory without external support is designed to fail under load. If handoffs are verbal-only, if patient assignments lack visible status boards, if medication calculations happen in the head — the system is betting against the architecture.

Interruptions are treated as a discipline problem. If leadership responds to interruption-driven errors with “clinicians need to focus better,” the organization has misdiagnosed an architectural problem as a motivational one. Interruptions are structural features of the environment. They must be engineered out or their consequences must be engineered around.

Training is the response to every error. When the root cause analysis for a cognitive overload error concludes with “retraining,” it has failed to address the system condition that guaranteed the error. Retraining does not expand working memory. It does not add attentional channels. It does not slow the arrival rate of competing demands. The architecture has not changed, so the failure mode has not changed.

High performers are used as evidence the system works. Some clinicians perform adequately under extreme cognitive load — through exceptional chunking, aggressive external memory use, or simply higher baseline WM capacity. Their success does not validate the system design. It demonstrates survivorship bias. The question is not whether the best operators can cope but whether the median operator can sustain safe performance across a full shift.

Integration Hooks

OR Module 2 (Queueing Foundations). Cognitive processing is itself a queueing system. Tasks arrive at the attentional bottleneck, queue when capacity is exceeded, and are serviced in an order determined by urgency and salience rather than arrival time. The same queueing dynamics that govern patient flow — utilization-delay curves, variability effects, abandonment — govern cognitive task processing. When attentional utilization approaches 100%, cognitive task “wait times” (the delay before a clinician processes a new piece of information) increase nonlinearly, precisely as Kingman’s approximation predicts for any single-server queue. This is not an analogy. It is the same mathematics applied to a different server.
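Kingman’s approximation for the mean wait in a single-server (G/G/1) queue is E[Wq] ≈ (ρ / (1 − ρ)) × ((ca² + cs²) / 2) × τ, where ρ is utilization, ca² and cs² are the squared coefficients of variation of interarrival and service times, and τ is the mean service time. The sketch below evaluates it at several utilization levels; mapping “attentional utilization” onto ρ is the modeling assumption this page makes, and the parameter values are illustrative.

```python
def kingman_wait(utilization: float, ca2: float, cs2: float,
                 mean_service_time: float) -> float:
    """Approximate mean queueing delay for a G/G/1 queue (Kingman's formula)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    if ca2 < 0 or cs2 < 0 or mean_service_time <= 0:
        raise ValueError("variability terms must be >= 0 and service time > 0")
    return (utilization / (1 - utilization)) * ((ca2 + cs2) / 2) * mean_service_time


# Illustrative parameters: moderately variable arrivals and service (ca2 = cs2 = 1),
# with one time unit needed to process each piece of information.
for rho in (0.50, 0.80, 0.90, 0.95):
    wait = kingman_wait(rho, ca2=1.0, cs2=1.0, mean_service_time=1.0)
    print(f"utilization {rho:.2f}: mean wait is about {wait:.1f} service times")
```

The delay roughly doubles between 90% and 95% utilization, which is the nonlinear growth the paragraph describes. Higher variability in arrivals (more erratic interruptions) scales the whole curve upward.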

Workforce Module 1 (Workforce as Capacity). Cognitive architecture sets the hard limit on what a worker can process per unit time. Workforce capacity models that count FTEs or hours without accounting for cognitive throughput will overestimate effective capacity. A nurse staffing model that assumes each nurse can manage 5 patients based on time-and-motion analysis of individual tasks will underperform if the aggregate cognitive load of 5 patients exceeds the attentional and WM capacity that the architecture permits. The workforce model must incorporate cognitive capacity as a constraint, not just physical presence and task-time.

Key Frameworks and References

Broadbent’s filter theory (1958) — foundational information-processing model establishing attention as a selective filter
Miller, “The Magical Number Seven” (1956) — canonical short-term memory capacity estimate, revised by Cowan
Cowan’s embedded-processes model (2001) — revised WM capacity to 4 ± 1 items under controlled conditions
Wickens’ Multiple Resource Theory (2002, 2008) — multi-dimensional model of attentional resource allocation
Kahneman, Thinking, Fast and Slow (2011) — dual-process theory of judgment and decision-making (System 1 / System 2)
Rasmussen’s SRK framework (1983) — skill-based, rule-based, knowledge-based taxonomy of human performance
Klein’s Recognition-Primed Decision model (1998) — naturalistic decision-making in time-pressured expert domains
Endsley’s situation awareness model (1995) — three-level model (perception, comprehension, projection) of operator awareness
Norman’s action theory (1981) — slip/mistake distinction in human error classification
Weber’s Law — just-noticeable difference as a constant proportion of baseline stimulus intensity
Simons and Chabris (1999) — inattentional blindness; Mack and Rock (1998) — foundational experimental demonstrations
Monsell (2003) — task-switching costs and the time penalty of attention reallocation
Fitts’ Law (1954) — speed-accuracy tradeoff in motor response, foundational to interface target design