OR Metrics for Operators
Seven Numbers for Monday Morning
Most healthcare dashboards fail operators. They display thirty metrics in six tabs, none of them connected to a specific intervention, most of them lagging indicators of problems that were actionable two weeks ago. The dashboard was designed by someone who asked “what data do we have?” rather than “what decision does this metric support?”
Operations research offers a different starting point. Every OR model produces outputs that map to operator decisions: where is the system stressed, what is the binding constraint, and what happens if I do nothing? A small set of OR-derived metrics — each grounded in the theory covered in this discipline’s preceding modules — gives an operator more decision power than a wall of bar charts.
This page defines that metric set, explains why each metric matters mechanistically, specifies the thresholds that signal action, and shows how to read them together. It is the page an operator reads first.
The Problem with Existing Dashboards
Healthcare operations dashboards share four structural failures:
Too many metrics, no decision hierarchy. Thirty metrics presented with equal visual weight means none of them communicates urgency. An operator scanning a dashboard needs to know, in ten seconds, whether something requires action today. Volume counts, revenue figures, patient satisfaction scores, and quality indicators — arrayed side by side — cannot answer that question. They describe the system’s history without diagnosing its present state.
No theoretical grounding. Most dashboard metrics are selected because the data is available, not because the metric connects to a mechanism. “Average length of stay” appears because the EHR calculates it. But without connecting LOS to Little’s Law (L = lambda W), the operator cannot determine whether a rising LOS indicates a service-time problem, an arrival-rate problem, or a discharge bottleneck. The metric is descriptive without being diagnostic.
Lagging indicators dominate. Monthly readmission rates, quarterly patient satisfaction scores, annual revenue trends — these tell you what already happened. By the time a lagging indicator moves, the operational failure it reflects has been compounding for weeks. OR-derived metrics are designed to be leading indicators: utilization position on the delay curve warns of wait-time explosions before they materialize; float erosion warns of schedule overruns before milestones are missed.
No mechanism connecting metric to intervention. A dashboard that shows wait times are rising without indicating whether the cause is high utilization, high variability, or a constraint shift leaves the operator knowing they have a problem but not knowing which lever to pull. OR metrics carry their intervention logic with them — each one points to a specific class of response.
The Essential OR Metric Set
Seven metrics. Each is grounded in a specific module. Each has a threshold logic. Each points to an intervention.
1. Utilization — Position on the Delay Curve
What it measures: The fraction of a resource’s capacity consumed by demand. Provider utilization, bed occupancy, appointment slot fill rate, OR suite hours used.
Why it matters (Module 2 — Utilization-Delay Curve): Utilization is not a percentage to maximize. It is a position on a nonlinear curve. The Kingman approximation (the VUT equation) shows that wait time is proportional to rho/(1-rho), where rho is utilization. At 70%, this factor is 2.3. At 85%, it is 5.7. At 93%, it is 13.3. The same resource, getting modestly busier, produces radically longer waits. A utilization number without its position on this curve is operationally meaningless.
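A minimal sketch of this computation, assuming the standard single-server Kingman form; the function name and parameters below are illustrative, not part of any module's tooling:

```python
# Kingman (VUT) approximation for a single-server queue:
# expected queueing delay ~= V * U * T, where
#   V = (ca^2 + cs^2) / 2 combines arrival- and service-time variability,
#   U = rho / (1 - rho) is the nonlinear utilization factor,
#   T = mean service time.

def kingman_wait(rho: float, ca: float, cs: float, service_time: float = 1.0) -> float:
    """Approximate expected wait in queue, in units of service_time."""
    if not 0 <= rho < 1:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    variability = (ca**2 + cs**2) / 2
    utilization_factor = rho / (1 - rho)
    return variability * utilization_factor * service_time

# Reproduce the factors quoted above (CV = 1 on arrivals and service).
for rho in (0.70, 0.85, 0.93):
    print(f"rho={rho:.0%}: utilization factor {rho / (1 - rho):.1f}, "
          f"wait {kingman_wait(rho, 1.0, 1.0):.1f}x service time")
```

The same modest increases in rho (70% to 85% to 93%) multiply the wait factor from 2.3 to 5.7 to 13.3, which is the nonlinearity the zone coloring below encodes.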
How to display it: Not as a single number. Show utilization as a position on the curve itself, with zone coloring:
- Green (below 75%): Recovery slack is adequate. Variability surges drain quickly.
- Yellow (75-85%): Approaching the knee. Peak-day utilization likely exceeds 90%. Monitor weekly.
- Red (above 85%): Past the knee in any high-variability service. Wait times are in nonlinear territory. Act this week.
These thresholds shift with variability. A low-variability service (scheduled procedures, standardized protocols) tolerates higher utilization before the knee. A high-variability service (ED, behavioral health intake, crisis response) hits the knee earlier. The dashboard must display variability-adjusted zones, not one-size-fits-all color bands.
Threshold that signals action: Utilization above 85% sustained over a rolling 2-week period in any service line with a coefficient of variation above 0.7.
Intervention it points to: Add capacity at the margin (the return is highest precisely here, per the curve’s nonlinearity) or reduce variability through demand smoothing and service standardization.
2. Wait Time — Average AND Percentile
What it measures: Time from a patient’s entry into a queue (referral placed, appointment requested, arrival at ED) to the start of service.
Why it matters (Module 2 — Queueing Foundations, Little’s Law): Wait time is the patient-facing consequence of utilization. But average wait time hides the tail — the patients who wait three or five times the mean. Little’s Law (L = lambda W) connects wait time to system occupancy: if you know two of the three variables, you can derive the third. A rising average wait at stable arrival rates means service time is increasing. A rising P90 wait at stable average wait means variability is growing — the system is approaching the steep part of the curve even though the mean has not moved yet.
How to display it: Two numbers, always together: mean and P90 (90th percentile). The ratio P90/P50 is the variability diagnostic. In a healthy system, this ratio runs 2:1 to 3:1. When it exceeds 4:1, the tail is growing — variability is outrunning capacity buffers. This is the earliest wait-time warning signal, and it arrives before the mean triggers any threshold (Module 1 — Deterministic vs. Stochastic).
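A sketch of the two-number readout and the tail diagnostic, using synthetic lognormal waits as a stand-in for real data (the sample sizes and distribution parameters are illustrative only):

```python
import numpy as np

def wait_time_panel(waits: np.ndarray) -> dict:
    """Mean, P50, P90, and the P90/P50 tail ratio for a sample of waits."""
    p50, p90 = np.percentile(waits, [50, 90])
    return {"mean": round(float(np.mean(waits)), 1),
            "p50": round(float(p50), 1),
            "p90": round(float(p90), 1),
            "tail_ratio": round(float(p90 / p50), 1)}  # > 4.0 means the tail is growing

rng = np.random.default_rng(7)
healthy = rng.lognormal(mean=1.0, sigma=0.5, size=500)    # tail ratio near 1.9
stressed = rng.lognormal(mean=1.0, sigma=1.2, size=500)   # tail ratio near 4.7
print(wait_time_panel(healthy))
print(wait_time_panel(stressed))
```

The two samples can have similar means while only the tail ratio separates them, which is why the ratio is the earlier signal.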
Threshold that signals action: P90 wait time exceeding the organization’s access standard (e.g., 14 days for behavioral health intake, 30 minutes for ED bed assignment). Or P90/P50 ratio above 4:1 and trending upward over 4 weeks.
Intervention it points to: If mean is high, the system needs capacity (more servers, longer hours, faster service). If P90 is high but mean is moderate, the system needs variability reduction (smoothing arrivals, standardizing service, eliminating rework loops).
3. Throughput — Entities Served Per Unit Time
What it measures: Completed encounters, discharges, intakes, claims processed, referrals completed — per day, per week, per provider.
Why it matters (Modules 2, 5 — Queueing and Scheduling): Throughput is the capacity check. Little’s Law says L = lambda W, so throughput (lambda, the effective output rate) combined with average time-in-system determines the steady-state census. Throughput should be stable or rising. When it is flat while utilization is rising, something is consuming capacity without producing output — rework, administrative burden, longer-than-necessary service times, or scheduling inefficiency.
How to display it: Throughput per resource (provider, service line, site) trended over time. Overlay with utilization. The diagnostic power is in the comparison: utilization rising while throughput is flat means the system is doing more work per unit of output. This is the operational definition of efficiency loss.
Threshold that signals action: Throughput declining more than 10% over a 4-week rolling period at stable demand. Or throughput flat while utilization has risen 5+ percentage points — the efficiency gap is widening.
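A minimal rule sketch for these two threshold checks; the 2% band used to call throughput "flat" is an assumption, not a number from the modules:

```python
def throughput_alerts(tp_prior: float, tp_now: float,
                      util_prior: float, util_now: float,
                      demand_stable: bool = True) -> list[str]:
    """Evaluate the two throughput thresholds over a 4-week window."""
    alerts = []
    if demand_stable and tp_now < 0.90 * tp_prior:
        alerts.append("throughput down >10% at stable demand: investigate process")
    if abs(tp_now - tp_prior) / tp_prior < 0.02 and util_now - util_prior >= 0.05:
        alerts.append("throughput flat while utilization up 5+ points: efficiency gap")
    return alerts

# Example: output holding at 22/month while utilization climbs from 84% to 91%.
print(throughput_alerts(tp_prior=22, tp_now=22, util_prior=0.84, util_now=0.91))
```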
Intervention it points to: Investigate what is consuming capacity: increased documentation burden, longer case complexity, scheduling gaps, rework from prior-authorization denials. Throughput degradation at stable utilization almost always points to a process problem, not a demand problem.
4. Abandonment Rate — The Invisible Demand Signal
What it measures: The fraction of demand that enters the system but exits before receiving service. LWBS (left without being seen) in the ED, no-shows in scheduled care, referral drop-off, prescription abandonment.
Why it matters (Module 2 — Abandonment and Access): Abandonment is invisible by default. The patient who leaves generates no chart note, no billing record, no complaint. When your metrics only count served patients, you undercount true demand and overestimate access. The Erlang-A model shows that abandonment rate rises nonlinearly with utilization — the same curve that produces long waits drives patients past their patience threshold. Stable throughput with rising abandonment means the system is shedding load to maintain the appearance of adequacy. True demand = served + abandoned. Without this equation, capacity planning chases observed demand and never closes the gap.
How to display it: Abandonment rate by stage of the care pathway: referral-to-scheduled, scheduled-to-attended, attended-to-completed. The stage with the lowest conversion rate is the binding constraint on access. Track abandonment rate alongside wait time — when both rise together, the system is past the knee and patients are leaving.
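A sketch of the funnel readout, with hypothetical stage names and counts; it also recovers true demand from the completion rate, per the equation above:

```python
def funnel_diagnostics(stage_counts: dict[str, int]) -> dict:
    """Stage-to-stage conversion, binding stage, and true vs. served demand.

    stage_counts maps ordered pathway stages to entity counts.
    """
    stages = list(stage_counts.items())
    conversions = {f"{a} -> {b}": n_b / n_a
                   for (a, n_a), (b, n_b) in zip(stages, stages[1:])}
    return {
        "conversions": {k: round(v, 2) for k, v in conversions.items()},
        "binding_stage": min(conversions, key=conversions.get),  # lowest conversion
        "true_demand": stages[0][1],       # everyone who entered the pathway
        "served": stages[-1][1],           # everyone who completed it
        "abandonment_rate": round(1 - stages[-1][1] / stages[0][1], 2),
    }

# Illustrative month of referrals.
print(funnel_diagnostics({"referred": 100, "scheduled": 72,
                          "attended": 55, "completed": 50}))
```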
Threshold that signals action: LWBS above 2% (Emergency Department Benchmarking Alliance benchmark). Scheduled-care no-show rate above 20%. Referral completion below 60%. Any abandonment metric rising for 3 consecutive measurement periods.
Intervention it points to: Reduce wait time at the stage where abandonment concentrates. If abandonment peaks at referral-to-scheduled, the failure is in the handoff. If it peaks at scheduled-to-attended and correlates with wait-to-appointment time, the system needs faster access, not better reminder calls.
5. Critical Path Status — Float Remaining on Key Tasks
What it measures: For transformation programs, capital projects, and grant-funded implementations: how much schedule slack remains on the tasks that determine the project end date.
Why it matters (Module 4 — Critical Path): The critical path is the longest dependent chain through a project network. Tasks on it have zero float — any delay extends the deadline by the same amount. Tasks off it can slip without consequence, sometimes by months. Most programs treat all tasks as equally urgent, directing energy toward visible workstreams (hiring, community engagement) while invisible critical-path tasks (IT procurement, regulatory approvals) quietly slip. Float erosion rate — how fast non-critical tasks are consuming their slack — is the leading indicator of schedule crisis, arriving weeks before a milestone is actually missed.
How to display it: The critical path highlighted in the project network, with float values for every task updated against actuals. A float burndown chart for the 5 tasks closest to criticality. Alert when any task’s remaining float drops below 2 weeks.
Threshold that signals action: Any critical-path task behind schedule (it has no float to absorb the slip, so the end date has already moved). Any non-critical task with float consumed faster than calendar time elapsed (it is migrating toward the critical path). PERT analysis showing greater than 30% probability of exceeding the grant deadline.
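The float-erosion and PERT checks are mechanical. A minimal sketch, assuming a hypothetical task-record layout and three-point (optimistic, most likely, pessimistic) duration estimates; the field names are placeholders:

```python
import math

def float_erosion_flags(tasks: list[dict]) -> list[tuple[str, float]]:
    """Flag non-critical tasks consuming float faster than calendar time passes."""
    flags = []
    for t in tasks:
        if t["total_float_days"] > 0:  # non-critical: it has float to burn
            erosion = t["float_consumed_days"] / max(t["calendar_days_elapsed"], 1)
            if erosion > 1.0:          # migrating toward the critical path
                flags.append((t["name"], round(erosion, 2)))
    return flags

def pert_overrun_probability(critical_path: list[tuple], deadline_days: float) -> float:
    """P(project end > deadline) under the PERT normal approximation."""
    mean = sum((a + 4 * m + b) / 6 for a, m, b in critical_path)
    var = sum(((b - a) / 6) ** 2 for a, m, b in critical_path)
    z = (deadline_days - mean) / math.sqrt(var)
    return 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))  # 1 - Phi(z)

tasks = [{"name": "community engagement", "total_float_days": 30,
          "float_consumed_days": 12, "calendar_days_elapsed": 8}]
print(float_erosion_flags(tasks))  # [('community engagement', 1.5)]
# Critical-path tasks as (optimistic, most likely, pessimistic) day estimates.
print(pert_overrun_probability([(10, 15, 30), (20, 25, 45), (5, 10, 20)], deadline_days=60))
```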
Intervention it points to: Concentrate resources on critical-path tasks. If the IT procurement is on the critical path and slipping, reassign the procurement officer from non-critical workstreams. Compress critical-path tasks, not all tasks. Never apply “all workstreams accelerate” mandates — they waste effort on tasks with float.
6. Constraint Identification — Which Resource Has the Highest Shadow Price?
What it measures: Among all limited resources (provider hours, beds, OR suites, compliance staff, budget), which one is the binding constraint — the one whose marginal relaxation would improve system performance the most?
Why it matters (Module 3 — Shadow Prices): Shadow prices are the dual variables from constrained optimization. A shadow price of $4,200 on PACU beds means one additional bed-hour improves throughput by $4,200. A shadow price of $0 on OR suites means adding OR capacity improves nothing — the bottleneck is elsewhere. Most healthcare investment decisions optimize non-binding constraints because the visible, expensive resource (the OR, the specialist) gets the advocacy while the true bottleneck (recovery beds, compliance staff, scheduling coordinators) goes unfunded.
How to display it: A ranked list of operational constraints by shadow price (or a proxy: utilization relative to capacity, where the tightest resource with the highest downstream impact ranks first). Update quarterly or when major resource changes occur. Flag when the binding constraint shifts — the strategic picture has changed even if daily operations look stable.
Threshold that signals action: Any resource where slack drops below 10% of capacity (it is approaching binding status). Any investment proposal targeting a resource with a zero or near-zero shadow price (the investment will not improve performance).
Intervention it points to: Invest in the binding constraint. If compliance staff are the bottleneck on a grant program (Module 3 example: shadow price of $340/hour vs. $210/hour for clinicians), add compliance capacity before adding clinicians. If PACU beds constrain surgical throughput, accelerate recovery protocols before building new ORs.
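For readers who want to see where dual values come from, here is a minimal sketch using scipy's linprog; the two-resource program and every coefficient are toy numbers, not Module 3 data:

```python
from scipy.optimize import linprog

# Toy program: maximize weekly visits produced by two activities.
# linprog minimizes, so the objective is negated.
c = [-3, -2]
A_ub = [[1, 0],   # clinician hours:        x1          <= 40
        [0, 1],   # compliance-staff hours:        x2   <= 25
        [1, 2]]   # scheduling slots:       x1 + 2 * x2 <= 70
b_ub = [40, 25, 70]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")

# With the HiGHS backends, res.ineqlin.marginals holds the duals of the
# <= constraints; negating recovers shadow prices for the maximization.
for name, m in zip(["clinician hours", "compliance hours", "scheduling slots"],
                   res.ineqlin.marginals):
    print(f"{name}: shadow price {-m:.2f} visits per extra unit")
```

In this toy instance compliance hours come back with a zero shadow price: the resource is slack, and funding more of it buys nothing until the binding constraints move.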
7. Variability — Coefficient of Variation on Arrivals and Service Times
What it measures: The coefficient of variation (CV = standard deviation / mean) of inter-arrival times and service durations. This is the “V” in the VUT equation.
Why it matters (Module 1 — Deterministic vs. Stochastic): Variability determines where the knee of the utilization-delay curve falls. A system with CV of 0.5 can sustain 85% utilization with moderate waits. A system with CV of 1.2 hits the knee at 65-70%. The Variability Buffering Law (Hopp and Spearman, Factory Physics) guarantees that variability will be buffered by capacity, time, or both — the operator chooses which. Rising variability at stable utilization shifts the effective operating point rightward on the delay curve, producing queue buildup before utilization metrics trigger an alarm.
How to display it: CV for arrival times and service times by service line, updated weekly. Trend over time. Display alongside the utilization zone thresholds, which should shift when variability changes. A service line at 80% utilization with CV of 1.0 is in a different zone than one at 80% utilization with CV of 0.5 — the dashboard must reflect this.
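One way to compute the variability-adjusted zone boundary is to solve the Kingman form for the utilization at which predicted wait reaches a tolerable multiple of service time; that tolerable multiple is a policy choice, shown here at two assumed values:

```python
def knee_utilization(ca: float, cs: float, max_wait_multiple: float) -> float:
    """Utilization where V * rho / (1 - rho) = T, i.e. rho* = T / (V + T)."""
    v = (ca**2 + cs**2) / 2
    return max_wait_multiple / (v + max_wait_multiple)

# Rising CV pulls the knee leftward at any wait-tolerance policy.
for cv in (0.5, 0.9, 1.2):
    knees = [knee_utilization(cv, cv, t) for t in (2.0, 3.0)]
    print(f"CV={cv}: knee between {knees[0]:.0%} and {knees[1]:.0%}")
```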
Threshold that signals action: CV above 1.0 on either arrivals or service times in any service line operating above 75% utilization. CV trending upward for 4+ weeks at any utilization level — the system is becoming less predictable, and the curve’s knee is migrating leftward.
Intervention it points to: Reduce variability at the source. Smooth arrivals through scheduled access models and demand shaping. Standardize service through protocols, pre-visit planning, and templated workflows. Reduce rework (prior-authorization denials, incomplete referral packets, repeated labs). Variability reduction is operationally equivalent to adding capacity (Module 2) — and usually cheaper.
Reading the Metrics Together: The Monday Morning Decision Framework
Individual metrics diagnose. Metric combinations prescribe.
If utilization is high AND wait time is rising AND abandonment is increasing: The system is past the knee of the delay curve. Patients are experiencing the nonlinear wait-time explosion and leaving. The instinct to book more patients into existing capacity will make it worse. The intervention is capacity addition or variability reduction — not demand management, which is already happening involuntarily through abandonment.
If throughput is flat but utilization is rising: Something is consuming capacity without producing output. The most common culprits: increased documentation burden, longer average case complexity, scheduling inefficiency (gaps between patients despite full templates), or rework from denials and incomplete orders. This is an efficiency problem, not a demand problem. Investigate process, not staffing.
If abandonment is rising but utilization appears moderate: The system is shedding load before it registers as high utilization. True demand exceeds what the utilization metric shows because abandoned patients never consume capacity. Recalculate utilization against true demand (served + abandoned). The actual operating point on the delay curve is higher than the dashboard suggests.
If variability is rising at stable utilization: The knee of the curve is migrating leftward. The same utilization that was safe last quarter is now in the yellow zone. This is the earliest warning signal — it precedes wait-time increases, which precede abandonment increases. Check for sources: new patient populations with different acuity mix, schedule changes that cluster demand, staffing changes that increase service-time variance.
If the binding constraint has shifted: The resource that was the bottleneck last quarter may no longer be the bottleneck. If you expanded PACU capacity and the shadow price on PACU beds dropped to zero while the shadow price on anesthesiologist hours rose, the system’s strategic picture has changed. Continued investment in PACU is now waste. Re-solve for the new bottleneck.
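The framework reduces to a small rule table. A sketch, assuming the flags have already been evaluated against the thresholds defined earlier (the flag names are illustrative):

```python
def monday_morning_read(m: dict[str, bool]) -> str:
    """Map a service line's metric combination to the framework's prescription."""
    if m["util_high"] and m["wait_rising"] and m["abandonment_rising"]:
        return "past the knee: add capacity or reduce variability, not demand management"
    if m["throughput_flat"] and m["util_rising"]:
        return "efficiency loss: investigate process, not staffing"
    if m["abandonment_rising"] and not m["util_high"]:
        return "hidden load shedding: recompute utilization against served + abandoned"
    if m["cv_rising"] and not m["util_rising"]:
        return "knee migrating leftward: find the new variability source"
    if m["constraint_shifted"]:
        return "bottleneck moved: re-solve and redirect investment"
    return "stable: no action"

behavioral_health = {"util_high": True, "wait_rising": True, "abandonment_rising": True,
                     "throughput_flat": True, "util_rising": True,
                     "cv_rising": False, "constraint_shifted": False}
print(monday_morning_read(behavioral_health))  # past the knee: add capacity or ...
```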
What NOT to Put on the Dashboard
Metrics without intervention logic. If the operator cannot name the action they would take when the metric crosses a threshold, the metric does not belong on the operational dashboard. It may belong in a monthly report or a strategic review — but not in the Monday morning view.
Vanity metrics. Total patient volume, total revenue, total referrals placed — these feel important but carry no diagnostic information. Volume can rise while the system deteriorates (more patients entering, more patients abandoning). Revenue can rise while throughput efficiency declines. Put these in the CFO’s quarterly deck, not the operations dashboard.
Metrics that can be gamed. Average wait time, measured only from patients who complete service, improves when the longest-waiting patients abandon. Provider productivity measured in RVUs rewards documentation-heavy coding patterns rather than clinical efficiency. Any metric where improving the number does not require improving the system is gameable and should be supplemented with its countermeasure (abandonment rate alongside wait time, panel access alongside RVUs).
Metrics that require context the dashboard cannot provide. Patient satisfaction scores are meaningful but unactionable on a Monday morning. A 3.2 score does not tell the COO which service line is the problem, what mechanism produced the dissatisfaction, or what operational lever to pull. Satisfaction is an outcome metric. The OR metrics on this dashboard are the diagnostic layer that explains why satisfaction changes.
Healthcare Example: A Rural Health System COO’s Monday Morning
Dr. Vasquez is the COO of a critical access hospital with three service lines: primary care (4 providers), behavioral health (2 providers), and emergency services (24/7 ED with 8 treatment spaces). She opens her dashboard at 7:15 AM Monday. Seven metrics, three service lines. The review takes ten minutes.
| Metric | Primary Care | Behavioral Health | Emergency |
|---|---|---|---|
| Utilization | 78% (green) | 91% (red) | 82% (yellow) |
| Wait time — mean | 4 days | 32 days | 38 min |
| Wait time — P90 | 8 days | 58 days | 94 min |
| Throughput | 68 visits/week (stable) | 22 intakes/month (down 12%) | 41 patients/day (stable) |
| Abandonment | 14% no-show (stable) | 34% referral drop-off (up from 26%) | 1.8% LWBS (stable) |
| Constraint | None binding | Provider hours (shadow price highest) | Bed turnover approaching binding |
| Variability (CV) | 0.6 arrivals / 0.5 service | 0.9 arrivals / 0.8 service | 1.1 arrivals / 0.7 service |
Primary care is healthy. Utilization in the green zone, stable throughput, no-show rate within normal range, no binding constraint. No action needed.
Behavioral health is in crisis. Utilization at 91% with high variability (CV 0.9 arrivals) means the system is deep into the nonlinear zone — the VUT equation at these parameters predicts wait times roughly 8x the mean service time. The 32-day mean wait confirms this. Throughput is down 12%, which means the system is producing less output while running hotter — a process efficiency problem compounding the capacity problem. Abandonment has risen from 26% to 34%, meaning the system is shedding one-third of referred patients before they receive care. True demand is not 22 intakes per month; it is approximately 33 (22 served / 0.66 completion rate). The constraint analysis confirms provider hours are binding. Dr. Vasquez’s decision: this is the one service line that needs action this week. She authorizes a locum behavioral health provider search, requests a review of intake protocols for service-time standardization (attack variability), and asks the BH lead to calculate whether telehealth intakes could be pooled with a partner FQHC (pooling reduces the effective variability per the Erlang-C multi-server model).
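The arithmetic behind that read, restated in a few lines (reusing the Kingman form sketched earlier; the inputs come straight from the table):

```python
rho, cv = 0.91, 0.9
v = (cv**2 + cv**2) / 2                 # variability term = 0.81
wait_multiple = v * rho / (1 - rho)     # ~8.2x mean service time
served, completion_rate = 22, 1 - 0.34  # 34% referral drop-off
true_demand = served / completion_rate  # ~33 intakes/month of real demand
print(f"predicted wait ~{wait_multiple:.1f}x service time; "
      f"true demand ~{true_demand:.0f} intakes/month")
```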
Emergency services deserves monitoring. Utilization is in the yellow zone at 82%, but the high arrival variability (CV 1.1) means the effective operating point is closer to the knee than 82% suggests. The P90-to-mean ratio is 94/38 ≈ 2.5 (the table reports the mean rather than P50, so this is a proxy for the tail ratio), within the healthy range, so the tail is not yet growing. Bed turnover is approaching binding status — if it binds, boarding will begin. Dr. Vasquez flags this for the weekly operations meeting but takes no immediate action. If the CV trends upward or utilization crosses 85% next week, she will escalate.
Total review time: ten minutes. One service line identified for action. Two confirmed as stable. Zero time spent on vanity metrics or contextless numbers. The dashboard earns its screen space because every metric on it connects to a mechanism, and every mechanism connects to an intervention.
Product Design Spec
A dashboard implementing this metric set must follow the progressive disclosure principles described in Human Factors Module 6 (Product Design). The seven metrics across service lines are the top layer. Each metric cell is a click-through to the underlying data: the utilization-delay curve with the current position marked, the wait-time distribution histogram, the throughput trend, the abandonment funnel by stage, the project network with critical path highlighted, the constraint ranking, the variability trend.
Cognitive load constraint: No more than 21 cells visible on the primary view (7 metrics x 3 service lines). Color coding carries the urgency signal — the operator’s eye goes to red first. Numbers provide precision for the cells that earned attention through color.
Alert logic: Alerts fire on threshold breaches, not on metric values. A utilization of 88% is not inherently an alert — but 88% sustained for 2 weeks in a service line with CV above 0.7 is. Alerts must include the metric, the threshold, and the suggested intervention class. “Behavioral health utilization has exceeded 85% for 3 weeks with high variability. Consider capacity addition or variability reduction.” This is the minimum viable OR intelligence: not just what is wrong, but what category of response the theory supports.
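A sketch of how that alert rule might be encoded, with the threshold, persistence window, and intervention text as configurable fields (the class name and defaults are illustrative):

```python
from dataclasses import dataclass

@dataclass
class UtilizationAlertRule:
    """Fire only on sustained breaches in high-variability service lines."""
    util_threshold: float = 0.85
    cv_threshold: float = 0.7
    sustain_weeks: int = 2

    def evaluate(self, weekly_util: list[float], cv: float) -> str | None:
        recent = weekly_util[-self.sustain_weeks:]
        sustained = (len(recent) == self.sustain_weeks
                     and all(u > self.util_threshold for u in recent))
        if cv > self.cv_threshold and sustained:
            return (f"utilization above {self.util_threshold:.0%} for "
                    f"{self.sustain_weeks}+ weeks with CV {cv}: "
                    "consider capacity addition or variability reduction")
        return None  # a single-week spike does not fire

rule = UtilizationAlertRule()
print(rule.evaluate([0.83, 0.88, 0.89], cv=0.9))  # fires: sustained breach
print(rule.evaluate([0.83, 0.88, 0.89], cv=0.5))  # None: low-variability line
```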
Update cadence: Utilization, wait time, throughput, and abandonment update daily. Constraint identification updates weekly or on significant staffing/resource changes. Critical path status updates on every task completion or delay. Variability recalculates weekly from trailing 4-week data.
Integration Hooks
Human Factors Module 6 (Product Design): The dashboard design must follow progressive disclosure — summary view first, detail on demand. Cognitive load management requires limiting the primary view to the seven metrics, not expanding it with “nice-to-have” secondary indicators. Trust calibration means the alert logic must have a low false-positive rate; an operator who receives three false alerts in a week will stop reading alerts in week four. Every design choice in this dashboard is jointly an OR question (which metric?) and a human factors question (how to display it so the operator actually uses it?).
All OR Modules (1-7): This page synthesizes the metric implications from every preceding module. Utilization and wait time from Module 2. Throughput from Modules 2 and 5. Abandonment from Module 2. Critical path from Module 4. Constraint identification from Module 3. Variability from Module 1. The healthcare applications in Module 7 (ED flow, behavioral health access, surgical scheduling) are the settings where these metrics are measured. Module 6 (simulation) provides the analytical engine for the “what-if” scenarios that make the dashboard predictive rather than merely descriptive.
Public Finance Module 8 (Grant Product Design): Grant program dashboards should surface the same metric architecture: critical path status for milestone tracking, burn rate as a utilization analog, compliance staff constraint identification, and abandonment rate on program enrollment as the invisible demand signal.