Capacity Planning Tools

Module 8: Product Owner and Operator Synthesis | Depth: Application | Target: ~2,000 words

Thesis: Capacity planning tools should embed queueing and simulation logic so that operators can test staffing scenarios without OR expertise.


The Operational Problem

Every health system does capacity planning. Most do it badly. The typical process begins in a spreadsheet: count current patients, project demand growth, divide by provider productivity targets, and request headcount to close the gap. This arithmetic produces a number. That number is wrong in a specific and predictable direction — it underestimates the resources needed to meet a given service level, sometimes dramatically. The error is not random. It is structural, rooted in the use of averages to plan for a system governed by variability.

The consequence is not abstract. A primary care network that capacity-plans with averages will budget for enough providers to handle expected volume — and then watch third-next-available appointments climb past 21 days while leadership insists the staffing model says capacity is adequate. An ED that plans bed capacity from average census will hit boarding crises every winter because the plan absorbed the mean but not the variance. A behavioral health program that staffs to average caseload will produce six-week intake waits because it failed to account for the nonlinear relationship between utilization and delay described in Module 2.

The fix is not better spreadsheets. It is tools that embed the mathematics of queueing and simulation — tools that accept demand with uncertainty, compute resource requirements at a specified service level, and show operators the tradeoff between staffing, utilization, and wait time. These tools exist in OR. They have existed since Agner Krarup Erlang derived his formulas for the Copenhagen Telephone Company in 1917. The problem is that they have not been made accessible to the people who make capacity decisions in healthcare.


What Capacity Planning Means in OR Terms

In operations research, capacity planning is the problem of determining how much resource — staff, beds, equipment, time — is needed to serve demand at a target service level, given that both demand and service times are variable. The formal statement matters because each element constrains the solution:

Demand is stochastic, not fixed. Patient arrivals follow probability distributions with means, variances, and temporal patterns (day-of-week effects, seasonal surges, post-holiday spikes). A capacity plan built on a point estimate of demand is a plan built on the one scenario that is guaranteed not to occur exactly.

Service times are variable. A 20-minute appointment slot is a scheduling fiction. Actual visit durations range from 8 to 45 minutes depending on patient complexity, and the variance in that distribution — not just the mean — determines system performance. The Pollaczek-Khinchine formula (Module 2) shows that for a single server, average queue length depends on both the mean and the squared coefficient of variation of service time.
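The Pollaczek-Khinchine relationship can be sketched numerically. A minimal example; the arrival rate is illustrative, while the 22-minute mean and 9-minute SD mirror the figures in this paragraph:

```python
# Pollaczek-Khinchine mean queue wait for an M/G/1 (single-server) system.
# The wait depends on the squared coefficient of variation (cv2) of
# service time, not just its mean.
def mg1_wait(arrival_rate, mean_service, cv2):
    """Expected wait in queue, in the same time units as mean_service."""
    rho = arrival_rate * mean_service            # utilization, must be < 1
    if rho >= 1:
        raise ValueError("unstable: utilization >= 1")
    return rho * mean_service * (1 + cv2) / (2 * (1 - rho))

lam = 2.4 / 60                                   # 2.4 arrivals/hour, per-minute rate
fixed_len = mg1_wait(lam, 22, 0.0)               # if every visit took exactly 22 min
variable  = mg1_wait(lam, 22, (9 / 22) ** 2)     # mean 22 min, SD 9 min, as above
# Same mean workload, same utilization -- only variability differs,
# and the variable case waits longer.
```

At exponential variability (cv2 = 1) the queue wait is exactly double the deterministic case at the same utilization, which is why reducing visit-time variance is a capacity lever in its own right.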

The service level target is a constraint, not a wish. “Adequate” capacity means different things depending on the standard: 95% of patients seen within 30 minutes (ED), third-next-available appointment under 14 days (primary care, per IHI access metrics), or intake within 48 hours of referral (behavioral health). The capacity required to meet a 14-day standard versus a 7-day standard is not linearly related — the utilization-delay curve (Module 2) ensures that tighter service levels require disproportionately more capacity.

Resources interact. A provider who cannot see patients because lab results are delayed, or because the MA is rooming another patient, is constrained by the system — not by their own capacity. Multi-resource systems produce bottleneck-shifting behavior where the binding constraint moves depending on demand mix and resource availability. Capacity planning for a single resource in isolation ignores these interactions and systematically misdiagnoses where investment is needed.


Why Spreadsheet Capacity Planning Fails

The standard healthcare capacity spreadsheet commits a specific set of errors that map directly to concepts from earlier modules.

It uses averages. Sam Savage’s “flaw of averages” (Module 1) states that plans based on average inputs produce results that are wrong on average. A capacity model that uses average daily demand, average visit duration, and average provider productivity will understate the resources needed to meet any service level target — because it ignores the variability that creates queues. The expected value of a nonlinear function is not the function of the expected value. Average demand at average service rate produces average utilization, but average utilization tells you nothing about actual wait times when the utilization-delay relationship is hyperbolic.
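The flaw of averages can be made concrete in a few lines. A sketch under simplifying assumptions (M/M/1 queue, normally distributed daily demand; all parameters invented for illustration):

```python
import random

# Because the utilization-delay curve is convex, the wait computed at
# *average* demand understates the *average* wait under variable demand
# (Jensen's inequality). Parameters are illustrative.
random.seed(1)

def mm1_wait(lam, mu):
    """Mean wait in queue for an M/M/1 system (infinite if unstable)."""
    return lam / (mu * (mu - lam)) if lam < mu else float("inf")

mu = 3.0                                                # capacity: 3 patients/hour
daily_lam = [random.gauss(2.4, 0.3) for _ in range(10_000)]
daily_lam = [min(max(l, 0.5), 2.9) for l in daily_lam]  # clip to a stable range

wait_at_avg = mm1_wait(sum(daily_lam) / len(daily_lam), mu)
avg_of_waits = sum(mm1_wait(l, mu) for l in daily_lam) / len(daily_lam)
# avg_of_waits exceeds wait_at_avg: the average-demand plan hides the bad days.
```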

It treats capacity as linear. Spreadsheet models assume that if 4 providers can see 80 patients, 5 providers can see 100. This is true for throughput — in the long run, more servers process more volume. But the question is not throughput. It is service level. And service level is nonlinear in capacity. Adding a fifth provider to a system at 90% utilization does not reduce waits by 20%. It reduces them by 50% or more, because it pulls the system back from the steep part of the utilization-delay curve. Conversely, losing one provider from a system at 75% does not just reduce capacity by 20% — it may push utilization past the knee and produce catastrophic wait-time degradation.

It cannot model resource interactions. A spreadsheet can compute provider slots, bed-days, and OR block hours independently. It cannot capture the dependency between them: that a discharged bed requires a physician discharge order, a nurse to process it, transport to clean the room, and registration to assign the next patient — a chain where the slowest link determines throughput. Modeling these interactions requires either network queueing models or simulation (Module 6).
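The chain dependency can be illustrated with a minimal discrete-event sketch: one MA rooms patients ahead of two providers, all with exponential times. The structure and every parameter below are assumptions for illustration, not a validated model:

```python
import heapq, random

# Two-stage sketch of a rooming-then-visit chain: one MA (stage 1) feeds
# two providers (stage 2). Illustrative rates only.
random.seed(7)

def simulate(n=2000, ma_mean=9.0, visit_mean=20.0, interarrival=12.0, n_prov=2):
    """Return mean provider minutes per patient spent waiting on rooming."""
    arrive, ma_free = 0.0, 0.0
    prov_free = [0.0] * n_prov                 # min-heap of provider free times
    idle = 0.0
    for _ in range(n):
        arrive += random.expovariate(1 / interarrival)
        room_done = max(arrive, ma_free) + random.expovariate(1 / ma_mean)
        ma_free = room_done                    # single-server rooming queue
        p = heapq.heappop(prov_free)           # earliest-free provider
        idle += max(0.0, room_done - p)        # provider idle, waiting for rooming
        heapq.heappush(prov_free,
                       max(room_done, p) + random.expovariate(1 / visit_mean))
    return idle / n

base = simulate()                  # a single MA drags on provider throughput
relieved = simulate(ma_mean=4.5)   # crude proxy for adding MA capacity
```

A spreadsheet sizing providers and MAs independently cannot see the `idle` term at all; it only emerges from modeling the stages together.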

It produces point estimates, not distributions. The spreadsheet says “we need 4.3 FTE.” The correct answer is “we need between 4 and 6 FTE, with 4 adequate 60% of the time and 5 adequate 90% of the time.” Without a probability distribution of outcomes, operators cannot make risk-informed decisions about how much safety margin to carry.


What an OR-Informed Capacity Planning Tool Must Do

A capacity planning tool worthy of the name must incorporate five capabilities that spreadsheets lack.

Accept demand forecasts with uncertainty ranges. Not “120 patients per day” but “120 patients per day with a standard deviation of 18, higher on Mondays (mean 138) and lower on Fridays (mean 104), with a 10% probability of a seasonal surge to 150+ during January-February.” This means the tool takes distributions as inputs, not point estimates.

Apply queueing models to translate demand into requirements. Given arrival rate, service time distribution, and a target service level, the tool should compute the number of servers needed. For staffing problems with Poisson arrivals and exponential service times, the Erlang-C formula provides the exact answer: the probability that an arriving customer waits, as a function of offered load and number of servers. For bed planning with a focus on blocking probability (turning patients away), Erlang-B applies. For more general service distributions, the Halfin-Whitt heavy-traffic regime provides approximations that are accurate precisely in the range most healthcare systems operate — high utilization with many servers.

Display the utilization-delay tradeoff explicitly. The output should not be a single number. It should be a curve: “At 4 providers, utilization is 88% and expected wait is 34 minutes. At 5 providers, utilization is 70% and expected wait is 9 minutes. At 4.5 FTE (one provider at 0.5 time), utilization is 78% and expected wait is 16 minutes.” This lets operators choose their operating point on the curve, making the cost-service tradeoff visible rather than hiding it behind an opaque recommendation.

Support what-if scenarios. The tool must answer: What happens if we add a provider? Lose a provider? If demand increases 20%? If we reduce visit-time variability by implementing pre-visit planning? Each scenario reruns the model and shows the new position on the utilization-delay curve. This is not a luxury feature — it is the core value proposition. Capacity planning without scenario analysis is not planning. It is forecasting.

Account for variability by showing outcome distributions. When Monte Carlo methods (Module 6) drive the engine, the tool can report not just expected wait time but the 90th percentile wait, the probability of exceeding a service-level threshold on any given day, and the frequency of queue overflow events. These distribution-level outputs let operators distinguish “adequate on average” from “adequate with confidence.”
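A sketch of that Monte Carlo layer, assuming an M/M/c clinic day with illustrative parameters:

```python
import heapq, random, statistics

# Replicate an M/M/c clinic day many times and report the *distribution*
# of daily mean waits, not just the overall average. Illustrative numbers.
random.seed(42)

def one_day(c=4, lam=8.0, mean_visit=22 / 60, hours=9.0):
    """Simulate one clinic day; return that day's mean wait in hours."""
    free = [0.0] * c                  # min-heap of server free times
    t, waits = 0.0, []
    while True:
        t += random.expovariate(lam)  # next arrival
        if t > hours:
            break
        p = heapq.heappop(free)
        start = max(t, p)
        waits.append(start - t)
        heapq.heappush(free, start + random.expovariate(1 / mean_visit))
    return statistics.mean(waits) if waits else 0.0

days = [one_day() for _ in range(1000)]
mean_wait = statistics.mean(days)                  # "adequate on average"?
p90_wait = statistics.quantiles(days, n=10)[-1]    # the 90th-percentile day
bad_days = sum(d > 0.5 for d in days) / len(days)  # share of days worse than 30 min
```

The gap between `mean_wait` and `p90_wait` is exactly the distinction between "adequate on average" and "adequate with confidence."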


The Erlang-C Staffing Calculator: The Simplest Useful Tool

The most accessible entry point for OR-informed capacity planning is the Erlang-C model. It solves a specific, common problem: given a known arrival rate (lambda) and average service time (1/mu), how many parallel servers (c) are needed so that the probability of waiting beyond a given threshold is no more than X%?

Linda Green’s research at Columbia Business School applied Erlang-C systematically to hospital staffing, demonstrating that simple queueing models could expose severe staffing inadequacies invisible to ratio-based planning. Her work showed that nurse staffing ratios calculated from average census consistently understaffed night shifts and weekends, where arrival variability is proportionally higher relative to capacity. Green and Kolesar’s “lagged Erlang-C” adaptation further extended the model to time-varying demand — a critical refinement, since healthcare demand is anything but stationary.

The Erlang-C calculation is computationally trivial. It can run in a browser. The inputs are:

  • Arrival rate (patients per hour, referrals per day)
  • Average service time (minutes per visit, hours per bed-stay)
  • Number of servers (providers, beds, schedulers)
  • Target service level (e.g., 80% of patients wait less than 20 minutes)

The outputs are:

  • Probability that an arriving patient waits (P_wait)
  • Expected wait time for those who wait
  • Expected wait time overall (including those served immediately)
  • Server utilization

This is enough to expose the fundamental mismatch between staffing levels and service-level targets. A clinic director who learns that her 4-provider schedule, which looks reasonable on a headcount basis, actually produces a 62% probability of patient wait at peak demand — and that adding 0.5 FTE reduces that to 28% — has the information needed to make a resource case that no spreadsheet could generate.
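A minimal implementation of the calculator described above, using the numerically stable Erlang-B recursion; the clinic numbers are illustrative:

```python
# Minimal Erlang-C staffing calculator: offered load in, P(wait) and
# expected wait out. Assumes Poisson arrivals and exponential service.
def erlang_c(lam, mean_service, c):
    """Return (P_wait, expected wait in queue) for an M/M/c system."""
    a = lam * mean_service                   # offered load in erlangs
    if a >= c:
        return 1.0, float("inf")             # unstable: demand exceeds capacity
    b = 1.0                                  # Erlang-B by recursion over servers
    for k in range(1, c + 1):
        b = a * b / (k + a * b)
    p_wait = b / (1 - (a / c) * (1 - b))     # Erlang-C from Erlang-B
    wq = p_wait * mean_service / (c - a)     # overall expected queue wait
    return p_wait, wq

# Utilization-delay table: 10 arrivals/hour, 22-minute visits.
lam, s = 10.0, 22 / 60
for c in range(4, 8):
    p, wq = erlang_c(lam, s, c)
    print(f"{c} providers: util {lam * s / c:4.0%}  "
          f"P(wait) {p:5.1%}  E[wait] {wq * 60:5.1f} min")
```

With these illustrative numbers, the step from 4 to 5 providers cuts the expected wait far more than proportionally, the nonlinearity discussed earlier, and the whole table renders the operator's choice of operating point on the curve.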


When Erlang-C Is Not Enough

Erlang-C assumes Poisson arrivals, exponential service times, a single queue feeding identical servers, no abandonment, and stationary demand. Healthcare violates most of these, and operators must know when the violations matter enough to require more sophisticated methods.

Multiple resource types. A patient who needs a provider, a room, and a lab draw requires three resources in sequence. Erlang-C models each in isolation. The actual bottleneck may be the interaction between them — provider available but no room, room available but lab backed up. This requires network queueing models or discrete-event simulation (Module 6).

Complex routing. Patients in an ED follow different pathways depending on acuity: fast-track for ESI-4/5, main ED for ESI-2/3, trauma bay for ESI-1. Each pathway has different resource requirements and service times. A single Erlang-C model cannot capture this heterogeneity; it requires either separate models per pathway (with the caveat that shared resources couple them) or simulation.

Time-varying demand. Demand that fluctuates within the day — the classic ED arrival pattern peaking at 11am and 7pm, per Welch et al. — violates the stationarity assumption. Green and Kolesar’s stationary independent period-by-period (SIPP) approach handles this by applying Erlang-C separately to each time block, but the method can understate staffing needs during transition periods when queues built in one period carry into the next. The Halfin-Whitt regime provides more accurate approximations for large-scale systems (50+ servers), while simulation handles arbitrarily complex time-varying patterns.
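The SIPP approach can be sketched in a few lines: run Erlang-C separately per period and staff each block to the target. The hourly arrival rates below are invented for illustration:

```python
# SIPP sketch: apply Erlang-C independently to each period of a
# time-varying day and staff each block to a P(wait) target.
def erlang_c_pwait(lam, mean_service, c):
    """Probability an arrival waits, for an M/M/c system."""
    a = lam * mean_service
    if a >= c:
        return 1.0
    b = 1.0
    for k in range(1, c + 1):
        b = a * b / (k + a * b)
    return b / (1 - (a / c) * (1 - b))

def staff_for(lam, mean_service, max_pwait=0.2):
    """Smallest server count meeting the wait-probability target."""
    c = 1
    while erlang_c_pwait(lam, mean_service, c) > max_pwait:
        c += 1
    return c

hourly_lam = [4, 6, 9, 12, 11, 8, 10, 12, 9, 6]   # e.g. 8am-6pm, arrivals/hour
plan = [staff_for(l, 22 / 60) for l in hourly_lam]
# Caveat from the text: queues carried across block boundaries are
# ignored, so transition periods can come out understaffed.
```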

Abandonment. Patients who leave without being seen, or who call a different clinic when told the wait is too long, violate Erlang-C’s assumption of infinite patience. The Erlang-A model (with “A” for abandonment, developed by Palm and extended by Mandelbaum and Zeltyn) incorporates a patience distribution and is more appropriate for settings with observable abandonment — EDs, call centers, behavioral health intake queues.
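Erlang-A has closed-form results, but a simple way to explore it is to simulate the underlying birth-death chain; the abandonment fraction then follows from theta times the mean queue length over lambda. Every parameter below is illustrative, and the system is deliberately overloaded:

```python
import random

# Erlang-A sketch: M/M/c with exponentially distributed patience (rate
# theta), simulated as a birth-death chain on the number in system.
random.seed(3)

def erlang_a_abandon(lam=10.0, mu=3.0, c=3, theta=2.0, horizon=20_000.0):
    """Estimate the long-run fraction of arrivals who abandon."""
    t, n, area_q = 0.0, 0, 0.0
    while t < horizon:
        q = max(n - c, 0)
        rate = lam + min(n, c) * mu + q * theta   # total event rate in state n
        dt = random.expovariate(rate)
        area_q += q * dt                          # time-integral of queue length
        t += dt
        if random.random() * rate < lam:
            n += 1                                # arrival
        else:
            n -= 1                                # service completion or abandonment
    return theta * (area_q / t) / lam             # rate-conservation identity

frac = erlang_a_abandon()
# With offered load above capacity (10 > 3 * 3), abandonment is what keeps
# the queue finite; plain Erlang-C would simply report instability.
```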

The decision rule is pragmatic: start with Erlang-C. If it reveals that you are deep in the flat part of the curve (utilization under 70%), refinements will not change the answer. If you are near the knee — utilization between 78% and 92%, where most healthcare systems live — and the decision is sensitive to the exact staffing number, invest in simulation.


Healthcare Example: Three-Site Primary Care Network

Cascade Health Partners operates three primary care clinics serving a mixed rural-suburban population in the Pacific Northwest. Heading into fiscal year planning, leadership proposes hiring a full-time physician for Site C, where patients have complained loudly about wait times. The proposed cost: $340,000 fully loaded. The operations team runs the decision through an OR-informed capacity planning tool before committing.

Site A — Happy Valley Clinic (4 providers, 2 MAs). The tool ingests 12 months of scheduling data: average 67 visits/day across 4 providers, with daily standard deviation of 11. Visit durations average 22 minutes (SD 9 minutes). Erlang-C at peak demand (Monday, mean 78 visits) shows utilization at 91% with expected wait of 38 minutes and third-next-available trending at 18 days. The model shows that adding 0.5 FTE (one PA, three days per week) drops peak utilization to 82% and third-next-available to 11 days — below the IHI-recommended 14-day threshold. Investment needed: $85,000 (0.5 FTE physician assistant).

Site B — River Road Clinic (3 providers, 2 MAs). Same analysis: 48 visits/day average, SD 8. Utilization at 79%. Expected wait times are moderate and third-next-available is at 12 days — within target. But the tool flags an anomaly: Wednesday afternoon utilization spikes to 94% while Friday morning runs at 58%. This is not a capacity problem. It is a scheduling template problem (Module 5). One provider blocks all procedures on Wednesday afternoon while Friday morning is held open for same-day access that rarely materializes. Rebalancing the template — a zero-cost intervention — would smooth utilization to 76-82% across all sessions and eliminate the Wednesday bottleneck. Investment needed: $0. Intervention: scheduling template redesign.

Site C — Mountain View Clinic (2 providers, 1 MA). Leadership’s nominated problem site. Average 34 visits/day, SD 6. Provider utilization at 80% — not on the steep part of the curve. Third-next-available at 16 days. The tool digs deeper. The simulation module (needed here because of multi-resource interactions) reveals that 22% of visits require same-day lab work, and the single MA processes labs, rooms patients, and manages phone triage. On days with high lab volume, the MA becomes the binding constraint: providers wait an average of 7 minutes between patients for rooming, creating effective idle time that reduces realized throughput by 15%. The bottleneck is not provider capacity. It is the single-MA resource. Adding a physician to Site C would increase theoretical capacity but would not resolve the MA bottleneck — the new physician would spend 20% of their time waiting for rooming, producing an effective capacity gain of 0.8 FTE for a 1.0 FTE cost. The correct intervention: add a second MA (cost: $48,000) and monitor for 90 days. If third-next-available remains above target after relieving the MA bottleneck, then consider provider expansion.

Total recommended investment: $133,000 (0.5 FTE PA at Site A, scheduling redesign at Site B, 1.0 FTE MA at Site C) versus the originally proposed $340,000 for a physician at the wrong site. The capacity planning tool did not just save $207,000. It prevented an intervention that would have failed to solve the problem it was aimed at — because the problem at Site C was not provider capacity. It was a multi-resource bottleneck that no headcount ratio or volume-per-provider spreadsheet could diagnose.


Build, Buy, or Calibrate

Not every health system needs custom simulation software. The decision framework for tooling matches the complexity of the capacity problem.

Calibrated spreadsheet with Erlang-C. Sufficient for single-resource, single-queue problems: how many schedulers for a call center, how many therapists for an intake queue, how many beds for an observation unit. The Erlang-C formula can be implemented in Excel (or accessed through free online calculators). This handles 60% of healthcare capacity planning questions adequately, provided the user understands the assumptions. Cost: near zero. Risk: misapplication to multi-resource problems where it gives misleadingly precise wrong answers.

Commercial workforce management tools. Products from vendors like Kronos (now UKG), QGenda, or healthcare-specific platforms often embed simplified queueing or regression-based models for staffing. These are appropriate when the organization needs ongoing, operationalized capacity planning integrated with scheduling and timekeeping — not one-off analyses. Evaluate whether the tool exposes the utilization-delay tradeoff and supports scenario analysis, or whether it is a sophisticated spreadsheet that produces point estimates with a polished interface. Many commercial tools still plan from averages.

Custom simulation. Warranted when the capacity problem involves multiple interacting resources, complex patient routing, time-varying demand, or system-level dynamics that Erlang-C cannot capture — the three-site primary care example above required simulation for Site C. Custom simulation requires OR expertise to build and validate but can be packaged into reusable tools for specific problem types (ED staffing, bed management, clinic scheduling). The build cost is justified when the same model type applies across multiple facilities or is rerun quarterly for ongoing planning.

The product owner’s question: Which of these should be embedded in operational software versus delivered as consulting? The answer follows a maturity curve. Erlang-C staffing calculators belong in every healthcare operations platform — they are computationally cheap, universally applicable, and immediately useful. Scenario analysis with Monte Carlo belongs in planning modules used quarterly or annually. Full discrete-event simulation remains a specialist tool, but its outputs (recommended staffing tables, bottleneck identification, service-level projections) should feed into the operational platform as decision-support data.


Warning Signs

The tool produces a single number. Any capacity planning output that says “you need 4.3 FTE” without a confidence interval, a service-level target, or a utilization-delay tradeoff display is hiding the uncertainty that drives the actual decision. Demand the distribution.

Planners use last year’s volume as next year’s forecast. Historical volume is a starting point, not a forecast. Demand trends, panel growth, payer-mix shifts, new service lines, and population health changes all modulate future volume. The capacity planning tool must accept scenario-based demand inputs, not just trailing averages.

The bottleneck is assumed rather than identified. Leadership “knows” the problem is provider capacity. The tool should test that assumption by modeling all resources in the pathway. The most expensive capacity planning error is solving the wrong bottleneck — adding providers when the constraint is rooms, MAs, or lab turnaround.

Utilization targets are set without reference to variability. A 90% utilization target is reasonable for a low-variability scheduled procedure suite and catastrophic for a high-variability walk-in clinic. The tool — or the operator using it — must set targets relative to the variability structure of the specific service, not from an industry benchmark that ignores distributional characteristics.


Integration Hooks

Module 2 (Queueing Theory and Wait-Time Dynamics): Queueing models are the computational engine inside capacity planning tools. The utilization-delay curve is not a background concept here — it is the primary output. Every capacity planning tool that shows “at X staff, expect Y wait” is rendering a point on the curve derived from Erlang-C, Erlang-A, or the Kingman approximation. The tool makes Module 2’s mathematics accessible to operators who will never write the formulas themselves.

Workforce Module 6 (Workforce Economics and Capacity Planning): Capacity planning does not end with “you need 0.5 FTE.” It must connect to workforce economics: What does 0.5 FTE cost in this market? What role type (physician, PA, NP) delivers the required service capacity at the lowest cost? What is the recruitment pipeline timeline — can you have the hire in place before the demand materializes? A capacity planning tool that computes resource requirements without incorporating cost, availability, and recruitment lag produces recommendations that are analytically correct and operationally useless. The queueing model determines how much capacity is needed; the workforce model determines how to procure it.


Product Owner Lens

What is the operational problem? Health systems make capacity investments using spreadsheet arithmetic that ignores variability, treats the utilization-delay relationship as linear, and cannot identify multi-resource bottlenecks — resulting in systematic under-investment at sites that need help and mis-investment at sites where the problem is not what it appears.

What mechanism explains it? The flaw of averages (Module 1) and the nonlinear utilization-delay relationship (Module 2) mean that average-based planning underestimates resource needs at any meaningful service-level target. Multi-resource interactions create bottleneck-shifting behavior invisible to single-resource models.

What intervention levers exist? Staffing (add/redistribute FTE), scheduling (rebalance templates), support staff (relieve non-provider bottlenecks), and demand smoothing (reduce arrival variability).

What should software surface? (1) An Erlang-C calculator embedded in the staffing module — arrival rate and service time in, staffing recommendation at target service level out, with the utilization-delay curve rendered visually. (2) Scenario comparison: side-by-side display of current state versus proposed changes, showing utilization, wait time, and service-level attainment for each scenario. (3) Bottleneck identification: for multi-resource pathways, flag which resource is the binding constraint — the one whose utilization is driving the system’s delay.

What metric reveals degradation earliest? The gap between scheduled capacity and demand-adjusted required capacity (from the queueing model) at the target service level. When scheduled FTE drops below the Erlang-C-recommended FTE for more than two consecutive scheduling periods, the system is entering a capacity deficit that will manifest as service-level failure within weeks — before the trailing wait-time metric registers the problem.
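That leading indicator is straightforward to compute once an Erlang-C routine exists. A sketch, with a hypothetical two-consecutive-shortfall alert rule and invented demand numbers:

```python
# Early-warning metric: per scheduling period, compare scheduled staff
# against the Erlang-C-required count and flag two consecutive
# shortfalls. The 20% wait target and all demand figures are illustrative.
def erlang_c_pwait(lam, mean_service, c):
    a = lam * mean_service
    if a >= c:
        return 1.0
    b = 1.0
    for k in range(1, c + 1):
        b = a * b / (k + a * b)
    return b / (1 - (a / c) * (1 - b))

def required_servers(lam, mean_service, max_pwait=0.2):
    """Smallest server count meeting the wait-probability target."""
    c = 1
    while erlang_c_pwait(lam, mean_service, c) > max_pwait:
        c += 1
    return c

def capacity_deficit_alert(scheduled, demand, mean_service):
    """True when scheduled staff falls short two periods in a row."""
    short = [s < required_servers(l, mean_service)
             for s, l in zip(scheduled, demand)]
    return any(a and b for a, b in zip(short, short[1:]))

# Demand creeps upward while the schedule stays flat.
alert = capacity_deficit_alert(scheduled=[5, 5, 5, 5],
                               demand=[9, 10, 12, 13],
                               mean_service=22 / 60)
```

The alert fires on the model-derived capacity gap while trailing wait-time metrics still look normal, which is the point of the indicator.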


Summary

Capacity planning in healthcare is a queueing problem that is routinely solved as an arithmetic problem. The result is predictable: plans that look adequate on paper and fail in operation, investments directed at the wrong resource, and service-level targets that are set without understanding what they cost. An OR-informed capacity planning tool — even one as simple as an Erlang-C calculator with scenario comparison — closes the gap between the mathematics of congestion and the operational decisions that determine whether patients wait days or weeks. The tool does not require its users to understand queueing theory. It requires its builders to embed that theory so deeply that the operator’s experience is simply: enter demand, see options, choose a staffing plan with full knowledge of the tradeoffs.