Appointment Scheduling Models
Module 5: Scheduling and Sequencing | Depth: Application | Target: ~2,000 words
Thesis: Appointment scheduling models must balance provider utilization, patient access, and wait-time variability — and most clinic schedules optimize only the first.
The Three-Way Tradeoff
Every appointment template serves three masters, and they are in tension:
- Provider utilization — minimize idle time between patients. A provider with gaps in their schedule costs salary without producing revenue. This is the metric administrators see first.
- Patient access — minimize days-to-appointment. A patient who calls today and is offered an appointment in six weeks has an access problem regardless of how efficiently that future visit is scheduled.
- Patient experience — minimize day-of waiting. A patient who arrives at 10:00 for a 10:00 appointment and is seen at 10:42 has absorbed variability that the scheduling template failed to account for.
These objectives conflict. Maximizing provider utilization means packing the schedule tight — no slack, no buffer, every slot filled. But tight packing amplifies the effect of service-time variability: one complex patient cascades into delays for every patient behind them. High utilization also means fewer available slots, which pushes days-to-appointment outward. Conversely, loose scheduling improves day-of experience and near-term access but wastes provider capacity that a resource-constrained clinic cannot afford.
Most clinic scheduling templates resolve this tension implicitly, by optimizing provider utilization and treating the other two as externalities. The template fills the day. Patient access becomes a byproduct of how many days out the schedule fills. Day-of waiting becomes the patient’s problem. This is not a design choice — it is the absence of one. The scheduling models that follow make the tradeoff explicit and tunable.
The Model Landscape
Individual-Block Scheduling
The simplest model: assign each patient a fixed time slot. Fifteen-minute slots, twenty-minute slots, whatever the template dictates. Every patient gets one block. The schedule is a grid.
This is the dominant model in US outpatient care, and it fails predictably. It assumes every visit takes the same amount of time — a fifteen-minute acute follow-up and a forty-five-minute new-patient behavioral health assessment get the same slot, or at best slots of two different sizes that ignore the variance within each type. When a visit runs long, every subsequent patient waits. When a visit runs short, the provider sits idle until the next slot. Individual-block scheduling converts service-time variability into either patient delay or provider idle time with no mechanism to absorb either.
Wave Scheduling (Bailey-Welch, 1952)
In 1952, Norman Bailey and John Welch published the foundational paper on outpatient appointment scheduling. Their core insight was that the first slot of the session creates the dominant dynamic: if the first patient is a no-show or arrives late, the provider sits idle with no queue to fall back on. Their solution — now called the Bailey-Welch rule — is to schedule two patients at the start of each session and one patient per slot thereafter.
The logic is probabilistic. With two patients scheduled at the opening, the probability that the provider begins idle drops from the no-show rate (say 18%) to the probability that both no-show (roughly 3.2%, assuming independence). The provider begins the session with a patient to see almost certainly, and the second patient provides a buffer that absorbs early variability.
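The arithmetic behind the double-booked opening can be checked in a few lines. This is a minimal sketch that assumes every patient no-shows independently with the same probability; the function name `p_idle_start` is illustrative and not drawn from Bailey and Welch.

```python
def p_idle_start(no_show_rate: float, patients_at_open: int) -> float:
    """Probability that none of the patients booked at the opening slot arrives.

    Assumes independent no-shows with a common rate (an assumption, not a finding).
    """
    return no_show_rate ** patients_at_open


single = p_idle_start(0.18, 1)  # one patient booked in the first slot
double = p_idle_start(0.18, 2)  # Bailey-Welch: two patients booked in the first slot
print(f"single booking: {single:.1%}, double booking: {double:.1%}")
# prints: single booking: 18.0%, double booking: 3.2%
```

If no-shows are positively correlated (bad weather, a transit outage), the double-booked figure will be higher than this independence estimate.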
Wave scheduling extends this principle across the session. Instead of one patient per slot, schedule a “wave” of patients (typically 3-4) at the top of each hour, then let them flow through during the hour. The provider is rarely idle because there is almost always someone waiting. Patients absorb more day-of wait time, but the provider’s schedule is protected from gaps.
The tradeoff is naked: wave scheduling optimizes provider utilization by transferring variability onto patients. It is honest about doing so, which is more than individual-block scheduling can claim — but it systematically degrades patient experience.
Modified Wave Scheduling
A compromise. Schedule 2-3 patients at the start of each hour-block and stagger the remainder at intervals within the block. For example: two patients at 9:00, one at 9:20, one at 9:40. This retains the opening-buffer logic of Bailey-Welch while limiting the maximum time any individual patient waits.
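The 9:00 example above can be written as a small template generator. This is a sketch only: the function name, its parameters, and the calendar date are placeholders chosen for illustration, not part of any scheduling system.

```python
from datetime import datetime, timedelta


def modified_wave_block(block_start, opening_count=2, stagger_minutes=(20, 40)):
    """Return appointment start times for one hour-block.

    opening_count patients share the block start; one more patient is
    booked at each offset listed in stagger_minutes.
    """
    times = [block_start] * opening_count
    times += [block_start + timedelta(minutes=m) for m in stagger_minutes]
    return times


# The date is arbitrary; only the clock times matter for the template.
block = modified_wave_block(datetime(2024, 1, 8, 9, 0))
print([t.strftime("%H:%M") for t in block])
# prints: ['09:00', '09:00', '09:20', '09:40']
```

Changing `opening_count` to 3 or the offsets to `(15, 30, 45)` gives other variants of the same heuristic.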
Modified wave is the most common “improved” template in primary care. It works better than pure wave or pure individual-block, but it is still a heuristic — it does not adapt to the actual service-time distribution, the actual no-show rate, or the actual mix of visit types. It is a better guess, not a model.
Open Access / Same-Day Scheduling (Mark Murray)
In the late 1990s, Mark Murray and Catherine Tantau proposed a radical alternative: do today’s work today. Instead of scheduling patients weeks out, hold the majority of slots open for same-day requests. The principle, developed through Murray’s work with the Institute for Healthcare Improvement (IHI), attacks the access problem directly by eliminating the backlog.
The mechanism is counterintuitive. Most scheduling backlogs are self-reinforcing: when patients cannot get timely appointments, they book further out “just in case,” inflating apparent demand. They also book at multiple locations or reschedule repeatedly, creating phantom demand. Murray’s insight was that much of the apparent demand overload is an artifact of the scheduling system itself. Eliminating the backlog reduces total booking volume because it eliminates the rebooking, no-showing, and hedging that backlogs generate.
Open access works — Murray demonstrated substantial reductions in days-to-third-next-available across multiple sites — but it has prerequisites that many clinics fail to establish. It requires working down the existing backlog before switching (a painful transition period). It requires sufficient daily capacity to meet daily demand (if true demand exceeds capacity, open access just makes the shortage visible faster). And it requires discipline: the temptation to pre-book “just a few” patients erodes the model within months.
Open access also introduces a new variability problem. When you do not know who is coming until the morning of, the appointment-type mix is unpredictable. A day that happens to draw five complex chronic-disease visits and two behavioral health assessments will run very differently from a day of acute visits. This connects directly to the service-time variability problem and the Pollaczek-Khinchine formula from Module 2: uncontrolled variance in service times amplifies day-of delays even when average utilization is appropriate.
Cluster Scheduling
Group similar appointment types into dedicated time blocks. Chronic disease management on Tuesday mornings. Well-child visits on Wednesday afternoons. Behavioral health intakes on Thursday. Within each cluster, service-time variability is lower because the visit types are homogeneous: the coefficient of variation (c_s) drops, and because expected queue length in the P-K formula scales with (1 + c_s²), the queue shortens with it.
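To make the effect of a lower c_s concrete, the sketch below evaluates the standard Pollaczek-Khinchine expression for the mean number waiting in an M/G/1 queue, L_q = ρ²(1 + c_s²) / (2(1 − ρ)). Module 2's notation may differ. The formula strictly assumes Poisson arrivals, which scheduled appointments are not, so the output is directional rather than predictive. The utilization of 0.81 and the mixed-session c_s of 0.59 are illustrative assumptions.

```python
def pk_queue_length(rho: float, cs: float) -> float:
    """Pollaczek-Khinchine mean number waiting in an M/G/1 queue.

    rho: utilization, 0 <= rho < 1
    cs:  coefficient of variation of service time (std / mean)
    """
    if not 0 <= rho < 1:
        raise ValueError("rho must be in [0, 1)")
    return rho**2 * (1 + cs**2) / (2 * (1 - rho))


mixed = pk_queue_length(0.81, 0.59)      # assumed c_s for a session mixing visit types
clustered = pk_queue_length(0.81, 0.36)  # assumed c_s within a single visit type
print(f"mixed: {mixed:.2f} waiting, clustered: {clustered:.2f} waiting")
```

At a fixed utilization the queue shrinks by the ratio of the two (1 + c_s²) terms, so the benefit of clustering depends on how far apart the visit types are to begin with.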
Cluster scheduling also enables better resource matching: the behavioral health cluster can be staffed with the integrated BH consultant present, the chronic-disease cluster can have the care coordinator available, and the well-child cluster can run with MA-heavy support. This connects to Workforce Module 3 (Role Architecture): the appointment model and the staffing model are coupled systems, and cluster scheduling makes that coupling explicit and manageable.
The weakness: cluster scheduling reduces flexibility. A patient with an acute need on a chronic-disease morning must either be squeezed in (disrupting the cluster) or deferred (reducing access). It trades within-day variability reduction for between-day access rigidity.
The No-Show Problem
No-shows are the single largest source of wasted capacity in outpatient care. National averages range from 15-30% depending on setting, with Federally Qualified Health Centers (FQHCs) and behavioral health clinics frequently reporting rates at the upper end. Cayirli and Veral’s comprehensive 2003 review of appointment scheduling research identified no-show management as the most impactful lever in scheduling design.
Overbooking as Probabilistic Calibration
The naive response to no-shows is overbooking: if 20% of patients no-show, schedule 125% of capacity. This works on average and fails on specific days. When everyone shows up, the clinic is overwhelmed. When no-shows exceed expectations, providers sit idle. Uniform overbooking treats a stochastic problem as deterministic.
Calibrated overbooking uses the no-show probability to set slot counts that optimize expected outcomes. If a provider has 16 slots and the no-show rate is 18%, the expected number of arrivals from 16 appointments is 13.1, so the provider is systematically underutilized. Scheduling 19 appointments yields an expected 15.6 arrivals, closer to capacity. But the variance matters: the probability that all 19 show is small (0.82^19 ≈ 2.3% under independence), while the probability that 17 or more show, creating an overflow, is material (roughly 31% under the same assumption).
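The figures in this paragraph follow from a binomial model of arrivals. The sketch below assumes each booked patient shows independently with probability 0.82; the helper name `prob_at_least` is illustrative.

```python
from math import comb


def prob_at_least(k: int, n: int, p_show: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p_show): at least k of n booked patients arrive."""
    return sum(
        comb(n, j) * p_show**j * (1 - p_show) ** (n - j) for j in range(k, n + 1)
    )


slots, booked, p_show = 16, 19, 0.82
expected = booked * p_show                              # mean number of arrivals
p_all_show = prob_at_least(booked, booked, p_show)      # every booked patient arrives
p_overflow = prob_at_least(slots + 1, booked, p_show)   # more arrivals than slots

print(f"expected arrivals: {expected:.1f}")
print(f"P(all {booked} show): {p_all_show:.1%}")
print(f"P(more than {slots} show): {p_overflow:.1%}")
```

Under these assumptions the overflow probability comes out near 31%, which is why expected arrivals alone are a poor basis for choosing the booking count.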
The Bailey-Welch rule is a special case of overbooking: schedule an extra patient at the session start, where the cost of provider idle time is highest. Gupta and Denton’s 2008 review in IIE Transactions formalizes this as a sequential decision problem and shows that optimal overbooking rules depend on the patient-specific no-show probability, the cost ratio of provider idle time to patient waiting time, and the position of the appointment within the session (late-session overbooking is riskier because overflow cannot be absorbed).
The practical implication: overbooking should not be uniform across the day. It should be front-loaded (protecting session starts) and calibrated to segment-specific no-show rates (new patients no-show at different rates than established patients; Monday mornings differ from Friday afternoons). This is a probabilistic optimization, and doing it well requires data that most scheduling systems collect but do not use.
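One simple way to turn segment-specific show rates into booking counts is to cap the probability of overflow. This is a deliberately simplified stand-in for the cost-ratio optimization described above, not that method itself. The segment names, show rates, session size, and the 10% tolerance are all hypothetical.

```python
from math import comb


def overflow_prob(booked: int, slots: int, p_show: float) -> float:
    """P(arrivals > slots) when `booked` patients each show with probability p_show."""
    return sum(
        comb(booked, j) * p_show**j * (1 - p_show) ** (booked - j)
        for j in range(slots + 1, booked + 1)
    )


def max_bookings(slots: int, p_show: float, max_overflow_prob: float) -> int:
    """Largest booking count whose overflow probability stays within tolerance."""
    if not 0 < p_show <= 1:
        raise ValueError("p_show must be in (0, 1]")
    booked = slots
    # Overflow probability only rises as bookings are added, so stop at the first breach.
    while overflow_prob(booked + 1, slots, p_show) <= max_overflow_prob:
        booked += 1
    return booked


# Hypothetical segment show rates for an 8-slot session.
segments = {"established": 0.88, "new_patient": 0.75}
plan = {name: max_bookings(8, p, 0.10) for name, p in segments.items()}
print(plan)
```

With these inputs the lower-show-rate segment tolerates one extra booking and the higher-show-rate segment tolerates none, which is the non-uniformity the paragraph argues for.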
Appointment Type Mix and Service-Time Variability
A scheduling template that assigns uniform slot lengths to heterogeneous visit types creates systematic bottlenecks. Consider three common visit types in primary care:
- Acute visit: mean duration 14 minutes, standard deviation 5 minutes (c_s = 0.36)
- Chronic disease management: mean duration 28 minutes, standard deviation 10 minutes (c_s = 0.36)
- Behavioral health integration: mean duration 42 minutes, standard deviation 15 minutes (c_s = 0.36)
Even with identical coefficients of variation, the mean durations differ by 3x. A template built on 20-minute slots will systematically underallocate time for chronic and behavioral visits and overallocate for acute visits. The chronic visits run over, cascading delays downstream. The acute visits finish early, creating idle gaps that cannot be recaptured.
When visit types are mixed randomly across the day (the default in most scheduling systems), the effective service-time distribution becomes a mixture distribution with high overall variance. Even if each type individually has a modest c_s, the mixture inflates the aggregate c_s because the mean shifts between types. The P-K formula from Module 2 makes the cost explicit: queue length scales with (1 + c_s²). Mixing the three types above in a 50/35/15 ratio raises c_s² from about 0.13 within each type to about 0.35, nearly tripling the squared coefficient of variation and increasing expected queue length by roughly 19% at the same utilization. Wider gaps between the type means push it higher.
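The inflation from mixing can be computed directly from each type's weight, mean, and standard deviation. The sketch below uses the three visit types listed above with the 50/35/15 mix from the FQHC example later in this module; the function name `mixture_cv2` is illustrative.

```python
def mixture_cv2(components):
    """Squared coefficient of variation of a mixture distribution.

    components: list of (weight, mean, std) tuples with weights summing to 1.
    Uses E[X^2] = sum of w * (std^2 + mean^2) across components.
    """
    mean = sum(w * m for w, m, _ in components)
    second_moment = sum(w * (s**2 + m**2) for w, m, s in components)
    variance = second_moment - mean**2
    return variance / mean**2


visit_mix = [(0.50, 14, 5), (0.35, 28, 10), (0.15, 42, 15)]  # acute, chronic, BH
cv2_mixed = mixture_cv2(visit_mix)
cv2_single = (5 / 14) ** 2  # every type shares c_s of about 0.36

print(f"within-type c_s^2: {cv2_single:.2f}, mixture c_s^2: {cv2_mixed:.2f}")
print(f"P-K variability factor ratio: {(1 + cv2_mixed) / (1 + cv2_single):.2f}")
```

The ratio on the last line is the multiplier on expected queue length at a fixed utilization, so it isolates the cost of mixing from the cost of running the schedule tight.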
This is the analytical basis for cluster scheduling and for templates with variable-length slots matched to appointment type. It is not a preference — it is a mathematical consequence of mixing service-time distributions.
Healthcare Example: FQHC Template Redesign
Setting: A Federally Qualified Health Center with 4 providers (2 physicians, 1 nurse practitioner, 1 behavioral health consultant). Current template: uniform 20-minute slots, 18 slots per provider per day, 72 total daily slots across all providers.
Current performance:
- No-show rate: 18% (measured over 6 months)
- Visit type mix: 50% acute (mean 14 min), 35% chronic (mean 28 min), 15% behavioral health (mean 42 min)
- Weighted average visit duration: 0.50(14) + 0.35(28) + 0.15(42) = 7.0 + 9.8 + 6.3 = 23.1 minutes
- Expected arrivals per day: 72 × 0.82 = 59 patients
- Effective daily demand: 59 × 23.1 min = 1,363 minutes
- Available daily capacity: 4 providers × 8 hours × 60 min = 1,920 minutes (gross), roughly 1,680 minutes net of admin time
- Effective utilization: 1,363 / 1,680 = 0.81
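The utilization arithmetic above is easy to rerun with different inputs. The sketch below reproduces it; the function name and argument layout are illustrative, and the 1,680-minute net capacity is taken from the figures above rather than derived.

```python
def effective_utilization(slots, no_show_rate, visit_mix, net_capacity_min):
    """Return (mean visit minutes, expected arrivals, utilization).

    visit_mix maps a visit type to (share of visits, mean minutes).
    """
    mean_minutes = sum(share * minutes for share, minutes in visit_mix.values())
    arrivals = slots * (1 - no_show_rate)
    utilization = arrivals * mean_minutes / net_capacity_min
    return mean_minutes, arrivals, utilization


visit_mix = {"acute": (0.50, 14), "chronic": (0.35, 28), "behavioral": (0.15, 42)}
mean_minutes, arrivals, utilization = effective_utilization(72, 0.18, visit_mix, 1680)
print(f"{mean_minutes:.1f} min/visit, {arrivals:.0f} arrivals, utilization {utilization:.2f}")
# prints: 23.1 min/visit, 59 arrivals, utilization 0.81
```

Shifting the mix toward chronic and behavioral health visits raises utilization quickly even when slot counts and the no-show rate are unchanged.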
At 81% utilization, the system appears healthy. But the uniform 20-minute slot creates systematic problems. Chronic visits overflow their slots by 8 minutes on average. Behavioral health visits overflow by 22 minutes. These overflows cascade: by 2:00 PM, providers running chronic-heavy afternoons are 30-45 minutes behind. Meanwhile, providers with acute-heavy mornings have idle gaps they cannot fill. Day-of patient wait averages 26 minutes; 90th-percentile wait exceeds 50 minutes.
Third-next-available appointment (the IHI standard access metric, measuring the third available open slot for a new-patient appointment) averages 12 days — acceptable but not meeting the open-access target of same-day or next-day.
Template redesign using scheduling model principles:
- Variable slot lengths matched to visit type. Replace uniform 20-minute slots with 15-minute acute slots, 30-minute chronic slots, and 45-minute behavioral health slots. This aligns template structure with the actual service-time distribution.
- Cluster by visit type. Mornings: acute and chronic mix (physician and NP). Afternoons: chronic disease block (Tuesday/Thursday) and behavioral health block (Monday/Wednesday/Friday, with BH consultant co-located). This reduces within-session service-time variability.
- Calibrated overbooking. At an 18% no-show rate, overbook 2 acute slots per provider in the morning session (where idle time cost is highest, and acute visits have the shortest overflow risk). Do not overbook behavioral health slots (overflow duration is high and the cost of patient overtime in a BH visit is clinical, not just operational).
- Modified wave opening. Schedule 2 patients at the start of each half-day session per the Bailey-Welch rule. Stagger remaining slots at intervals matched to expected visit duration.
- Hold 20% of acute slots for same-day access. Following Murray’s open-access principle, protect a portion of short-duration slots for day-of requests, reducing days-to-appointment for the most common visit type without disrupting the template structure for longer visits.
Projected performance:
- Provider idle time drops by roughly 25% (fewer gaps from acute visits finishing early, fewer cascading delays from chronic/BH overflows)
- Day-of patient wait drops from 26-minute average to approximately 15 minutes (reduced service-time variability within sessions)
- Third-next-available drops from 12 days to 3-4 days (same-day acute slots absorb the highest-volume visit type)
- Effective utilization rises to approximately 84% despite fewer total slots, because the slots are better matched to actual service times
The redesign does not add providers, rooms, or hours. It restructures the template to align with the statistical properties of the demand it serves.
Third-Next-Available: The Standard Access Metric
The Institute for Healthcare Improvement established third-next-available appointment as the standard access metric for outpatient care. It measures the number of days until the third available open appointment for a given provider or practice. The “third” is deliberate — it avoids the noise of a single open slot (which may be an anomalous gap) and approximates the true availability a patient would encounter when calling.
Strengths: It is simple, measurable, and directly patient-relevant. It captures the cumulative effect of capacity, demand, and scheduling design in a single number. It trends well — a practice can track third-next-available weekly and detect access degradation before patients complain.
Limitations: It does not capture day-of wait time at all. A practice with third-next-available of 1 day but 45-minute lobby waits has an access metric that looks excellent while patient experience is poor. It also does not distinguish between visit types — a practice might have same-day acute access and 30-day waits for behavioral health, but the aggregate metric obscures this. Finally, it is gameable: holding slots open artificially (without backfilling cancellations) improves the metric without improving actual access.
The correct approach is to track third-next-available stratified by visit type and paired with day-of cycle time. Together, these metrics capture both access dimensions: how long to get in the door, and how long you wait once you are there.
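A stratified third-next-available calculation can be sketched as follows. The data shape (a list of open-slot dates per visit type), the sample dates, and the decision to count same-day slots are all assumptions made for illustration; a real extract from a scheduling system would need its own mapping.

```python
from datetime import date
from typing import Dict, List, Optional


def third_next_available(today: date, open_slots: List[date]) -> Optional[int]:
    """Days from today to the third open slot, or None if fewer than three exist."""
    upcoming = sorted(d for d in open_slots if d >= today)
    return (upcoming[2] - today).days if len(upcoming) >= 3 else None


def tna_by_visit_type(
    today: date, open_slots_by_type: Dict[str, List[date]]
) -> Dict[str, Optional[int]]:
    """Third-next-available computed separately for each visit type."""
    return {
        visit_type: third_next_available(today, slots)
        for visit_type, slots in open_slots_by_type.items()
    }


# Hypothetical open slots; each date appears once per open slot on that day.
today = date(2024, 3, 4)
open_slots_by_type = {
    "acute": [date(2024, 3, 4), date(2024, 3, 4), date(2024, 3, 5), date(2024, 3, 8)],
    "behavioral_health": [date(2024, 3, 20), date(2024, 4, 2), date(2024, 4, 3)],
}
tna = tna_by_visit_type(today, open_slots_by_type)
print(tna)
# prints: {'acute': 1, 'behavioral_health': 30}
```

Pooling both lists into one would report a third-next-available of 1 day, hiding the 30-day behavioral health wait that the stratified view exposes.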
Integration Points
Human Factors Module 2 — Schedule-Induced Fatigue. Appointment templates do not just allocate time — they allocate cognitive load. A provider who sees four complex chronic disease visits back-to-back, each requiring medication reconciliation, care plan updates, and shared decision-making, experiences cumulative cognitive depletion that a schedule of alternating acute and chronic visits would avoid. Cluster scheduling improves service-time homogeneity (reducing the P-K variability term) but can worsen cognitive clustering. Template design must balance statistical efficiency against fatigue dynamics: interleaving one or two shorter, lower-complexity visits within a chronic-disease block restores cognitive recovery time without significantly increasing aggregate service-time variability. The scheduling model and the fatigue model must be co-optimized.
Workforce Module 3 — Role Architecture. Appointment models must account for provider type mix. An MD, an APP (advanced practice provider), and a behavioral health consultant have different scope, different service-time distributions, and different cost structures. A template that treats all four providers in the FQHC example identically ignores that the BH consultant cannot see acute visits and the NP may have longer chronic-visit durations than the physicians. Effective template design is role-aware: it assigns visit types to provider types based on scope-of-practice fit and service-time characteristics, not just availability. This is a joint optimization over the scheduling model and the role architecture — neither can be designed well in isolation.
Product Owner Lens
What is the operational problem? Clinic schedules waste capacity (provider idle time from mismatched slot lengths), degrade access (backlogs from templates that do not reserve same-day slots), and create poor patient experience (cascading delays from service-time variability) — simultaneously.
What mechanism explains it? Service-time variability interacts with rigid templates to produce systematic over- and under-allocation. No-shows create stochastic capacity loss that uniform scheduling cannot absorb. The P-K formula quantifies the cost: queue length scales with service-time variance.
What intervention levers exist? Variable slot lengths matched to visit types. Cluster scheduling to reduce within-session variance. Calibrated overbooking using segment-specific no-show rates. Same-day slot reserves for acute access. Modified wave openings to protect session starts.
What should software surface? (1) No-show rates by visit type, day-of-week, and patient segment — the inputs to overbooking calibration. (2) Actual vs. scheduled visit duration by appointment type — to validate and adjust slot lengths. (3) Third-next-available by visit type, trended weekly. (4) Day-of cycle time (arrival to departure) as the patient-experience complement to access metrics. (5) Provider idle-time percentage and cascade-delay frequency as template efficiency indicators.
What metric reveals degradation earliest? The gap between scheduled slot length and actual visit duration, trended over time. When chronic visits consistently overflow 30-minute slots by 8+ minutes, the template is misaligned with reality. This misalignment is measurable weeks before it manifests as patient complaints or provider burnout. The secondary indicator: rising no-show rates in specific visit types or time-of-day slots, which signal that patients are self-selecting out of a schedule that does not serve them.
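The scheduled-versus-actual gap can be computed from visit records with very little machinery. In the sketch below, the record layout, the sample values, and the 8-minute flagging threshold are hypothetical; in practice the calculation would run per week so the gap can be trended.

```python
from collections import defaultdict
from statistics import mean


def mean_overrun_by_type(visits):
    """Mean (actual - scheduled) minutes per visit type.

    visits: iterable of (visit_type, scheduled_minutes, actual_minutes).
    Positive values mean visits run longer than their slots.
    """
    gaps = defaultdict(list)
    for visit_type, scheduled, actual in visits:
        gaps[visit_type].append(actual - scheduled)
    return {visit_type: mean(values) for visit_type, values in gaps.items()}


def flag_misaligned(overruns, threshold_min=8):
    """Visit types whose mean overrun meets or exceeds the threshold."""
    return sorted(t for t, gap in overruns.items() if gap >= threshold_min)


# Hypothetical records for one week.
week = [
    ("chronic", 30, 39), ("chronic", 30, 37), ("chronic", 30, 40),
    ("acute", 15, 14), ("acute", 15, 13),
]
overruns = mean_overrun_by_type(week)
print(overruns)
print(flag_misaligned(overruns))
```

A negative mean for a visit type is also informative: it marks slots that are longer than the visits they hold and therefore a source of provider idle time.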
Warning Signs
- Third-next-available looks fine but day-of waits are growing. The template has adequate slot counts but mismatched slot durations. Access is not the problem; flow is.
- No-show rate differs sharply by visit type but overbooking is uniform. The scheduling system is applying a single correction factor to a heterogeneous problem.
- Providers report “running behind” by mid-afternoon but morning sessions finish early. Service-time variability is unevenly distributed across the day. The template does not match the demand profile to the time-of-day.
- Same-day requests are handled by “squeezing in” rather than by reserved slots. The template was not designed for same-day access; it is being patched daily by front-desk staff absorbing the variability that the model should have anticipated.
- Behavioral health visits are scheduled in the same template as medical visits. The 3x duration difference creates mixture-distribution variance that cascades into delays across the entire afternoon.
Summary
Appointment scheduling is the most direct intersection of operations research and daily clinic operations. The models — Bailey-Welch wave scheduling, Murray’s open access, cluster scheduling, calibrated overbooking — are not theoretical alternatives. They are design patterns, each optimizing a different point in the utilization-access-experience tradeoff space. The right model depends on the clinic’s visit-type mix, no-show rate, provider-type composition, and strategic priority.
What is not acceptable is the default: uniform slot lengths, no overbooking calibration, no same-day reserves, no measurement of the gap between scheduled and actual visit duration. That default optimizes nothing — it merely fills a grid and forces patients, providers, and front-desk staff to absorb the consequences of a template that was never designed for the demand it serves.