
Queueing Theory: Why Is the Airport Line So Slow?

Why the line at the airport is so slow, and what queueing theory has to say about it.

I was at the airport recently, watching a check-in queue do what check-in queues do, which is mostly not move. There were four agents working, but each had their own little line in front of them. The lines were moving at completely different speeds. Someone who joined a different queue was checked in and gone before I had taken three steps forward. By the time I got to the counter I was curious. Why is it set up this way? Could I, the smug person standing in line with nothing to do, have designed it better?

It turns out the answer is yes, almost certainly. There is a whole field of maths called queueing theory that has been thinking about this since the early 1900s, originally for telephone exchanges, and the punchlines are surprisingly useful. This post is a short tour. By the end you’ll know why one long line is almost always better than several short ones, why a system that runs at 90% capacity is dramatically worse than one at 80%, and why a small amount of randomness in service times causes much more pain than you’d guess.

There are a few short interactive toys along the way; drag the sliders and see what happens.

Anatomy of a queue

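Before we can analyse anything, we need to name the parts. Queueing theorists use a compact notation, A / S / s / K / N: the arrival process, the service-time distribution, the number of servers, the system capacity, and the size of the customer population. The last two are usually dropped, which means both are treated as unlimited.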

The most famous queue, the one everyone starts with, is the M/M/1: memoryless arrivals, memoryless service times, and a single server.

“Memoryless” deserves a footnote. If your average inter-arrival time is 4 minutes and no one has arrived for 3 minutes, your expected wait for the next person is still 4 minutes, not 1. Same goes for service. The fact that the person ahead of you has been at the counter for ages tells you nothing about how much longer they will take. This feels wrong, because in everyday life waits seem “due” to end, but it is exactly right when inter-arrival and service times follow an exponential distribution.
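If the memoryless claim sounds too strange to take on faith, it is easy to check numerically. The sketch below is mine, not part of any library: it assumes exponentially distributed gaps with a 4-minute mean, throws away every gap that ended within the first 3 minutes, and averages what is left of the survivors. The function name and defaults are made up for illustration.

```python
import random


def mean_residual_wait(mean_gap=4.0, already_waited=3.0, n=200_000, seed=1):
    """Average extra wait, given that nobody has arrived for `already_waited` minutes."""
    rng = random.Random(seed)
    extra = []
    for _ in range(n):
        gap = rng.expovariate(1.0 / mean_gap)  # exponential gap with the given mean
        if gap > already_waited:               # keep only cases where we are still waiting
            extra.append(gap - already_waited)
    return sum(extra) / len(extra)


if __name__ == "__main__":
    print(f"fresh wait:              {mean_residual_wait(already_waited=0.0):.2f} min")
    print(f"after 3 min of nothing:  {mean_residual_wait(already_waited=3.0):.2f} min")
```

Both numbers should land close to 4 minutes, up to sampling noise. Swap the exponential for something like a uniform distribution and the second number drops, which is the behaviour everyday intuition expects.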

Here’s an M/M/1 in motion. Try it out.
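For anyone who would rather run an M/M/1 than watch one, here is a minimal simulation sketch. It assumes a first-come-first-served single server with exponential gaps and service times; `simulate_mm1` and its parameters are names I made up. It adds up the time each customer spends in the system, which is the area under the "customers in system" curve, and divides by the elapsed time.

```python
import random


def simulate_mm1(lam, mu, n_customers=200_000, seed=0):
    """Simulate a FIFO single-server queue with exponential gaps and service times.

    Returns the time-averaged number of customers in the system.
    """
    rng = random.Random(seed)
    arrival = 0.0          # arrival time of the current customer
    prev_departure = 0.0   # time the previous customer left
    total_time_in_system = 0.0
    for _ in range(n_customers):
        arrival += rng.expovariate(lam)          # next arrival
        start = max(arrival, prev_departure)     # wait if the server is still busy
        prev_departure = start + rng.expovariate(mu)
        total_time_in_system += prev_departure - arrival
    # Area under the "customers in system" curve, divided by elapsed time.
    return total_time_in_system / prev_departure


if __name__ == "__main__":
    for lam in (0.5, 0.8, 0.9):
        print(f"lambda = {lam}, mu = 1.0: about {simulate_mm1(lam, 1.0):.2f} in the system")
```

The estimates are noisy, and they get noisier the closer the arrival rate gets to the service rate, so treat any single run as a rough reading rather than a measurement.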

The two knobs (and the cliff)

Every queue has two knobs. The arrival rate λ (lambda) is how fast customers show up. The service rate μ (mu) is how fast a server can clear them. The ratio of these, with the number of servers thrown in, is the most important number in queueing theory:

ρ = λ / (s · μ)

ρ is the utilisation: the fraction of your service capacity being used. If ρ = 0.5, your servers are busy half the time. If ρ = 0.9, they are slammed. If ρ ≥ 1, customers are arriving at least as fast as you can clear them, and the line grows without bound. (At a real airport, the line stops growing because the airport closes, or because customers give up, but the maths says: not pretty.)

Here’s the part that surprised me. As you push ρ toward 1, the expected number of customers in the system does not grow linearly. It grows on a curve that gets vertical very fast. The M/M/1 formula is short:

L = ρ / (1 − ρ)

Plug in some numbers. At ρ = 0.5, L = 1; at ρ = 0.8, L = 4; at ρ = 0.9, L = 9; at ρ = 0.95, L = 19. The cost of running near full capacity is brutal and almost everyone underestimates it.
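The two formulas so far fit in a few lines, which makes it easy to try your own numbers. This is a small sketch with function names of my own choosing; the only thing it assumes is the M/M/1 model described above.

```python
def utilisation(lam, mu, servers=1):
    """rho = lambda / (s * mu): the fraction of service capacity in use."""
    return lam / (servers * mu)


def mm1_customers_in_system(rho):
    """Expected number of customers in an M/M/1 system, L = rho / (1 - rho)."""
    if not 0 <= rho < 1:
        raise ValueError("the formula only holds for 0 <= rho < 1")
    return rho / (1 - rho)


if __name__ == "__main__":
    for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
        print(f"rho = {rho:.2f}  ->  L = {mm1_customers_in_system(rho):.1f}")
```

The guard on `rho` is deliberate: at or above 1 there is no steady state, so there is no meaningful average to return.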

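A real-world consequence worth burning in: a system running at 90% feels qualitatively different from one running at 80%, even though it’s “only” 10 percentage points more loaded. In the M/M/1 numbers above, the average number of customers in the system more than doubles, from 4 to 9.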

One line or many?

Back to my airline counter. Four agents, four lines, all moving at different speeds because the customers in them take different amounts of time. The maths of why this setup is worse than it could be is short.

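Imagine a simplified version. You have three agents and customers arriving at a total rate of 2.4 per minute. Each agent can serve 1 customer per minute. Two layouts: three separate lines, each receiving a third of the arrivals (0.8 per minute), or one shared line that feeds all three agents.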

Pooling the queue cuts the average number of waiting customers by roughly 4x, even though nothing about the agents or the customers has changed.

The intuition: when each agent has their own line, an agent who happens to get a string of fast customers can sit idle while another agent is buried, wasting capacity. With one shared line, no agent is ever idle while someone is waiting. Pooling is the single biggest practical insight in this whole field, and it’s why every well-designed bank, every TSA checkpoint, and every In-N-Out drive-thru uses a single feeder line.

(Some airlines still do not. Now you know.)

Why TSA gets this right

The TSA checkpoint is the canonical real-world example. With separate counters, each lane behaves like its own M/M/1, statistically independent of the others. With one feeder line, the same agents become servers in a shared M/M/s queue, with the same arrival rate, the same per-agent service rate, and the same number of agents. Only the topology of the queue has changed.

Separate lines cannot match idle agents to waiting customers. An agent whose lane has emptied sits idle until the next person joins that lane, even if someone is waiting two metres away. With one feeder, every agent who frees up takes the head of the shared queue immediately, so no one is idle while anyone is waiting. For the numbers above (λ = 2.4, μ = 1, three agents), separate lines chosen at random have about 9.6 people waiting on average; one feeder has about 2.6.
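Those two figures can be reproduced from the textbook formulas. The sketch below uses the standard M/M/1 result for the number waiting, ρ² / (1 − ρ), and the Erlang C formula for the shared M/M/s queue. Erlang C is not derived in this post, so take it as a borrowed result; the function names are mine.

```python
from math import factorial


def mm1_waiting(lam, mu):
    """Expected number waiting (not in service) in an M/M/1 queue."""
    rho = lam / mu
    if rho >= 1:
        raise ValueError("unstable queue: need lam < mu")
    return rho**2 / (1 - rho)


def mms_waiting(lam, mu, servers):
    """Expected number waiting in an M/M/s queue, via the Erlang C formula."""
    offered = lam / mu          # offered load, in "agents' worth" of work
    rho = offered / servers     # utilisation per agent
    if rho >= 1:
        raise ValueError("unstable queue: need lam < servers * mu")
    tail = offered**servers / (factorial(servers) * (1 - rho))
    head = sum(offered**k / factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)  # Erlang C: chance an arriving customer must wait
    return p_wait * rho / (1 - rho)


if __name__ == "__main__":
    separate = 3 * mm1_waiting(2.4 / 3, 1.0)  # three independent lanes
    pooled = mms_waiting(2.4, 1.0, 3)         # one feeder, three agents
    print(f"separate lines: {separate:.2f} waiting")
    print(f"one feeder:     {pooled:.2f} waiting")
```

A quick sanity check on the design: with `servers=1`, `mms_waiting` collapses to the same value as `mm1_waiting`, as it should.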

Three layouts side by side: random-pick separate lines, join-the-shortest, and the single feeder.
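The three layouts can also be compared with a rough simulation. This is a sketch under my own assumptions, not a reconstruction of anything above: arrivals and service times are exponential, nobody switches lanes after joining, and ties in join-the-shortest go to the first lane. `mean_waiting` and the layout names are invented for illustration.

```python
import heapq
import random
from collections import deque

LAYOUTS = ("random", "shortest", "feeder")


def mean_waiting(layout, lam=2.4, mu=1.0, servers=3, n_customers=200_000, seed=0):
    """Time-averaged number of customers waiting (not yet being served)."""
    if layout not in LAYOUTS:
        raise ValueError(f"layout must be one of {LAYOUTS}")
    rng = random.Random(seed)
    lanes = [deque() for _ in range(servers)]  # departure times of people still in each lane
    free_at = [0.0] * servers                  # feeder only: heap of times each agent frees up
    now = 0.0
    total_wait = 0.0
    for _ in range(n_customers):
        now += rng.expovariate(lam)            # next arrival
        service = rng.expovariate(mu)
        if layout == "feeder":
            # The head of the shared line goes to whichever agent frees up first.
            start = max(now, heapq.heappop(free_at))
            heapq.heappush(free_at, start + service)
        else:
            for lane in lanes:                 # drop customers who have already left
                while lane and lane[0] <= now:
                    lane.popleft()
            lane = rng.choice(lanes) if layout == "random" else min(lanes, key=len)
            start = lane[-1] if lane else now  # queue behind the last person in that lane
            lane.append(start + service)
        total_wait += start - now
    # Total waiting time divided by elapsed time approximates the average number waiting.
    return total_wait / now


if __name__ == "__main__":
    for layout in LAYOUTS:
        print(f"{layout:>8}: about {mean_waiting(layout):.2f} waiting")
```

With the default numbers, the random-pick and feeder results should sit near the 9.6 and 2.6 quoted earlier, and join-the-shortest should fall between them, much closer to the feeder. Being smart about which lane you join recovers most of the benefit, but not all of it.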

Variability is the silent killer

One more insight, and it might be the most counterintuitive one. Suppose I told you that two coffee shops have the exact same average service time of 5 minutes. The first one is consistent: every customer takes between 4 and 6 minutes. The second is wild: most customers take 1 minute, but every so often someone orders a flight of pour-overs and takes 25.

Same average, and both shops can keep up with the same arrival rate in the long run. But the queues in the second shop will be enormously longer.

The maths comes from something called the Pollaczek-Khintchine formula, which says (for a single-server queue):

L_q ∝ (1 + variance term) / (1 − ρ)

Translation: the more variable your service times, the longer your queues, even at the same average rate. A purely deterministic system (everyone takes exactly the same time) has exactly half the expected number of waiting customers of one with exponentially distributed service times. And the exponential is itself only “moderately” variable. With heavy-tailed service times (a few customers who take forever), it can get much worse.
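To put numbers on the two coffee shops, here is a sketch using the usual exact form of the formula for a single server with Poisson arrivals, L_q = ρ² (1 + c²) / (2 (1 − ρ)), where c² is the variance of the service time divided by its squared mean. The details are my assumptions, chosen only to match the story: the steady shop is uniform between 4 and 6 minutes, the wild shop takes 1 minute five times out of six and 25 minutes otherwise (which gives the same 5-minute mean), and both run at ρ = 0.8.

```python
def pk_waiting(rho, scv):
    """Pollaczek-Khintchine mean number waiting; scv = variance / mean**2 of service time."""
    if not 0 <= rho < 1:
        raise ValueError("need 0 <= rho < 1")
    return rho**2 * (1 + scv) / (2 * (1 - rho))


def scv_discrete(outcomes):
    """Squared coefficient of variation for a list of (value, probability) pairs."""
    mean = sum(v * p for v, p in outcomes)
    second_moment = sum(v * v * p for v, p in outcomes)
    return (second_moment - mean**2) / mean**2


if __name__ == "__main__":
    rho = 0.8                                       # assumed utilisation for both shops
    steady = ((6 - 4) ** 2 / 12) / 5**2             # uniform(4, 6): variance (b - a)^2 / 12
    wild = scv_discrete([(1, 5 / 6), (25, 1 / 6)])  # assumed mix with a 5-minute mean
    print(f"deterministic: {pk_waiting(rho, 0.0):.2f} waiting")
    print(f"steady shop:   {pk_waiting(rho, steady):.2f} waiting")
    print(f"exponential:   {pk_waiting(rho, 1.0):.2f} waiting")
    print(f"wild shop:     {pk_waiting(rho, wild):.2f} waiting")
```

Under these assumptions the wild shop ends up with roughly four times as many people waiting as the steady one, despite identical averages. Make the slow orders rarer and slower while keeping the mean fixed, and the gap widens further.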

Intuitively, one slow customer creates a wake of waiting customers behind them, and that wake takes a while to clear. Then another slow one comes along and does it again. The bad events have a long memory; the good ones (fast customers) don’t help nearly as much.

Back to the airport

So back to my airline counter. Knowing what I now know, here’s what was happening:

The fix is obvious, and it’s the thing TSA figured out years ago: one feeder line that fans out to whichever agent is free next. It costs nothing extra and it would have shortened my wait by quite a bit. The airline still hasn’t done it.

If you take one thing from this post: when you’re designing or running any system that has waiting (call centres, support tickets, kitchen tickets, hospital triage, code review), the cheapest single improvement is almost always to pool the queue. The second cheapest is to reduce variability in how long the work takes. And the third is to back off ρ from 1, because you cannot run any line at the edge of its capacity and expect it to feel anything but miserable.

Next time you’re in line for too long, you have a new hobby. You’ll be amazed how often the answer is “they didn’t pool the queue.”

