# Queuing Theory: Simple Definition, Notation and Terminology

## What is Queuing Theory?

Queuing theory is the study of queues and the random processes that characterize them. It makes mathematical sense of real-life scenarios, such as a crowd of people queuing up at a bank or tasks queuing up on your computer’s back end.

In queuing theory we often want to know how long wait times or queue lengths will be, and we can use mathematical models to estimate them. These models are important in business and software applications, and queuing theory is often considered a part of operations research.
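As a rough illustration of such a model, the sketch below estimates the average time customers spend waiting in line at a single first come, first served server. It assumes that the gaps between arrivals and the service times are exponentially distributed, which is a common textbook choice rather than something required by the theory. The function name and parameters are hypothetical.

```python
import random


def simulate_mean_wait(arrival_rate, service_rate, num_customers, seed=0):
    """Estimate the mean time spent waiting in line at one FIFO server.

    Assumption: exponential gaps between arrivals and exponential
    service times, with the given mean rates.
    """
    rng = random.Random(seed)
    wait = 0.0        # time the current customer waits before service starts
    total_wait = 0.0
    for _ in range(num_customers):
        total_wait += wait
        service = rng.expovariate(service_rate)  # this customer's service time
        gap = rng.expovariate(arrival_rate)      # time until the next arrival
        # The next customer waits for whatever work is still unfinished
        # when they arrive (never less than zero).
        wait = max(0.0, wait + service - gap)
    return total_wait / num_customers


if __name__ == "__main__":
    print(simulate_mean_wait(arrival_rate=0.5, service_rate=1.0,
                             num_customers=100_000))
```

Raising the arrival rate toward the service rate makes the estimated wait grow quickly, which is the kind of behavior these models are used to predict.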

Any queuing activity can be summarized as entities (customers in your supermarket queue, or jobs in a computer queue) trying to get through an activity (being served). Queues happen when we can’t all access the activity at the same time: when it is not economically efficient to have enough checkout lines for everyone to go right through as soon as they are ready, or when there isn’t enough server capacity to run an unlimited number of computer tasks at once.

In queuing theory, a queue is not just a neat row that is always first come, first served. That is one example of a queue, but not the only kind. A mob trying to rush for the door on Black Friday is considered a queue as well, as is a group of job applicants waiting for interviews who are picked randomly, one by one, to be interviewed.

## Types of Queues and Types of Service

First In First Out, or First Come First Served, is fairly common in banking and commerce. It is the type of queue you get when you have people politely lined up, waiting for their turn.

Last In First Out is the opposite scheme; whoever has been waiting for the shortest time is served first. This type of queue management is common in asset management, where assets produced or acquired last are the ones used or disposed of first. For example: the most recent employees are often the ones laid off first.

Priority is where customers are served based on their priority level; these levels could be based on status, task urgency, or some other criterion.

Shortest Job First is when whoever needs the shortest amount of service gets taken care of first.
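The four service orders described so far can be compared side by side with a small sketch. It assumes that all jobs are already waiting when the server starts choosing, and that a lower priority number means a more urgent job. The job names and values are made up for illustration.

```python
import heapq
from collections import deque

# Hypothetical jobs in arrival order: (name, priority level, service time).
# Assumption: a lower priority number means "more urgent".
jobs = [("a", 2, 5.0), ("b", 1, 1.0), ("c", 3, 3.0)]


def fifo_order(jobs):
    """First In First Out: serve from the front of the line."""
    queue = deque(jobs)
    return [queue.popleft()[0] for _ in range(len(jobs))]


def lifo_order(jobs):
    """Last In First Out: serve the most recent arrival first."""
    stack = list(jobs)
    return [stack.pop()[0] for _ in range(len(jobs))]


def keyed_order(jobs, key):
    """Serve the job with the smallest key first; ties go to the earlier arrival."""
    heap = [(key(job), index, job[0]) for index, job in enumerate(jobs)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


def priority_order(jobs):
    return keyed_order(jobs, key=lambda job: job[1])


def sjf_order(jobs):
    return keyed_order(jobs, key=lambda job: job[2])


if __name__ == "__main__":
    print("FIFO:    ", fifo_order(jobs))      # ['a', 'b', 'c']
    print("LIFO:    ", lifo_order(jobs))      # ['c', 'b', 'a']
    print("Priority:", priority_order(jobs))  # ['b', 'a', 'c']
    print("SJF:     ", sjf_order(jobs))       # ['b', 'c', 'a']
```

The same three jobs leave in a different order under each rule, even though nothing about the jobs themselves has changed.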

Processor Sharing is when everyone is served at the same time, each at a fraction of full speed; service capacity is distributed evenly among everyone waiting.
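To see what an even split of capacity means for finishing times, here is a minimal sketch. It assumes every job is present from time zero and that no new jobs arrive, which is a simplification. The function name is hypothetical.

```python
def processor_sharing_finish_times(service_needs, capacity=1.0):
    """Finish time of each job when capacity is split evenly among active jobs.

    Assumption: all jobs are present at time 0 and nobody arrives later.
    """
    # Jobs finish in order of how much service they need.
    order = sorted(range(len(service_needs)), key=lambda i: service_needs[i])
    finish = [0.0] * len(service_needs)
    clock = 0.0
    served = 0.0                 # service each still-active job has received
    active = len(service_needs)  # number of jobs still sharing the server
    for i in order:
        # Each active job is served at rate capacity / active, so closing
        # the remaining gap takes (gap * active / capacity) time units.
        clock += (service_needs[i] - served) * active / capacity
        served = service_needs[i]
        finish[i] = clock
        active -= 1
    return finish


if __name__ == "__main__":
    print(processor_sharing_finish_times([1.0, 2.0, 3.0]))  # [3.0, 5.0, 6.0]
```

With three jobs needing 1, 2, and 3 units of service, the shortest job finishes at time 3 rather than time 1, because it only receives a third of the capacity while all three are active.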

There may be a single server, where a line of people or items must go through a single bottleneck, or parallel servers, where the same line is served by several servers. Each of multiple servers may also have its own queue or line. In a tandem queue, the servers are arranged in series, and each customer passes through them one after another.

Balking is when a customer decides not to wait for service because the wait time threatens to be too long. Reneging is similar, but happens when a customer who has already waited decides to leave because they’ve wasted too much time. Jockeying is when a customer switches between queues in a system where each server has its own line, trying to get the shortest wait possible.
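These three customer behaviors can be expressed as simple decision rules. The sketch below is only one possible way to model them: the thresholds (`max_estimate`, `patience`) and the rule for switching lines are assumptions chosen for illustration.

```python
def customer_outcome(estimated_wait, actual_wait, max_estimate, patience):
    """Classify one customer as 'balked', 'reneged', or 'served'.

    Assumptions: the customer balks if the wait estimated on arrival
    exceeds max_estimate, and reneges if the actual wait exceeds patience.
    """
    if estimated_wait > max_estimate:
        return "balked"    # never joined the queue
    if actual_wait > patience:
        return "reneged"   # joined, then gave up and left
    return "served"


def should_jockey(people_ahead, other_line_length):
    """Switch lines only if joining the back of the other line means fewer people ahead."""
    return other_line_length < people_ahead


if __name__ == "__main__":
    print(customer_outcome(20, 0, max_estimate=15, patience=10))   # balked
    print(customer_outcome(10, 12, max_estimate=15, patience=10))  # reneged
    print(should_jockey(people_ahead=4, other_line_length=2))      # True
```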

## Standard Notation for Queuing Theory

To make life easier, there’s standard notation for queuing theory that is used across the board. These standard symbols include:

• λ: the mean arrival rate.
• μ: the mean service rate.
• n: the number of people in the system.
• A: the arrival process probability distribution.
• B: the service process probability distribution.
• C: the number of servers.
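• D: the system capacity, that is, the maximum number of customers allowed in the system at any given time, waiting or being served (arrivals beyond this limit are turned away).
• E: the size of the population that customers come from, that is, the maximum number of customers total.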
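As a small example of how these symbols fit together, the sketch below stores λ, μ, C, D, and E in one record and computes the utilization λ / (C × μ), a standard quantity that says what fraction of the total service capacity the arrivals use up. The class and field names are hypothetical, and treating D and E as unlimited by default is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class QueueModel:
    arrival_rate: float            # λ: mean arrival rate
    service_rate: float            # μ: mean service rate of one server
    servers: int = 1               # C: number of servers
    capacity: float = math.inf     # D: most customers allowed in the system
    population: float = math.inf   # E: most customers in total

    def utilization(self):
        """Fraction of total service capacity used: λ / (C × μ)."""
        return self.arrival_rate / (self.servers * self.service_rate)

    def keeps_up(self):
        """True if the servers can, on average, handle the arrivals."""
        return self.utilization() < 1


if __name__ == "__main__":
    bank = QueueModel(arrival_rate=12, service_rate=5, servers=3)
    print(bank.utilization(), bank.keeps_up())
```

When the utilization is at or above 1 and nothing limits the number of customers, the line tends to grow without bound, because work arrives at least as fast as it can be served.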
