Control of a Multi-Server Queue with Failures

One of my students was asked to analyze a queueing system at a mental healthcare system. There were 5 psychiatrists available for intakes of new patients. There were complaints about the time patients had to wait for their first intake. In other words, the waiting time in this queue was considered to be too long. To reduce this, the psychiatrists had various suggestions how to improve this system, but had not the tools to analyze the consequences of these suggestions. With the above simulator my student could help them out.

These were the suggestions.

  1. Not all psychiatrists had the same amount of time available per week to do intakes. This is not a problem during weeks that all are present. However, psychiatrists tend to take holidays, visit conferences, and so on. So, if the psychiatrist with the most intakes goes on leave, this might affect the queue length. What would be the influence of allocating the same amount of time for intakes for all psychiatrists? Comment: this system is an example of a queueing system with multiple servers with failures. Such queueing systems are exceedingly hard to analyze analytically.
  2. The psychiatrists tended to plan their holidays after each other, to reduce the variation in the service capacity. What if they would synchronize their holidays, to the extent possible, rather than spread their holidays?
  3. Finally, suppose the psychiatrists would do 2 more intakes per week in busy times and 2 less in quiet weeks. Assuming that the system is stable, i.e., the service capacity exceeds the demand, than on average the psychiatrists would not do more intakes, i.e., their workload would not increase, but the queue length may be controlled better.

We assume that slightly less than 12 new patients arrive per week.

import numpy as np
import matplotlib.pylab as plt
np.random.seed(3)
a = np.random.poisson(11.8, 1000)

We also need the recursion of Simulation Analysis of a single server queue.

def computeQ(a, C, Q0=0): #  initial queue length is 0
    N = len(a)
    Q = np.empty(N) # make a list to store the values of  Q
    d = np.empty(N) # make a list to store the values of  d
    Q[0] = Q0
    for n in range(1,N):
        d[n] = min( Q[n-1]+ a[n], C[n])
        Q[n] = Q[n-1]+ a[n] - d[n]
    return Q

Analysis of Suggestion 1 and 2

We make a matrix to model the service capacity of the psychiatrists. The first row, with index 0, models the amount of service of the first psychiatrist per week, and so on. Lets assume this

def unbalanced():
    p = np.empty([5,len(a)])
    p[0,:] = 1.*np.ones_like(a)
    p[1,:] = 1.*np.ones_like(a)
    p[2,:] = 1.*np.ones_like(a)
    p[3,:] = 3.*np.ones_like(a)
    p[4,:] = 9.*np.ones_like(a)
    return p

p = unbalanced()

Thus, psychiatrists 1, 2, and 3, do just one intake per week, and the fifth does 9.

We next need to implement their holidays. This is easily done by setting the capacity of a certain psychiatrist to 0 in a certain week. Lets assume that just one psychiatrist is on leave in a week, that they have one week per five weeks off, and that the psychiatrist’s holidays rotate. Some thought will show that the following code reflects these assumptions in the service.

for j in range(len(a)):
    Id = j%5
    p[Id,j] = 0

Hence, the total average capacity must be \(4/5 \cdot (1+1+1+3+9) = 12\) patients per week.

The total service capacity available in a week is the collumn sum of the matrix p.

s = np.sum(p,axis = 0)

Lets compute a sample path of the queue length for later purposes.

Qub= computeQ(a,s)

Now consider balanced service.

def balanced():
    p = np.empty([5,len(a)])
    p[0,:] = 2.*np.ones_like(a)
    p[1,:] = 2.*np.ones_like(a)
    p[2,:] = 3.*np.ones_like(a)
    p[3,:] = 4.*np.ones_like(a)
    p[4,:] = 4.*np.ones_like(a)
    return p

p = balanced()

and balanced holidays:

for j in range(len(a)):
    Id = j%5
    p[Id,j] = 0

s = np.sum(p,axis = 0)
Qbb= computeQ(a,s)

We also need unbalanced holiday plans. As an extreme case, send all the psychiatrists on holiday in the same week.

p = unbalanced()
for j in range(int(len(a)/5)):
     p[:,5*j] = 0
s = np.sum(p,axis = 0)
Quu= computeQ(a,s)

p = balanced()
for j in range(int(len(a)/5)):
     p[:,5*j] = 0
s = np.sum(p,axis = 0)
Qbu= computeQ(a,s)

plt.plot(Qbb, label="bal cap, bal leave")
plt.plot(Qub, label="unbal capacity, bal leave")
plt.plot(Qbu, label="bal cap, unbal leave")
plt.plot(Quu, label="unbal capacity, unbal leave")
plt.legend(loc="upper left")# , bbox_to_anchor=(1,1))
plt.show()
../../_images/multiServerWithFailures_figure9_1.png

Well, that was a lot of work, just to see that suggestions 1 and 2 do not significantly affect the behavior of the queue length process.

Suggestion 3

The third suggestion is do more intakes when it is busy, and do less when it is quiet. A simple rule to implement this is by considering last week’s queue \(q:=Q[n-1]\). If \(q<12\), i.e., the service capacity of one week, then do \(e\) intakes less. Here, \(e=1\), or 2, or perhaps a larger number; it corresponds to the amount of control we want to exercise. When \(q>24\), i.e., larger than two weeks of intakes, do \(e\) intakes more.

def computeQExtra(a, s, e, Q0=0): #  initial queue length is 0
    N = len(a)
    Q = [0]*N # make a list to store the values of  Q
    Q[0] = Q0
    for n in range(1,N):
        q = Q[n-1]
        if  q <12:
            s[n] = max(s[n]-e,0) # protect from becoming negative
        elif q >= 24:
            s[n] += e
        Q[n] = max( q +a[n] - s[n], 0)
    return Q

Let’s consider five different control levels, \(e=1\), \(e=2\), \(e=5\) (i.e, all psychiatrists do one extra intake). The previous sections showed that it is safe to disregard the holiday plans, so just assume a flat service capacity of 12 intakes a week.

a = np.random.poisson(11.8, 1000)
s = 12.*np.ones_like(a)
Q= computeQ(a,s)
Qe1= computeQExtra(a,s,1)
Qe2= computeQExtra(a,s,2)
Qe5= computeQExtra(a,s,5)

plt.ylim(0, 200)
plt.plot(Q, label="Q")
plt.plot(Qe1, label="Qe1")
plt.plot(Qe2, label="Qe2")
plt.plot(Qe5, label="Qe5")
plt.legend()
../../_images/multiServerWithFailures_figure11_1.png

The difference is striking indeed. The queue does no longer explode. Even taking \(e=1\) has a large influence. Moreover, doing just one extra intake a week appears to be quite manageable.

It is intesting to see the effect of the seed of the random variables.

np.random.seed(3)
a = np.random.poisson(11.8, 1000)
s = 12.*np.ones_like(a)
Q= computeQ(a,s)
Qe1= computeQExtra(a,s,1)
Qe2= computeQExtra(a,s,2)
Qe5= computeQExtra(a,s,5)

plt.ylim(0, 200)
plt.plot(Q, label="Q")
plt.plot(Qe1, label="Qe1")
plt.plot(Qe2, label="Qe2")
plt.plot(Qe5, label="Qe5")
plt.legend()
../../_images/multiServerWithFailures_figure12_1.png

This is, for me at least, quite unexpected, this difference is truely striking. What happens if we increase the simulation length by a factor 10?

np.random.seed(3)
a = np.random.poisson(11.8, 10000)
s = 12.*np.ones_like(a)
Q= computeQ(a,s)
Qe1= computeQExtra(a,s,1)
Qe2= computeQExtra(a,s,2)
Qe5= computeQExtra(a,s,5)

plt.ylim(0, 200)
plt.plot(Q, label="Q")
plt.plot(Qe1, label="Qe1")
plt.plot(Qe2, label="Qe2")
plt.plot(Qe5, label="Qe5")
plt.legend()
../../_images/multiServerWithFailures_figure13_1.png

So, we should not rely on too short simulations!

Some further experiments to see whether this policy works a well as it appears. If we assume that the psychiatrists still take holidays, than the service capacity becomes variable. Assume a quite dramatic type of variation, that is, Poisson services.

a = np.random.poisson(11.8, 10000)
s = np.random.poisson(12, 10000)
Q= computeQ(a,s)
Qe1= computeQExtra(a,s,1)
Qe2= computeQExtra(a,s,2)
Qe5= computeQExtra(a,s,5)

plt.ylim(0, 200)
plt.plot(Q, label="Q")
plt.plot(Qe1, label="Qe1")
plt.plot(Qe2, label="Qe2")
plt.plot(Qe5, label="Qe5")
plt.legend()
../../_images/multiServerWithFailures_figure14_1.png

The policy with \(e=1\) does not work that well in this case.

a = np.random.poisson(11.8, 10000)
s = np.random.poisson(12, 10000)
Q= computeQ(a,s)
Qe1= computeQExtra(a,s,1)
Qe2= computeQExtra(a,s,2)
Qe5= computeQExtra(a,s,5)

plt.ylim(0, 200)
plt.plot(Q, label="Q")
plt.plot(Qe1, label="Qe1")
plt.plot(Qe2, label="Qe2")
plt.plot(Qe5, label="Qe5")
plt.legend()
../../_images/multiServerWithFailures_figure15_1.png

What is the load is much less, i.e., 11 rather than 11.8?

np.random.seed(4)
a = np.random.poisson(11, 10000)
s = 12.*np.ones_like(a)
Qe1= computeQExtra(a,s,1)
Qe2= computeQExtra(a,s,2)
Qe5= computeQExtra(a,s,5)

plt.ylim(0, 200)
plt.plot(Q, label="Q")
plt.plot(Qe1, label="Qe1")
plt.plot(Qe2, label="Qe2")
plt.plot(Qe5, label="Qe5")
plt.legend()
../../_images/multiServerWithFailures_figure16_1.png

In all cases, the policy with \(e=1\) appears to work quite ok.

Computing the waiting times

Work in progress.

a = np.array([2,3,2,1,0,0,1,0,0])
A = np.cumsum(a)
Ainv = np.repeat(np.arange(len(a)), a)

s = [1]*len(a)
Q = computeQ(a,s)
D = A - Q

d = np.zeros(len(D), dtype=int)
d[1:] = np.diff(D)
Dinv = np.repeat(np.arange(len(d)), d)
print(Ainv)
print(Dinv)


f, ax = plt.subplots(2, sharex = True, sharey = True)
plt.xlim(0,10)
plt.ylim(0,10)
ax[0].plot(A, drawstyle='steps-post')
ax[0].plot(D, drawstyle='steps-post')
ax[1].plot(Ainv, drawstyle='steps-post', label = "a")
ax[1].plot(Dinv, drawstyle='steps-post', label = "d")
plt.legend()
[0 0 1 1 1 2 2 3 6]
[1 2 3 4 5 6 7]
../../_images/multiServerWithFailures_figure17_1.png