Abstract
The N-system with independent Poisson arrivals and exponential server-dependent service times under the first come first served and assign to the longest idle server policy has an explicit steady-state distribution. We scale the arrival rate and the number of servers simultaneously, and obtain the fluid and central limit approximations for the steady state. This is the first step toward exploring the many-server scaling limit behavior of general parallel service systems.
1 Introduction
In this paper we study the many-server N-system shown in Fig. 1, with Poisson arrivals and exponential service times, under the first come first served and assign to the longest idle server policy (FCFS–ALIS), as the number of servers becomes large. Before describing the model in detail, we will first discuss our motivation for studying this system.
The N-system is one of the simplest special cases of skill-based routing in parallel server systems, as defined in [9, 15] and further studied in [4, 6, 7, 12,13,14, 17, 19, 20, 22, 23]. The general model has customers of types \(i=1,\ldots ,I\), servers of types \(j=1,\ldots ,J\), and a bipartite compatibility graph G, where \((i,j)\in G\) if customer type i can be served by server type j. Arrivals are renewal with rate \(\lambda \), where successive customer types are i.i.d. with probabilities \(\alpha _i\). There are a total of n servers, of which \(n \theta _j\) are of type j, and service times are generally distributed with rates \(\mu _{i,j}\). Assume the system is operated under the FCFS–ALIS policy, that is, servers take on the longest waiting compatible customer, and arriving customers are assigned to the longest idle compatible server. For this general system, necessary and sufficient conditions for stability (positive Harris recurrence for given \(\lambda \)), or for complete resource pooling (there exists a critical \(\lambda _0\) such that the system is stable for \(\lambda <\lambda _0\), and the queues of all customer types diverge for \(\lambda >\lambda _0\)) cannot be determined by the first moment information alone (as demonstrated by an example of Foss and Chernova [9], which is further discussed in [16]). In particular, under FCFS–ALIS, calculation of the matching rates \(r_{i,j}\), which are the long-term average fractions of services performed by servers of type j on customers of type i, is in general intractable.
In the special case that service rates depend only on the server type, and not on the customer type, with Poisson arrivals and exponential service times, the system has a product form stationary distribution, as given in [2]. In that case matching rates can be computed from the stationary distribution.
The following conjecture was made in [4]. If the system is stable and has complete resource pooling for given \(\lambda ,\,n\), and we let both become large together, the behavior of the system simplifies: there will exist \(\beta _j\) such that servers of type j perform a fraction \(\beta _j\) of the services, and the matching rates \(r_{i,j}\) will converge to the rates for the FCFS infinite matching model with \(G,\alpha ,\beta \), as calculated in [1] (see also [5]). The conjecture is based on the following heuristic argument: in steady state the times that each server becomes available form a stationary process which is only mildly correlated with the other servers, and so servers become available approximately as a superposition of almost independent stationary processes, which in the many-server limit becomes a Poisson process, and server types are then i.i.d. with probabilities \(\beta _j\), while customer types arrive as an i.i.d. sequence with probabilities \(\alpha _i\). This corresponds exactly to the model of FCFS infinite matching. Under FCFS–ALIS it is also possible that while the system is stable, service by all the servers is not pooled. Instead it is decoupled: the bipartite compatibility graph breaks into two or more subgraphs, and when the system is operated under FCFS–ALIS the links connecting the subgraphs are only rarely used. The conjecture then is that under many-server scaling this decoupling is the same as in the FCFS infinite matching model, with the same matching rates.
In our current study of the many-server N-system we shall verify the conjectured many-server behavior for this simple parallel server system. To do so we start from the known stationary distribution of the N-system with many servers, as derived in [2], and study its behavior as \(n\rightarrow \infty \). As it turns out, the product form stationary distribution, even for this simple case, is far from simple, and the derivations of limits, which use summations over server permutations and asymptotic expansions of various expressions, are quite laborious. We feel that this emphasizes the difficulty in verifying the conjectured behavior of the general system, which remains intractable at this time.
We mention that the N-system with just two servers has been the subject of several papers, including [3, 10, 11, 19, 20]. In this paper, our focus is on the N-system with many servers under FCFS–ALIS and its limiting behavior.
The rest of the paper is structured as follows. In Sect. 2 we describe the model, and in Sect. 3 we use some heuristic arguments to obtain a guess at the limiting behavior, where we distinguish between pooled and decoupled modes. In Sect. 4 we verify the heuristic guess and obtain the stationary behavior under many-server scaling. In Sect. 5 we illustrate our results with some numerical examples. To improve the readability of the paper we have put all the proofs for Sect. 4 in the Appendix.
2 The model
In our N-system, customers of types \(c_1\) and \(c_2\) arrive as independent Poisson streams, with rates \(\lambda _{1},\lambda _{2}\). There are skill-based parallel servers, \(n_1\) servers of type \(s_1\) which are flexible and can serve both types, and \(n_2\) servers of type \(s_2\) which can only serve type \(c_1\) customers. In our notation, \(c_1\) customers and \(s_1\) servers are flexible, while \(c_2\) customers and \(s_2\) servers are inflexible. (\(s_2\) servers cannot serve \(c_2\) customers.) We assume service times are all independent exponential, with server-dependent rates. The service rate of an \(s_1\) server is \(\mu _1\); the service rate of an \(s_2\) server is \(\mu _2\). See Fig. 1. We let \(\lambda =\,\lambda _1+\,\lambda _2,\,n=n_1+n_2\). The service policy is FCFS–ALIS.
The system is Markovian. In [2, 3, 21] the following state description for the skill-based parallel server systems under the FCFS–ALIS policy was used: imagine the customers arranged in a single queue by order of arrival, and servers are attached to the customers which they serve, and the remaining idle servers are arranged by increasing idle time in front of the queue; see Fig. 2. The state is then \(\mathfrak {s}=(S_1,q_1,S_2,q_2,\ldots ,S_{n-i},q_{n-i},S_{n-i+1},\ldots ,S_n)\), where \(S_1,\ldots ,S_n\) is a permutation of the n servers; the first \(n-i\) servers are the ordered busy servers, and the last i servers are the ordered idle servers, and where \(q_j,\,j=1,\ldots , n-i\), are the queue lengths of the customers waiting for one of the servers \(S_1,\ldots , S_j\), and skipped (could not be served) by servers \(S_{j+1},\ldots ,S_n\). When service rates depend only on the servers, arrivals are Poisson, and services are exponential, this description is Markovian, as shown in [21]. The reason is as follows: given the permutation of servers, we know for each \(q_j\) exactly what types of customers may be present, and since those customers are in the order in which they arrived, the type of each of them is randomly distributed according to the initial frequencies of customer types, and independent of all others. Hence, each server with a queue in front will have to go through an independent sequence of trials as he scans the customers FCFS until finding a match, and the specific sequences of customer types in the queues are not relevant to the steady state of the scan. This yields Markovian transition probabilities.
For the special case of the N-system, in steady state, the following three random quantities are important: \(i_1=I_1(\mathfrak {s})\), the number of idle servers of type \(s_1\), \(i_2=I_2(\mathfrak {s})\), the number of idle servers of type \(s_2\), and \(k=K(\mathfrak {s}) \ge 0\), the number of servers of type \(s_2\) which follow the last server of type \(s_1\) in the sequence \(S_1,\ldots ,S_n\). An incoming \(c_2\) customer has to skip \(k\,s_2\) servers and find the last \(s_1\) server to be served. We let \(i=I(\mathfrak {s})\) be the total number of idle servers in steady state. Because of the structure of the N-system and the FCFS–ALIS policy, the following properties hold for \(i=0,\ldots ,n\) and \(k=0,\ldots ,n_2\):
(i) There are no customers waiting for any server which precedes the last \(s_1\) server in the permutation. In other words, for all \(j < \min (n-k, n-i)\) we have \(q_j=0\). In particular, if there is an idle server of type \(s_1\) (meaning \(i > k\)), then there are no waiting customers at all.
(ii) If there are any idle servers, then there are no type \(c_1\) customers waiting for service; in other words, if \(i>0\), then all the waiting customers are of type \(c_2\).
(iii) If there are no idle servers (all servers are busy), then only the last queue can contain type \(c_1\) customers; in other words, if \(i=0\), then the last queue may contain customers of both types, but all the other waiting customers are of type \(c_2\).
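Denote
\[\alpha =\frac{\lambda _1}{\lambda },\qquad \theta =\frac{n_1}{n},\qquad \rho =\frac{\lambda }{n_1\mu _1+n_2\mu _2},\qquad \delta =\frac{\lambda _2}{n_1\mu _1}.\]
Then a necessary and sufficient condition for stability is
\[\rho <1,\qquad \delta <1.\]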
Throughout the paper, we assume the above stability condition. For the stable system, define \(\beta \) as the long-term fraction of customers served by servers of type \(s_1\), and \(1-\beta \) the long-term fraction of customers served by servers of type \(s_2\). Since type \(s_1\) servers are the only ones that can serve type \(c_2\) customers, we must have \(\beta \ge 1-\alpha \), or, equivalently, \(\alpha +\beta \ge 1\). The stable system under FCFS–ALIS may operate in two different modes: it may be that servers of both types share the service of customers of type \(c_1\), in which case \(\beta > 1-\alpha \) and we say that resource pooling occurs for large n, or it may be the case that servers of type \(s_1\) serve almost exclusively customers of type \(c_2\), and almost all the service of customers of type \(c_1\) is done by servers of type \(s_2\), in which case \(\beta \approx 1-\alpha \) for large n, and we say that the system is decoupled.
Using the results of [1, 2] we can then write the exact stationary distribution of this system. We wish to show that, as the arrival rate and the number of servers increase, the system simplifies, and we get very precise many-server scaling limits, and in particular we find sharp conditions for pooled or decoupled modes of operation. We will investigate the behavior of the system when we fix the values of \(\alpha ,\theta ,\rho \), and let \(n \rightarrow \infty \). To be precise, we shall then have n, \(n_1=\lceil \theta n \rceil ,\,n_2 = n-n_1\), \(\lambda =\rho (\mu _1 n_1+\mu _2 n_2),\,\lambda _1=\alpha \lambda ,\,\lambda _2=(1-\alpha )\lambda \), all of which go to infinity. Average processing times \(1/\mu _1,1/\mu _2\) are fixed and not scaled.
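The scaling just described can be sketched in a few lines of Python. This is only an illustration: the names `ScaledNSystem` and `scaled_n_system` are hypothetical and do not come from the paper, and the function simply evaluates the formulas for \(n_1, n_2, \lambda , \lambda _1, \lambda _2\) given above for one value of n.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaledNSystem:
    """Parameters of the n-th system in the many-server sequence (hypothetical container)."""
    n: int
    n1: int      # number of flexible servers (type s_1)
    n2: int      # number of inflexible servers (type s_2)
    lam: float   # total arrival rate
    lam1: float  # arrival rate of type c_1 customers
    lam2: float  # arrival rate of type c_2 customers
    mu1: float   # service rate of an s_1 server (not scaled)
    mu2: float   # service rate of an s_2 server (not scaled)


def scaled_n_system(n, alpha, theta, rho, mu1, mu2):
    """Fix alpha, theta, rho and return the parameters of the system with n servers."""
    n1 = math.ceil(theta * n)            # n_1 = ceil(theta * n)
    n2 = n - n1                          # n_2 = n - n_1
    lam = rho * (mu1 * n1 + mu2 * n2)    # lambda = rho * (mu_1 n_1 + mu_2 n_2)
    return ScaledNSystem(n, n1, n2, lam, alpha * lam, (1.0 - alpha) * lam, mu1, mu2)
```

Calling `scaled_n_system(n, ...)` for increasing n with the same `alpha`, `theta`, `rho` produces the sequence of systems whose limit is studied; the service rates stay fixed while the arrival rates grow linearly in n.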
3 Heuristic fluid calculations
In this section we use some heuristic arguments to guess at the fluid behavior of the many-server system. In particular, we calculate a guess for some key quantities. Using these quantities we give a heuristic description of how the system will behave under the FCFS–ALIS policy, in the many-server case, distinguishing between pooled and decoupled modes of operation. The main part of the paper, in Sect. 4, is the verification of these guesses.
We assume some fixed \(\rho<1,\,\delta <1\) so that the system under FCFS–ALIS is stable. We then observe that under many-server scaling there will almost always be some idle servers available of both types and customers will almost never wait, so that they will enter service immediately upon arrival. At the same time, when a server completes a service there will almost never be any waiting customers, so, after almost every service completion, the server will experience some idle time. Because our policy is ALIS, when a server becomes idle, he always joins the end of a queue of idle servers. In a slight abuse of the notation, we reuse \(I_1, I_2\) and K to denote, respectively, the stationary number of idle servers of type \(s_1\), the stationary number of idle servers of type \(s_2\), and the stationary number of servers of type \(s_2\) which follow the last server of type \(s_1\) in \(\mathfrak {s}\).
When the system is stationary, the sample path of each server will consist of a sequence of cycles, each of which consists of a single service period followed by an idle period (which can be equal to 0). We denote the generic idle periods between services by \(Y_1,Y_2\), and their expected values by \(T_1,T_2\). We can bound the values of \(T_1,\,T_2\) as follows: servers of type \(s_2\) can serve only customers of type \(c_1\), some of which may also be served by servers of type \(s_1\). Hence, the arrival rate per server is no larger than \(\lambda _1/n_2\), and so the average interval between arrivals is no less than \(n_2/\lambda _1\), and the average service time per arrival is \(1/\mu _2\); hence, \(T_2 \ge n_2/\lambda _1 - 1/\mu _2\). Servers of type \(s_1\) serve all customers of type \(c_2\) and may in addition serve some customers of type \(c_1\). Hence, the arrival rate per server is no less than \(\lambda _2/n_1\), and so the average interval between arrivals is no larger than \(n_1/\lambda _2\). The average service time per arrival is \(1/\mu _1\); hence, \(T_1 \le n_1/\lambda _2 - 1/\mu _1\). Hence, we have found that the stationary expected idle time satisfies
\[T_1 \le \frac{n_1}{\lambda _2} - \frac{1}{\mu _1},\qquad T_2 \ge \frac{n_2}{\lambda _1} - \frac{1}{\mu _2}. \tag{1}\]
We now distinguish three cases for the values of the parameters:
Case I: \(\frac{n_2}{\lambda _1} - \frac{1}{ \mu _2} > \frac{n_1}{\lambda _2} - \frac{1}{ \mu _1}\)
In this case, by (1) we will have \(T_2>T_1\), and the system will decouple. The reasoning is as follows: because our policy is ALIS, each server, on completion of service, joins the end of the queue of idle servers, and his idle period consists of waiting until all the servers ahead of him who are of his type, as well as all the other servers that can serve customers who are compatible with him, are assigned to customers, and he is then assigned to the next compatible customer.
At the end of his idle period, a server of type \(s_i\) has been idle for \(Y_i\), and he is then the longest idle server of his type. If we assume the idle times \(Y_i\) converge to their means \(T_i\) as the system becomes large, \(i=1,2\), then since \(T_2 > T_1\), we can say that most of the time the longest idle server will be of type \(s_2\). Therefore almost all the arriving customers of type \(c_1\) will be assigned to a server of type \(s_2\), and so servers of type \(s_1\) will serve almost only customers of type \(c_2\).
This implies that in Case I the system under many-server scaling will behave like two separate M/M/s queues. Because servers of type \(s_2\) serve almost all customers of type \(c_1\), and servers of type \(s_1\) serve all customers of type \(c_2\) and almost none of the customers of type \(c_1\), we have, for large n,
\[\beta \approx 1-\alpha ,\]
and inequalities (1) will be close to equalities, and we will have (by Little’s law)
\[I_1 \approx \lambda _2 T_1 \approx n_1 - \frac{\lambda _2}{\mu _1},\qquad I_2 \approx \lambda _1 T_2 \approx n_2 - \frac{\lambda _1}{\mu _2}.\]
We can also estimate the value of K, the location of the first type \(s_1\) server. Since service completions of customers of type \(c_1\) occur at rate \(\lambda _1\) and almost all of those are served by type \(s_2\), and service completions of customers of type \(c_2\) occur at rate \(\lambda _2\) and all of those and almost no others are served by type \(s_1\), servers of type \(s_2\) and \(s_1\) join the end of the queue of idle servers at the ratio of \(\lambda _1/\lambda _2\), so \((I_2-K)/I_1 \approx \lambda _1/\lambda _2\) and
\[K \approx I_2 - \frac{\lambda _1}{\lambda _2} I_1 \approx \lambda _1 \left( \frac{n_2}{\lambda _1} - \frac{1}{\mu _2} - \frac{n_1}{\lambda _2} + \frac{1}{\mu _1} \right) .\]
It is worthwhile to note that the condition of Case I that implies decoupling is not simply that \(\delta > \rho \), which is equivalent to \(\frac{\lambda _2}{n_1\mu _1}>\frac{\lambda _1}{n_2\mu _2}\) (the load of customers of type \(c_2\) on servers of type \(s_1\) is higher than the load of customers of type \(c_1\) on servers of type \(s_2\)). In fact, under FCFS–ALIS, servers of both types may share service of customers of type \(c_1\) even when \(\delta >\rho \). To explain, when \(\delta > \rho \), under decoupled service, the load and therefore the busy time percentage of type \(s_2\) servers is smaller than the load of type \(s_1\) servers, but, if \(\mu _1<\mu _2\), the idle time of type \(s_2\) servers (\(Y_2\)) could be shorter than that of type \(s_1\) servers (\(Y_1\)). In that case, under FCFS–ALIS the work of \(c_1\) customers will be shared by both types of servers.
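As a rough numerical sketch of the Case I heuristics, the following Python function evaluates the decoupled estimates. It assumes that the bounds on \(T_1,T_2\) hold with equality, that Little's law gives the number of idle servers of each type as (rate at which servers of that type become idle) times (mean idle time), and that K follows from \((I_2-K)/I_1 \approx \lambda _1/\lambda _2\). The function name `decoupled_estimates` is hypothetical.

```python
def decoupled_estimates(n1, n2, lam1, lam2, mu1, mu2):
    """Heuristic Case I (decoupled) estimates; a sketch, not the exact limits of Sect. 4.

    Returns (t1, t2, i1, i2, k): mean idle times, mean numbers of idle servers
    of each type, and the estimated number K of s_2 servers behind the last s_1.
    """
    # Bounds on the mean idle times, taken as equalities in the decoupled regime.
    t1 = n1 / lam2 - 1.0 / mu1
    t2 = n2 / lam1 - 1.0 / mu2
    # Little's law: s_1 servers become idle at rate lam2, s_2 servers at rate lam1.
    i1 = lam2 * t1
    i2 = lam1 * t2
    # (I_2 - K) / I_1 is approximately lam1 / lam2.
    k = i2 - (lam1 / lam2) * i1
    return t1, t2, i1, i2, k
```

For example, with \(n_1=n_2=50\), \(\lambda _1=20\), \(\lambda _2=40\) and \(\mu _1=\mu _2=1\), the function returns idle times 0.25 and 1.5, about 10 and 30 idle servers, and \(K\approx 25\), which agrees with \(\lambda _1(T_2-T_1)\).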
The stationary behavior of the decoupled system is described in Fig. 3. In this figure we have, from left to right, a section of busy servers of both types serving all the customers in the system, followed by a section of more recent queueing idle servers of mixed types, followed by a section of the oldest idle servers, all of which are of type \(s_2\). Servers that complete service join the queue of idle servers at its left end. Arriving customers of type \(c_1\) pick the oldest waiting server, which is of type \(s_2\); arriving customers of type \(c_2\) skip all the K servers of type \(s_2\), and pick the oldest idle server of type \(s_1\). Note that the idle servers of both types are mixed in the middle section, and \(I_2\ne I_1+K\).
The exact limiting behavior under many-server scaling for Case I is derived in Sect. 4.4, where the heuristic calculations are verified. Our main results for Case I are:
- The probability that \(K=0\) converges to 0 as \(n\rightarrow \infty \), and so almost every customer of type \(c_1\) is served by a server of type \(s_2\).
- The two sets of servers and their customers behave like independent \({M/M/}n_1\) and \({M/M/}n_2\) queues.
Case II: \(\frac{n_2}{\lambda _1} - \frac{1}{ \mu _2} < \frac{n_1}{\lambda _2} - \frac{1}{ \mu _1}\)
In this case, we argue that \(T_1 \rightarrow T_2\) as \(n\rightarrow \infty \). Assume to the contrary that \(T_1 > T_2\) as \(n\rightarrow \infty \). Then, for large n, we should have that most of the time the longest idle server will be of type \(s_1\). But \(s_1\) servers can serve all customers, and so by ALIS \(s_1\) servers will serve almost all the customers in the system, which is a contradiction. Now assume that \(T_2>T_1\) as \(n\rightarrow \infty \). But in that case we already argued that the system will decouple and so the inequalities in (1) will hold as equalities, which, since we are in Case II, contradicts \(T_2>T_1\). Therefore, there is no decoupling in Case II, and we conclude that, for large n,
\[\frac{n_2}{\lambda _1} - \frac{1}{ \mu _2}< T_2 \approx T_1 < \frac{n_1}{\lambda _2} - \frac{1}{ \mu _1}.\]
Our first conclusion from \(T_2 > \frac{n_2}{\lambda _1} - \frac{1}{ \mu _2}\) is that servers of type \(s_2\) do not serve all the customers of type \(c_1\), so \(1-\beta < \alpha \), i.e., \(\alpha +\beta >1\), and from \(T_1 < \frac{n_1}{\lambda _2} - \frac{1}{\mu _1}\) we conclude that servers of type \(s_1\) serve some customers of type \(c_1\) as well as customers of \(c_2\) (again, \(\beta > 1-\alpha \)).
The following is a heuristic description of the behavior of the system in Case II under many-server scaling. When n increases, the (random) number of idle servers becomes large, of order O(n), and successive servers join the queue of idle servers at short intervals (of expected length \(1/\lambda \), which is O(1 / n)). They will spend a time of O(1) to traverse the queue and will then reach the head of the queue of idle servers with short intervals between them. At this point they will need to wait for a compatible customer, and this waiting time does depend on the type of server, but because \(\lambda \) is large, once a server is at the head of the line his wait for a compatible customer will be short; hence, successive server arrivals to the idle queue are close to each other and so are their departures from the idle queue. So, as \(n\rightarrow \infty \), not only do we have \(T_1=T_2\), but the idle times \(Y_1\) and \(Y_2\) also have the same distribution, and K is of order O(1). This heuristic description will be verified in Sect. 4.
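We denote by T the presumed common value of \(T_1\) and \(T_2\), that is, the average length of the idle time, common to all servers, and we now calculate its value. The average cycle times will be \(1/\mu _1+T\) and \(1/\mu _2+T\). We defined \(\beta \) as the long-run fraction of services performed by \(s_1\) servers, with \(1-\beta \) services by type \(s_2\). The cycle rate of one type \(s_1\) server is \(1/(1/\mu _1+T)\); hence, the processing rate of all type \(s_1\) servers is \(n_1/(1/\mu _1+T)\), which should equal \(\lambda \beta \). Similarly, the flow rate out of all type \(s_2\) servers should equal \(\lambda (1-\beta )\). That is,
\[\frac{n_1}{1/\mu _1+T} = \lambda \beta ,\qquad \frac{n_2}{1/\mu _2+T} = \lambda (1-\beta ). \tag{2}\]
Now we solve for T and \(\beta \) to obtain
\[\beta = \frac{n_1}{\lambda \left( 1/\mu _1+T\right) },\]
and, by adding the two equations in (2), a quadratic equation for T:
\[g(T) = \lambda T^2 + \left( \lambda \left( \frac{1}{\mu _1}+\frac{1}{\mu _2}\right) - n\right) T + \frac{\lambda }{\mu _1\mu _2} - \frac{n_1}{\mu _2} - \frac{n_2}{\mu _1} = 0.\]
Here \(g(0)<0\) because \(\rho <1\), so the equation has one positive and one negative root. Solving for positive T we get
\[T = \frac{n - \lambda \left( \frac{1}{\mu _1}+\frac{1}{\mu _2}\right) + \sqrt{\left( n - \lambda \left( \frac{1}{\mu _1}+\frac{1}{\mu _2}\right) \right) ^2 + 4\lambda \left( \frac{n_1}{\mu _2} + \frac{n_2}{\mu _1} - \frac{\lambda }{\mu _1\mu _2}\right) }}{2\lambda }. \tag{3}\]
Note: for the case of \(\mu _1 = \mu _2 = \mu \) we get \(T=\frac{1-\rho }{\rho }\frac{1}{\mu }\).
From T and Little’s law we can obtain \(m_{i}\), the approximate average number of idle servers in pool \(i, i=1,2\):
\[m_1 = \lambda \beta T = \frac{n_1 T}{T+1/\mu _1},\qquad m_2 = \lambda (1-\beta ) T = \frac{n_2 T}{T+1/\mu _2}.\]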
When \(T_1=T_2\), servers are pooled. Servers share the load, and both types of customers receive similar levels of service. The pooled behavior of the system for FCFS–ALIS under many-server scaling is our main interest in this paper. Figure 4 shows the analog of Fig. 3 for the pooled system. Note that the idle servers of both types are mixed, and \(I_2\ne I_1\).
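The pooled (Case II) fluid quantities can be sketched numerically as follows. The code assumes the two balance equations stated above, \(n_1/(1/\mu _1+T)=\lambda \beta \) and \(n_2/(1/\mu _2+T)=\lambda (1-\beta )\); adding them and clearing denominators gives a quadratic in T whose positive root is used. The function name `pooled_fluid` is hypothetical, and the output is only meaningful when the parameters are in Case II.

```python
import math


def pooled_fluid(n1, n2, lam, mu1, mu2):
    """Heuristic pooled-mode quantities (T, beta, m1, m2); a sketch under stated assumptions."""
    a, b = 1.0 / mu1, 1.0 / mu2          # mean service times
    n = n1 + n2
    # Quadratic obtained from n1/(a+T) + n2/(b+T) = lam:
    #   lam*T^2 + (lam*(a+b) - n)*T + (lam*a*b - n1*b - n2*a) = 0
    qa = lam
    qb = lam * (a + b) - n
    qc = lam * a * b - n1 * b - n2 * a
    if qc >= 0:
        raise ValueError("requires rho < 1 (arrival rate below total service capacity)")
    # qc < 0 guarantees exactly one positive root.
    T = (-qb + math.sqrt(qb * qb - 4.0 * qa * qc)) / (2.0 * qa)
    beta = n1 / (lam * (a + T))          # fraction of services done by s_1 servers
    m1 = n1 * T / (T + a)                # approx. mean number of idle s_1 servers
    m2 = n2 * T / (T + b)                # approx. mean number of idle s_2 servers
    return T, beta, m1, m2
```

With equal service rates the sketch reproduces the special case noted above: for \(n_1=40\), \(n_2=60\), \(\lambda =80\), \(\mu _1=\mu _2=1\) (so \(\rho =0.8\)) it gives \(T=0.25=\frac{1-\rho }{\rho }\frac{1}{\mu }\) and \(\beta =0.4=n_1/n\).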
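Case III: \(\frac{n_2}{\lambda _1} - \frac{1}{ \mu _2} = \frac{n_1}{\lambda _2} - \frac{1}{ \mu _1}\)
This case lies on the boundary of the other two cases. As a sanity check, on the one hand, we see that setting \(T_1 = \frac{n_1}{\lambda _2} - \frac{1}{ \mu _1}\) and \(T_2= \frac{n_2}{\lambda _1} - \frac{1}{ \mu _2}\) would correspond to the values for Case I, and result in \(T_1=T_2\). On the other hand, considering the equation (2) for Case II, if we substitute \(T = \frac{n_1}{\lambda _2} - \frac{1}{ \mu _1}\), then the processing rate of all type \(s_1\) servers becomes
\[\lambda \beta = \frac{n_1}{1/\mu _1+T} = \lambda _2 = (1-\alpha )\lambda ;\]
therefore, \(\alpha +\beta =1\).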
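The three-way distinction of this section can be summarized as a small Python helper that compares the lower bound on \(T_2\) with the upper bound on \(T_1\). This is a sketch: the function name `classify_case` and the tolerance parameter `tol` (used to decide when the two bounds count as equal in floating point) are assumptions introduced for illustration.

```python
def classify_case(n1, n2, lam1, lam2, mu1, mu2, tol=1e-12):
    """Return "I" (decoupled), "II" (pooled) or "III" (boundary) for given parameters."""
    lower_t2 = n2 / lam1 - 1.0 / mu2   # lower bound on the mean idle time of s_2 servers
    upper_t1 = n1 / lam2 - 1.0 / mu1   # upper bound on the mean idle time of s_1 servers
    diff = lower_t2 - upper_t1
    if abs(diff) <= tol:
        return "III"
    return "I" if diff > 0 else "II"
```

For instance, with \(n_1=n_2=50\) and \(\mu _1=\mu _2=1\), the arrival rates \((\lambda _1,\lambda _2)=(20,40)\) fall in Case I, \((40,20)\) in Case II, and \((30,30)\) on the boundary, Case III.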
4 Many-server limit of the stationary distribution
In this section, we keep the stability assumption \(\rho<1,\,\delta <1\) and derive the many-server limit from the exact stationary distributions.
4.1 Exact stationary distributions
We first obtain the stationary distribution for each state \(\mathfrak {s}\). We note that the stationary probabilities depend mainly on the values of \(k,i_1, i_2\). Let \(\mu (S_j)\) denote the service rate of the server at position j.
Theorem 1
The stationary distribution of the state \(\mathfrak {s}\) of the FCFS–ALIS many-server N-system is given by
where B is a normalizing constant.
Proof
This follows for all three parts of (5) by utilizing properties (i),(ii),(iii) in Sect. 2 and substituting into Equation (2.1), Theorem 2.1, in [2]. \(\square \)
Before we manipulate Eq. (5), we introduce a lemma to facilitate the calculation.
Lemma 1
Letting \(A_1,\ldots ,A_m\) denote a permutation of m given positive real numbers \(a_1,\ldots ,a_m\), we have
where \(\mathcal {P}(a_1,\ldots ,a_m)\) denotes the set of all the permutations of \(a_1,\ldots ,a_m\).
Now we can get the joint stationary distribution of \(K,\,I_1,\,I_2\). We denote by \(\pi (k,i_1,i_2)\) the stationary probability of \(K=k\), \(I_1=i_1\) and \(I_2=i_2\).
Theorem 2
The steady-state joint distribution of \(K,\,I_1,\,I_2\) is given by
where \(B_1\) is a normalizing constant.
4.2 The distribution of \((I_1,I_2)\) given K
In this section we obtain the asymptotic distribution of \((I_1,I_2)\) conditional on \(K=k\), as \(n\rightarrow \infty \). We first show that, as \(n\rightarrow \infty \), the probability of no idle servers of type \(s_1\) goes to zero, and so the probability that customers need not wait goes to 1. Next we condition on \(K=k\) and show \(I_1/n {\mathop {\longrightarrow }\limits ^{p}} f_1,\,I_2/n {\mathop {\longrightarrow }\limits ^{p}} f_2\), where
\[f_1 = \frac{T \theta }{T+1/\mu _1},\qquad f_2 = \frac{T (1-\theta )}{T+1/\mu _2},\]
and T is given in (3). Finally, we condition on \(K=k\) and show that the scaled and centered values of \((I_1,I_2)\) converge in distribution to a bivariate normal distribution. Proofs of the following theorems can be found in the Appendix.
Theorem 3
When \(n\rightarrow \infty \), there exists an \(\epsilon >0\) such that
From this theorem we see that when \(n\rightarrow \infty \), \(P(I_1>0)\rightarrow 1\). Therefore, \(P(K=k,I_1>0) \rightarrow P(K=k)\) for any \(0\le k\le I_2\). From Eq. (6), given \(K=k\), the limiting stationary distribution as \(n\rightarrow \infty \) is
Theorem 4
Conditional on \(K=k\), \(\left( \frac{I_1}{n},\,\frac{I_2}{n}\right) \) converges to \((f_1,\,f_2)\) in probability for any \(k\ge 0\). That is, for any \(\epsilon >0\), when \(n\rightarrow \infty \), we have
After showing the fluid limit result, we are now ready to show the central limit result.
Theorem 5
For any \(k\ge 0\), when \(n\rightarrow \infty \), we have
where
Note that the above is consistent with the bivariate normal distribution stated in Sect. 3.
4.3 Case II: Pooled system
Now we consider Case II, where \(\frac{n_2}{\lambda _1} - \frac{1}{ \mu _2} < \frac{n_1}{\lambda _2} - \frac{1}{ \mu _1}\). First we show the limit distribution of K, the location of the first type \(s_1\) server.
Theorem 6
In Case II, for any \(k\ge 0\), as \(n\rightarrow \infty \),
Theorem 6 shows that K converges in distribution to a geometric distribution in Case II, so \(P(K<\infty )=1\). Therefore, we can extend Theorems 4 and 5 into unconditional versions.
Theorem 7
In Case II, as \(n\rightarrow \infty \), K becomes independent of \(I_1\) and \(I_2\). \(\left( \frac{I_1-f_1n}{\sqrt{n}},\,\frac{I_2-f_2n}{\sqrt{n}}\right) \) converges in distribution to the bivariate normal distribution described in (10).
Consider the special case when \(\mu _1=\mu _2=\mu \). Then \(\theta =\beta \), \(f_1 = (1-\rho )\theta \) and \(f_2 = (1-\rho )(1-\theta )\). When \(n\rightarrow \infty \), \(\left( \frac{I_1-(1-\rho )n_1}{\sqrt{n}},\,\frac{I_2-(1-\rho )n_2}{\sqrt{n}}\right) \) converges in distribution to a bivariate normal distribution with mean (0, 0), variance
and correlation
The total idleness has mean of \((1-\rho )n\) and variance of
4.4 Case I: Decoupling to two independent systems
We now assume \(\frac{n_2}{\lambda _1} - \frac{1}{ \mu _2} > \frac{n_1}{\lambda _2} - \frac{1}{ \mu _1}\), where we find that under many-server scaling the system decouples into two independent M/M/s service systems. We first show the following proposition:
Proposition 1
In Case I, as \(n\rightarrow \infty \), we have \(P(\alpha I_1 \ge (1-\alpha ) I_2) = o\left( \frac{1}{\sqrt{n}}\right) \).
We next obtain the conditional distribution \(K|(I_1,I_2)\).
Theorem 8
Given \(I_1=i_1n,I_2=i_2n\), where \(i_1\in (0,\theta ), i_2\in (0,1-\theta )\), and \(i_2>\frac{\alpha }{1-\alpha }i_1\), we have
Therefore, given \((1-\alpha )I_2 > \alpha I_1\), \(P(K=0|I_1,I_2)=o\left( \frac{1}{\sqrt{n}}\right) \). Now we have
That means the number of type \(c_1\) customers served by \(s_1\) servers is no more than \(o(\sqrt{n})\), which cannot affect the fluid scaled mean or the diffusion scaled variance of two independent decoupled systems.
Theorem 9
In Case I, as \(n\rightarrow \infty \),
This is exactly the many-server scaling limiting distribution of the number of idle servers in two independent M/M/s queues, one of which has arrival rate \(\lambda _2\), service rate \(\mu _1\), and \(n_1\) servers; the other has arrival rate \(\lambda _1\), service rate \(\mu _2\), and \(n_2\) servers.
Furthermore, K will then consist of \(I_2\) minus the idle servers of type \(s_2\) which are mingled with the \(I_1\) servers of type \(s_1\). The following calculation obtains the mean and variance of K under many-server scaling. We denote by \(I_{2,1}\) the number of idle servers of type \(s_2\) that are mingled with the \(I_1\) idle servers of type \(s_1\). Since the type \(s_1\) servers join the idle servers with rate \(\lambda _2\) and type \(s_2\) servers join the idle servers with rate \(\lambda _1\), we have
where \(W_i\) are i.i.d. random variables independent of \(I_1\), each of them having the distribution of the number of failures before the first success in a sequence of Bernoulli trials with probability of success \(\frac{\lambda _2}{\lambda _1+\lambda _2}\). We have
Furthermore, as \(n\rightarrow \infty \), centered and scaled \(I_{2,1}\) converges to a normal distribution, and is independent of \(I_2\).
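A hedged sketch of the moment calculation for \(I_{2,1}\): assuming, as the description suggests, that \(I_{2,1}\) is a sum of \(I_1\) i.i.d. copies of W that are independent of \(I_1\), the standard random-sum identities (Wald's identity and the law of total variance) give its mean and variance from the mean and variance of \(I_1\). The function name `mingled_idle_moments` and its arguments are hypothetical; the mean and variance of \(I_1\) must be supplied by the caller.

```python
def mingled_idle_moments(mean_i1, var_i1, alpha):
    """Mean and variance of a sum of I_1 i.i.d. geometric counts W (sketch).

    W counts failures before the first success, with success probability
    p = lambda_2 / (lambda_1 + lambda_2) = 1 - alpha.
    """
    p = 1.0 - alpha
    mean_w = (1.0 - p) / p          # E[W]  = alpha / (1 - alpha) = lambda_1 / lambda_2
    var_w = (1.0 - p) / (p * p)     # Var[W] = alpha / (1 - alpha)^2
    mean = mean_i1 * mean_w                          # Wald's identity
    var = mean_i1 * var_w + var_i1 * mean_w ** 2     # law of total variance
    return mean, var
```

Note that the mean, \(E(I_1)\lambda _1/\lambda _2\), matches the heuristic relation \((I_2-K)/I_1 \approx \lambda _1/\lambda _2\) of Sect. 3.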
It now follows that centered and scaled K also converges to a normal distribution, and centered and scaled \((I_1,I_2,K)\) converge to a multivariate normal distribution. The relevant parameters are
K is correlated with both \(I_1\) and \(I_2\):
4.5 Case III: Slowly decoupling as system becomes large
As \(n\rightarrow \infty \), we have seen that when \(\frac{n_2}{\lambda _1} - \frac{1}{ \mu _2} <\frac{n_1}{\lambda _2} -\frac{1}{ \mu _1} \) (Case II), then \(\frac{K}{n}\rightarrow 0\) in probability, and in fact \(K=O(1)\); when \(\frac{n_2}{\lambda _1} - \frac{1}{ \mu _2} > \frac{n_1}{\lambda _2} -\frac{1}{ \mu _1} \) (Case I), then \(\frac{K}{n}\rightarrow \frac{\lambda _1}{n} \left( \frac{n_2}{\lambda _1} - \frac{1}{ \mu _2} - \frac{n_1}{\lambda _2} +\frac{1}{ \mu _1} \right) > 0\) in probability, and in fact \(K=O(n)\). We now examine Case III, where \(\frac{n_2}{\lambda _1} - \frac{1}{ \mu _2} = \frac{n_1}{\lambda _2} -\frac{1}{ \mu _1} \). We will show that in this case, as n becomes large, with fluid scaling the queues decouple, but with diffusion scaling K has nontrivial behavior.
We first prove a monotonicity result on K as a function of \(\alpha \), which holds for all three cases, I, II, and III. To mark dependence on \(\alpha \) we use the notation \(K_\alpha \).
Proposition 2
Keep all the other parameters fixed and change \(\alpha \). If \(\alpha _1<\alpha _2\), then \(K_{\alpha _1}\) stochastically dominates \(K_{\alpha _2}\).
From the monotonicity and the previous statements for Cases I and II, we conclude:
Corollary 1
In Case III, as \(n\rightarrow \infty \), \(\frac{K}{n} \rightarrow 0\) in probability.
We can in fact derive more precise asymptotic results for \(I_1,I_2,K\) in Case III. We note first that the result of Theorem 5 on the limiting distribution of \( \left( \left. \frac{I_1- m_1 }{\sqrt{n}} ,\frac{I_2-m_2}{\sqrt{n}} \right| K= k \right) \) as \(n\rightarrow \infty \), for any fixed k, is valid not just in Case II, but also in Cases I and III. In the following theorem we investigate the limit, for fixed k, as \(n\rightarrow \infty \), of \( \left( \left. \frac{I_1- m_1 }{\sqrt{n}} ,\frac{I_2-m_2}{\sqrt{n}} \right| K= kn \right) \).
Theorem 10
For any \(k\in \left[ 0,1-\theta -\left[ \frac{r-\theta \mu _1}{\mu _2}\right] ^+\right) \), as \(n\rightarrow \infty \), we have
where
where \(f_{1,k}=\frac{T \theta }{T+1/\mu _1},\,f_{2,k}=\frac{T (1-\theta -k)}{T+1/\mu _2}+k\), and \(T>0\) solves
Note that \(f_{i,0}\) equals \(f_i\), defined in Sect. 4.2, for \(i=1,2\). So when \(k=0\), Theorem 10 agrees with Theorem 5. We can now use these results to obtain the centered and scaled limiting behavior of K in Case III.
Theorem 11
In Case III, as \(n\rightarrow \infty \), \(\frac{K}{\sqrt{n}}\) converges to a half truncated normal distribution with density function
where \(\sigma _K^2= \alpha \left( \frac{\lambda }{n} \left( \frac{1}{\mu _2}-\frac{1}{\mu _1}\right) +\frac{\theta }{(1-\alpha )^2}\right) \).
The result of Theorem 11 in combination with Theorem 10 should in principle allow us to obtain the joint distribution of \((I_1,I_2)\). Its centered and scaled limit is, however, not a bivariate normal distribution, and too messy to write down. Theorem 11 directly implies that \(P(K=0)\rightarrow 0\) as \(n\rightarrow \infty \). That means the proportion of type \(c_1\) customers who are served by type \(s_1\) servers goes to 0. Therefore, we can obtain the following fluid limit result:
Corollary 2
In Case III,
which is the same as in Case I.
4.6 Comparison to the bipartite FCFS infinite matching model
The infinite matching model was defined and studied in [1, 5, 8] and is as follows: there is a set of customer types \(\mathcal {C}=\{c_1,\ldots ,c_I\}\) and a probability vector \(\mathbf {\alpha }=(\alpha _1,\ldots ,\alpha _I)\), a set of server types \(\mathcal {S}=\{s_1,\ldots ,s_J\}\) and a probability vector \(\mathbf {\beta }=(\beta _1,\ldots ,\beta _J)\), and a bipartite compatibility graph \(\mathcal {G}\subseteq \mathcal {C}\times \mathcal {S}\). There are two infinite sequences \(C^1, C^2, \ldots \) where \(C^m\) are i.i.d. drawn from \(\mathcal {C}\) with probabilities \(\mathbf {\alpha }\), and \(S^1, S^2, \ldots \) where \(S^n\) are i.i.d. drawn from \(\mathcal {S}\) with probabilities \(\mathbf {\beta }\). The two sequences are matched according to the compatibility graph, using FCFS. That is, \(C^1\) is matched to the earliest \(S^n\) in the server sequence that is compatible with it, and thereafter \(C^m\) is matched to the earliest \(S^n\) in the server sequence that is compatible with it, and that was not matched to one of the customers \(C^1,\ldots ,C^{m-1}\). This model is much simpler than a parallel server queueing model, because there are no arrival times, no busy or idle servers (only a sequence of server types), and no processing times, only ordered customer types and ordered server types matched in FCFS order. This model is tractable: under a condition of complete resource pooling the system reaches a steady state, and in particular it is possible to calculate the matching rate \(r_{c_i,s_j}\) for each compatible pair, that is, the frequency of matches between customer type \(c_i\) and server type \(s_j\).
In the special case of the infinite matching model corresponding to the N-system, there is an infinite sequence of customers of types \(c_1,c_2\), where the customer types are i.i.d., the type is \(c_1\) with probability \(\alpha \) and \(c_2\) with probability \(1-\alpha \), and an independent infinite sequence of servers of types \(s_1,s_2\), where the server types are i.i.d., the type is \(s_1\) with probability \(\beta \) and \(s_2\) with probability \(1-\beta \), and the compatibility graph \(\mathcal {G}\) has arcs \(\{(c_1,s_1),(c_1,s_2),(c_2,s_1)\}\). The condition for complete resource pooling is then \(\alpha +\beta >1\), corresponding to Case II in our queueing model. Based on the exact formula in [1], successive customers and servers are matched according to FCFS, with matching rates \(r_{c_1,s_1}=\alpha +\beta -1,\,r_{c_1,s_2}=1-\beta ,\,r_{c_2,s_1}=1-\alpha \).
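The matching rates quoted from [1] can be checked empirically with a short simulation. The following is a minimal sketch, not the method of [1]: the function name `simulate_fcfs_matching`, the type labels, the seed, and the parameter values \(\alpha =0.7\), \(\beta =0.6\) are illustrative assumptions. It draws the two i.i.d. type sequences and matches each customer to the earliest unmatched compatible server.

```python
import random


def simulate_fcfs_matching(customer_probs, server_probs, compatible,
                           num_customers, seed=0):
    """Estimate FCFS matching rates for two i.i.d. type sequences.

    Hypothetical helper for illustration only. `compatible` is a set of
    (customer_type, server_type) pairs.
    """
    rng = random.Random(seed)
    c_types, c_weights = zip(*customer_probs.items())
    s_types, s_weights = zip(*server_probs.items())
    skipped = []   # unmatched servers already scanned, in sequence order
    counts = {}
    for _ in range(num_customers):
        c = rng.choices(c_types, weights=c_weights)[0]
        match = None
        # Earlier servers that previous customers skipped come first.
        for idx, s in enumerate(skipped):
            if (c, s) in compatible:
                match = skipped.pop(idx)
                break
        # Otherwise keep drawing new servers until one is compatible.
        while match is None:
            s = rng.choices(s_types, weights=s_weights)[0]
            if (c, s) in compatible:
                match = s
            else:
                skipped.append(s)
        counts[(c, match)] = counts.get((c, match), 0) + 1
    return {pair: cnt / num_customers for pair, cnt in counts.items()}


# Assumed example parameters with alpha + beta > 1 (complete resource pooling).
alpha, beta = 0.7, 0.6
rates = simulate_fcfs_matching(
    {"c1": alpha, "c2": 1 - alpha},
    {"s1": beta, "s2": 1 - beta},
    {("c1", "s1"), ("c1", "s2"), ("c2", "s1")},
    num_customers=100_000,
)
```

With these assumed parameters the estimates in `rates` should be close to \(\alpha +\beta -1=0.3\), \(1-\beta =0.4\) and \(1-\alpha =0.3\). The sketch keeps every skipped server in a list, so it is only practical when complete resource pooling holds and the list stays short.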
After n customers have arrived and been matched, there may be some unmatched \(s_2\) servers skipped by the customers. We define \(K_n\) to be the number of unmatched \(s_2\) servers before the first unmatched \(s_1\) server after the first n customers have been matched. We can see that \((K_n)_{n=1}^\infty \) is a Markov chain. If \(K_n=0\), that means server \(S^{n+1}\) is of type \(s_1\), and then a new customer \(C^{n+1}\) will be matched to \(S^{n+1}\) and will add a geometrically distributed number with parameter \(\beta \) to \(K_n\). If \(K_n>0\), then a new customer \(C^{n+1}\) of type \(c_1\) will reduce \(K_n\) by 1, and a new customer \(C^{n+1}\) of type \(c_2\) will add a geometrically distributed number with parameter \(\beta \) to \(K_n\). The steady-state distribution for this Markov chain is \(P(K_\infty = k) = \left( 1-\frac{1-\beta }{\alpha }\right) \left( \frac{1-\beta }{\alpha }\right) ^{k},\,k\ge 0\), which is exactly the limiting distribution of K in (6). This supports our intuition that when the large N-system is underloaded with resource pooling in Case II, the replenishment of idle servers of types \(s_1\) and \(s_2\) becomes i.i.d. with probabilities \(\beta \) and \(1-\beta \), respectively.
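The claimed steady-state distribution can be verified directly against the balance equations of the chain. The sketch below assumes the geometric increment counts the \(s_2\) servers before the next \(s_1\) server, so that it takes the value g with probability \(\beta (1-\beta )^g\) for \(g\ge 0\); the function names and the test values \(\alpha =0.7\), \(\beta =0.6\) are illustrative assumptions.

```python
def stationary_pmf(alpha, beta, k):
    """Claimed geometric steady state with ratio q = (1 - beta) / alpha."""
    q = (1.0 - beta) / alpha
    return (1.0 - q) * q ** k


def balance_residual(alpha, beta, j):
    """Probability flow into state j minus the stationary mass at j.

    Transitions (as read from the text, with the support convention above):
      from 0:     jump to g        with probability beta * (1 - beta)**g
      from k > 0: move to k - 1    with probability alpha
                  jump to k + g    with probability (1 - alpha) * beta * (1 - beta)**g
    """
    b = 1.0 - beta
    inflow = stationary_pmf(alpha, beta, 0) * beta * b ** j
    inflow += stationary_pmf(alpha, beta, j + 1) * alpha
    inflow += sum(
        stationary_pmf(alpha, beta, k) * (1.0 - alpha) * beta * b ** (j - k)
        for k in range(1, j + 1)
    )
    return inflow - stationary_pmf(alpha, beta, j)


# Largest violation of the balance equations over the first 40 states.
max_residual = max(abs(balance_residual(0.7, 0.6, j)) for j in range(40))
```

Because state j can only be entered from states \(0,\ldots ,j+1\), the check needs no truncation of the state space; `max_residual` should vanish up to floating-point rounding.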
In the infinite matching model, if complete resource pooling fails then there is a subset of customer types whose frequency is larger than or equal to the frequency of all the compatible server types. In that case the infinite matching model will not reach steady state. However, in such cases there will be a unique decomposition of the model, so that each component on its own is an infinite matching model with complete resource pooling. In the case of the N-model this will happen when \(\alpha + \beta \le 1\), and then the model will decouple into two subsystems, one consisting of customers and servers of types \(c_1,s_2\), and the other of customers and servers of types \(c_2,s_1\). This is exactly the same decomposition that we observe in Cases I and III.
5 Numerical examples
We test our results by investigating an N-system with \(\lambda =100\), \(n_1=n_2 = 100\), \(\mu _1 = \mu _2 = 1\), \(\rho =0.5\). In this example \(\beta =0.5\). We calculate the expectation and variance of the number of idle servers in each pool exactly from the stationary distribution; the values are listed in Table 1. When \(\alpha >0.5\) (Case II), the average number of idle servers in each pool is close to 50, with variance close to \(\theta \rho (1-\rho +\theta \rho )n=(1-\theta )\rho (1-\rho +(1-\theta )\rho )n=37.5\). When \(\alpha <0.5\) (Case I), resource pooling disappears, and \(s_1\) servers seldom serve \(c_1\) customers. The N-system then operates like two separate queues: \(s_1\) servers serve \(c_2\) customers, and \(s_2\) servers serve \(c_1\) customers. The utilization of the \(s_1\) server pool is \(\frac{(1-\alpha )\lambda }{n_1}\), and the utilization of the \(s_2\) server pool is \(\frac{\alpha \lambda }{n_2}\). For example, when \(\alpha =0.4\), almost none of the services performed by \(s_1\) servers are for \(c_1\) customers; the number of idle \(s_1\) servers can be approximated by a normal distribution with mean \(n_1-(1-\alpha )\lambda =40\) and variance \((1-\alpha )\lambda =60\), whereas the number of idle \(s_2\) servers can be approximated by a normal distribution with mean \(n_2-\alpha \lambda =60\) and variance \(\alpha \lambda =40\). When \(\alpha =0.5\) (Case III), the means are somewhat close to the fluid prediction of 50, but we do not have an analytic approximation for the variances (Table 1).
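The approximations quoted for this example can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: the function names are hypothetical, and dividing by \(\mu _1,\mu _2\) in the decoupled case is an assumed generalization that reduces to the formulas in the text when \(\mu _1=\mu _2=1\). The exact values of Table 1 are not reproduced here.

```python
def pooled_idle_variance(theta, rho, n):
    """Case II variance expression quoted in the text for a pool that
    holds a fraction theta of the n servers."""
    return theta * rho * (1.0 - rho + theta * rho) * n


def decoupled_idle_moments(alpha, lam, n1, n2, mu1, mu2):
    """Case I normal approximation: each pool behaves like a separate queue.

    Returns ((mean, variance) for idle s1 servers,
             (mean, variance) for idle s2 servers).
    """
    load_s1 = (1.0 - alpha) * lam / mu1   # s1 servers serve c2 customers
    load_s2 = alpha * lam / mu2           # s2 servers serve c1 customers
    return (n1 - load_s1, load_s1), (n2 - load_s2, load_s2)


# Parameters of the example in the text.
lam, n1, n2, mu1, mu2, rho = 100.0, 100, 100, 1.0, 1.0, 0.5
n = n1 + n2
theta = n1 / n

var_pool_1 = pooled_idle_variance(theta, rho, n)
var_pool_2 = pooled_idle_variance(1.0 - theta, rho, n)
(s1_mean, s1_var), (s2_mean, s2_var) = decoupled_idle_moments(
    0.4, lam, n1, n2, mu1, mu2
)
```

Both pooled variances come out at 37.5, and the decoupled case with \(\alpha =0.4\) gives means near 40 and 60 with variances near 60 and 40, matching the figures stated above.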
References
Adan, I.J.B.F., Weiss, G.: Exact FCFS matching rates for two infinite multi-type sequences. Oper. Res. 60(2), 475–489 (2012)
Adan, I.J.B.F., Weiss, G.: A queue with skill based service under FCFS–ALIS: steady state, overloaded system, and behavior under abandonments. Stoch. Syst. 4(1), 250–299 (2014)
Adan, I., Foley, R., McDonald, D.: Exact asymptotics of the stationary distribution of a Markov chain: a production model. Queueing Syst. 62(4), 311–344 (2009)
Adan, I., Boon, M., Weiss, G.: A design heuristic for skill based parallel service systems. arXiv preprint arXiv:1603.01404 (2016)
Adan, I., Busic, A., Mairesse, J., Weiss, G.: Reversibility and further properties of FCFS infinite bipartite matching. arXiv preprint arXiv:1507.05939 (2015)
Armony, M., Ward, A.R.: Fair dynamic routing in large-scale heterogeneous-server systems. Oper. Res. 58(3), 624–637 (2010)
Bell, S.L., Williams, R.J.: Dynamic scheduling of a system with two parallel servers in heavy traffic with resource pooling: asymptotic optimality of a threshold policy. Ann. Appl. Probab. 11(3), 608–649 (2001)
Caldentey, R., Kaplan, E.H., Weiss, G.: FCFS infinite bipartite matching of servers and customers. Adv. Appl. Probab. 41(3), 695–730 (2009)
Foss, S., Chernova, N.: On the stability of a partially accessible multi-station queue with state-dependent routing. Queueing Syst. 29(1), 55–73 (1998)
Ghamami, S., Ward, A.R.: Dynamic scheduling of a two-server parallel server system with complete resource pooling and reneging in heavy traffic: asymptotic optimality of a two-threshold policy. Math. Oper. Res. 38(4), 761–824 (2013)
Green, L.: A queueing system with general-use and limited-use servers. Oper. Res. 33(1), 162–182 (1985)
Gurvich, I., Whitt, W.: Queue-and-idleness-ratio controls in many-server service systems. Math. Oper. Res. 34(2), 363–396 (2009)
Gurvich, I., Whitt, W.: Service-level differentiation in many-server service system via queue-ratio routing. Oper. Res. 58(2), 316–328 (2010)
Harchol-Balter, M., Crovella, M.E., Murta, C.D.: On choosing a task assignment policy for a distributed server system. J. Parallel Distrib. Comput. 59(2), 204–228 (1999)
Harrison, J.M., Lopez, M.J.: Heavy traffic resource pooling in parallel-server systems. Queueing Syst. 33(4), 339–368 (1999)
Nov, Y., Weiss, G., Zhang, H.: Fluid models of parallel service systems under FCFS. arXiv preprint arXiv:1604.04497 (2016)
Rubino, M., Ata, B.: Dynamic control of a make-to-order, parallel-server system with cancellations. Oper. Res. 57(1), 94–108 (2009)
Shanthikumar, J.G., Yao, D.D.: Comparing ordered-entry queues with heterogeneous servers. Queueing Syst. 2(3), 235–244 (1987)
Tezcan, T., Dai, J.G.: Dynamic control of N-systems with many servers: asymptotic optimality of a static priority policy in heavy traffic. Oper. Res. 58(1), 94–110 (2010)
Tezcan, T.: Stability analysis of N-model systems under a static priority rule. Queueing Syst. 73(3), 235–259 (2013)
Visschers, J., Adan, I.J.B.F., Weiss, G.: A product form solution to a system with multi-type customers and multi-type servers. Queueing Syst. 70(3), 269–298 (2012)
Ward, A.R., Armony, M.: Blind fair routing in large-scale service systems with heterogeneous customers and servers. Oper. Res. 61(1), 228–243 (2013)
Williams, R.J.: On dynamic scheduling of a parallel server system with complete resource pooling. Fields Inst. Commun. 28, 49–71 (2000)
Acknowledgements
We are grateful to Ivo Adan for helpful discussion of this paper. We thank the anonymous reviewer and the associate editor for their constructive comments, which helped us improve the manuscript. The review team noticed that analyzing only Case II left major gaps in the original version, which resulted in the addition of the analysis of Cases I and III.
Additional information
Research supported in part by Israel Science Foundation Grants 711/09 and 286/13.
Appendices
A Appendix: Proofs for Sect. 4.1
Proof of Lemma 1
We prove this lemma by induction. Define the left-hand side as \(C_m\).
\(\square \)
Proof of Theorem 2
Summation over the geometric terms \(q_j=0,\ldots ,\infty \) in (5) gives
Next we see that in this expression, permutations of \(S_1,\ldots ,S_n\) with the same \((k,i_1,i_2)\) have a similar structure. We now sum over all the permutations of the appropriate \(S_j,\,1\le j \le n-\max \{k+1,i_1+i_2\}\). By Lemma 1 we obtain
Each permutation of the remaining servers, \(S_j,\, n-\max \{k+1,i_1+i_2\} < j \le n\) has the same stationary probability. It remains to count the number of permutations. When \(i_1=0\) we have \(i_2 \le k\). For each permutation we choose 1 type \(s_1\) server and k out of \(n_2\) type \(s_2\) servers to form the last \(k+1\) servers. The number of permutations is
When \(i_1>0\), we have \(i_2\ge k\). For each permutation, we choose \(i_1\) out of \(n_1\) type \(s_1\) servers and \(i_2\) out of \(n_2\) type \(s_2\) servers. We then choose 1 from the \(i_1\) idle servers of type \(s_1\), and k from the \(i_2\) idle servers of type \(s_2\) to obtain the last \(k+1\) servers. The number of permutations is
Multiplying the terms in (11) by the appropriate number of permutations and defining \(B_1=B\mu _1^{-n_1}\mu _2^{-n_2}\) gives (6). \(\square \)
B Appendix: Proofs for Sect. 4.2
Proof of Theorem 3
We prove the theorem in three steps:
(i) We show that
$$\begin{aligned} P\left( I_1=0\right) \sim B_1 \frac{1}{1-\delta } \times \left\{ \begin{array}{ll} \sqrt{2\pi n_2}\exp \left( n_2\left( -\log \,\kappa +\kappa -1\right) \right) , &{} \quad 0< \kappa < 1, \\ \sqrt{2\pi n_2}/2, &{} \quad \kappa =1 , \\ \frac{1-(1-\alpha )\rho }{1-\rho }+\frac{1}{\kappa -1}, &{} \quad \kappa > 1, \end{array} \right. \end{aligned}$$where \(\kappa =\frac{\lambda _1}{\mu _2n_2}\). Note that \(-\log \,\kappa +\kappa -1\ge 0\).
(ii) We show that
$$\begin{aligned}&P(I_1=\lceil m_1\rceil , I_2=\lceil m_2\rceil ,K=0) \\&\quad \sim B_1\left( \frac{2\pi \beta n_1n_2}{(n_1-m_1)(n_2-m_2)m_2}\right) ^{1/2} \exp \left[ -n_1 \left( \log \left( 1-\frac{m_1}{n_1}\right) +\frac{m_1}{n_1} \right) \right] \\&\qquad \exp \left[ -n_2 \left( \log \left( 1-\frac{m_2}{n_2}\right) +\frac{m_2}{n_2} \right) \right] , \end{aligned}$$where \(\sim \) means the ratio of the two sides converges to 1 when \(n\rightarrow \infty \), \(m_1\) and \(m_2\) are defined in (4). Note that the definition in (4) does not require a specific case. And for all cases, we have \(\frac{m_1}{m_2}=\frac{\beta }{1-\beta }\).
(iii) We show that, as \(n\rightarrow \infty \),
$$\begin{aligned} \frac{P\left( I_1=0\right) }{P(I_1=\lceil m_1\rceil , I_2=\lceil m_2\rceil ,K=0)} = o(\exp (-\epsilon n)), \end{aligned}$$for some \(\epsilon >0\), which proves the theorem.
The details of the proofs of these three steps are as follows:
Proof of (i):
First we calculate
We use induction to calculate
from \(m=n_2\) to \(m=1\). When \(m=n_2\),
Suppose
then
Therefore, the induction is valid and we have
Next we calculate
Similar to the induction calculating \(U_m\) above, we can obtain
Therefore,
where X is a Poisson random variable with parameter \(\frac{\lambda _1}{\mu _2}\). Using Stirling’s approximation,
Recall that \(\kappa =\frac{\lambda _1}{\mu _2n_2}\) and note that \(\log \kappa +1-\kappa \le 0\). Note also that, when \(n\rightarrow \infty \), X can be approximated by a normal distribution with mean \(\frac{\lambda _1}{\mu _2}\) and variance \(\frac{\lambda _1}{\mu _2}\). Next we analyze \(\frac{P(X<n_2)}{P(X=n_2)}\) in three cases depending on \(\kappa \).
- When \(0<\kappa <1\), the normal approximation gives \(P(X<n_2)\rightarrow 1\) as \(n\rightarrow \infty \). Therefore,
$$\begin{aligned} P\left( I_1=0,\,I_2>0\right) \sim B_1 \frac{1}{1-\delta } \left( \sqrt{2\pi n_2}\exp \left( -n_2\left( \log \kappa +1-\kappa \right) \right) \right) . \end{aligned}$$
- When \(\kappa =1\), \(-\log \kappa +\kappa -1=0\), and the normal approximation gives \(P(X<n_2)\rightarrow \frac{1}{2}\) as \(n\rightarrow \infty \). Therefore,
$$\begin{aligned} P\left( I_1=0,\,I_2>0\right) \sim B_1 \frac{1}{1-\delta }\frac{1}{2}\sqrt{2\pi n_2}. \end{aligned}$$
- When \(\kappa > 1\), the normal approximation gives \(P(X<n_2)\rightarrow 0\) as \(n\rightarrow \infty \), so this case needs more care. For any \(1\le j\le n_2\),
$$\begin{aligned} \frac{P(X=n_2-j)}{P(X=n_2)} = \frac{\left( \frac{\lambda _1}{\mu _2}\right) ^{n_2-j}\frac{1}{(n_2-j)!}}{\left( \frac{\lambda _1}{\mu _2}\right) ^{n_2}\frac{1}{n_2!}} =\frac{n_2!}{\kappa ^j n_2^j (n_2-j)!} < \frac{1}{\kappa ^j}. \end{aligned}$$Therefore,
$$\begin{aligned} \frac{P(X<n_2)}{P(X=n_2)}\le \sum _{j=1}^{n_2}\frac{1}{\kappa ^j}<\frac{1}{\kappa -1}. \end{aligned}$$In fact, for any fixed j, when \(n\rightarrow \infty \),
$$\begin{aligned} \frac{P(X=n_2-j)}{P(X=n_2)} \rightarrow \frac{1}{\kappa ^j}. \end{aligned}$$For any \(\epsilon >0\), let \(J=\lceil \frac{-\log \epsilon }{\log \kappa }\rceil \). We have \(\epsilon \ge \kappa ^{-J}\). There exists an N such that, when \(n>N\), for any \(1\le j\le J\),
$$\begin{aligned} \frac{P(X=n_2-j)}{P(X=n_2)} - \frac{1}{\kappa ^j}>-\frac{\epsilon }{J}. \end{aligned}$$Therefore,
$$\begin{aligned} \frac{P(X<n_2)}{P(X=n_2)} > \sum _{j=1}^{J}\frac{1}{\kappa ^j}-\epsilon = \frac{1-\kappa ^{-J}}{\kappa -1}-\epsilon \ge \frac{1}{\kappa -1}-\frac{\kappa \epsilon }{\kappa -1}. \end{aligned}$$Therefore, when \(n\rightarrow \infty \),
$$\begin{aligned} \frac{P(X<n_2)}{P(X=n_2)} \rightarrow \frac{1}{\kappa -1}. \end{aligned}$$We have
$$\begin{aligned} P\left( I_1=0,\, I_2>0\right) \sim B_1 \frac{1}{1-\delta }\frac{1}{\kappa -1}. \end{aligned}$$
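The two nontrivial limits of \(\frac{P(X<n_2)}{P(X=n_2)}\) used above can be checked numerically. The sketch below is illustrative only: the function name and the choice \(n_2=2000\) are assumptions. It sums the ratios \(\frac{P(X=n_2-j)}{P(X=n_2)}=\prod _{i=0}^{j-1}\frac{n_2-i}{\kappa n_2}\) term by term, which avoids evaluating the Poisson probabilities themselves.

```python
import math


def poisson_tail_ratio(kappa, n2):
    """Return P(X < n2) / P(X = n2) for X ~ Poisson(kappa * n2)."""
    ratio, term = 0.0, 1.0
    for j in range(1, n2 + 1):
        # term becomes P(X = n2 - j) / P(X = n2)
        term *= (n2 - (j - 1)) / (kappa * n2)
        ratio += term
    return ratio


n2 = 2000  # assumed pool size for the check

# kappa > 1: the ratio should approach 1 / (kappa - 1) from below.
ratio_overloaded = poisson_tail_ratio(2.0, n2)
limit_overloaded = 1.0 / (2.0 - 1.0)

# kappa = 1: P(X < n2) -> 1/2 and 1 / P(X = n2) ~ sqrt(2 * pi * n2).
ratio_critical = poisson_tail_ratio(1.0, n2)
approx_critical = math.sqrt(2.0 * math.pi * n2) / 2.0
```

For \(\kappa =2\) the computed ratio stays below the bound \(\frac{1}{\kappa -1}\) and is close to it; for \(\kappa =1\) it agrees with \(\frac{1}{2}\sqrt{2\pi n_2}\) to within about one percent at this value of \(n_2\).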
In summary, when \(\kappa \le 1\), \(P\left( I_1=0,\,I_2=0\right) \) is negligible compared with \(P\left( I_1=0,\,I_2>0\right) \) when \(n\rightarrow \infty \). We have
Proof of (ii):
From Eq. (6) we have
The second equality is due to \(\frac{m_1}{m_1+m_2}=\beta \), \(\frac{m_2}{m_1+m_2}=1-\beta \), \(\frac{\mu _1(n_1-m_1)}{\lambda }=\beta \), \(\frac{\mu _2(n_2-m_2)}{\lambda }=1-\beta \).
Proof of (iii):
Since \(\log (1-x)+x < 0\) when \(0<x<1\), we have
As \(n\rightarrow \infty \), the factor \(\left( \frac{2\pi \beta n_1n_2}{(n_1-m_1)(n_2-m_2)m_2}\right) ^{1/2}\) is of order \(n^{-1/2}\). Therefore, \(P(I_1=\lceil m_1\rceil , I_2=\lceil m_2\rceil ,K=0)/B_1\) grows exponentially in n. When \(\kappa >1\), \(P(I_1=0)/B_1\) converges to a constant; when \(\kappa =1\), \(P(I_1=0)/B_1\) grows on the order of \(\sqrt{n}\). Therefore, when \(n\rightarrow \infty \) and \(\kappa \ge 1\),
for some \(\epsilon >0\). When \(\kappa <1\),
We have that
which is nonpositive no matter whether \(\alpha +\beta \) is larger than, equal to, or smaller than 1. Therefore, when \(n\rightarrow \infty \),
for some \(\epsilon >0\). This completes the proof. \(\square \)
Proof of Theorem 4
First we show that the weak convergence is valid given \(K=0\). Then we show that the same holds when \(K=k\), for any fixed k. When \(K=0\), we prove the convergence in probability in two steps:
(i) We show that for all states with \(|I_1-m_1|\ge \epsilon n\) or \(|I_2-m_2|\ge \epsilon n\), the conditional probability is dominated by a bounded constant multiple of the conditional probability of some point on the boundary of the rectangle \(|I_1-m_1| \le \epsilon n\,\times \,|I_2-m_2|\le \epsilon n\).
(ii) As \(n\rightarrow \infty \), we approximate the conditional probability of the points in the rectangle \(|I_1-m_1| \le \epsilon n\,\times \,|I_2-m_2|\le \epsilon n\). We then show that the probability of points on the boundary is negligible compared with the conditional probability at \((\lceil m_1\rceil ,\lceil m_2\rceil )\).
Proof of (i):
where \(B_2=B_1/P(K=0)\).
We look at several cases:
- when \(i_1\le m_1\) and \((1-\beta )i_1<\beta i_2\), we have \(\frac{i_1+i_2}{i_1}>\frac{1}{\beta }\). Therefore, \(\frac{P(I_1=i_1+1, I_2=i_2|K=0)}{P(I_1=i_1,I_2=i_2|K=0)}>1\);
- when \(i_2\le m_2\) and \((1-\beta )i_1>\beta i_2+1\), we have \(\frac{i_1+i_2}{i_2+1} > \frac{1}{1-\beta }\). Therefore, \(\frac{P(I_1=i_1, I_2=i_2+1|K=0)}{P(I_1=i_1,I_2=i_2|K=0)}>1\);
- when \(i_1> m_1\), \(i_2>m_2\) and \((1-\beta )i_1\ge \beta i_2\), we have \(\frac{i_1+i_2}{i_1}\le \frac{1}{\beta }\). Therefore, \(\frac{P(I_1=i_1+1, I_2=i_2|K=0)}{P(I_1=i_1, I_2=i_2|K=0)}<1\);
- when \(i_1> m_1\), \(i_2>m_2\) and \((1-\beta )i_1\le \beta i_2+1\), we have \(\frac{i_1+i_2}{i_2+1}\le \frac{1}{1-\beta }\). Therefore, \(\frac{P(I_1=i_1, I_2=i_2+1|K=0)}{P(I_1=i_1, I_2=i_2|K=0)}<1\);
- when \(\beta i_2\le (1-\beta )i_1\le \beta i_2+1\), \(i_1\le m_1-\epsilon n\) and \(i_2 \le m_2-\epsilon n\), as long as \(\frac{n_2-i_2}{n_2-m_2}\frac{i_2}{i_2+1}>1\), we have \(\frac{P(I_1=i_1, I_2=i_2+1|K=0)}{P(I_1=i_1, I_2=i_2|K=0)}>1\). When n is large, this requires
$$\begin{aligned} i_2> i_2^*=\frac{1-\theta -f_2}{f_2}. \end{aligned}$$As long as \(\frac{n_1-i_1}{n_1-m_1}\frac{i_1-1}{i_1}>1\), we have \(\frac{P(I_1=i_1+1, I_2=i_2|K=0)}{P(I_1=i_1, I_2=i_2|K=0)}>1\). When n is large, this requires
$$\begin{aligned} i_1> i_1^*=\frac{\theta }{f_1}. \end{aligned}$$
For all \(i_1>i_1^*\) or \(i_2>i_2^*\), we can move the state to a neighbor state with larger steady-state probability, as shown in Fig. 5.
Eventually the movement stops at the boundary which is \(\epsilon n\) away from \((m_1,m_2)\). Therefore, the probability of any state \((i_1,i_2)\) satisfying \(i_1>i_1^*\) or \(i_2>i_2^*\) would be dominated by the probability of some point at the boundary.
For any \((i_1,i_2)\) satisfying \(i_1\le i_1^*\) and \(i_2\le i_2^*\), since
we have
and \(P(I_1=i_1^*+1, I_2=i_2^*+1|K=0)\) is dominated by the probability of some point at the boundary.
Proof of (ii):
When \(i_1\in \left[ m_1-\epsilon n,m_1+\epsilon n\right] \) and \(i_2\in \left[ m_2-\epsilon n,m_2+\epsilon n\right] \), and n grows large, we can use Stirling’s approximation.
where \(B_3=B_2n_1!n_2!(2\pi )^{-\frac{3}{2}}\hbox {e}^{n}\), \(x_1=\frac{i_1}{n}\), \(x_2=\frac{i_2}{n}\). We have \(x_1\in [f_1-\epsilon ,f_1+\epsilon ]\) and \(x_2\in [f_2-\epsilon ,f_2+\epsilon ]\). We define
The first-order derivatives on \(x_1\) and \(x_2\) are
Solving the first-order conditions gives
Consider the second-order derivatives:
The Hessian matrix is negative definite. Therefore, \(F(x_1,x_2)\) is strictly concave on \((0,\theta )\times (0,1-\theta )\) and reaches its unique global maximum at \((f_1,f_2)\). The maximum of \(F(x_1,x_2)\) on \([\delta ,\theta -\delta ]\times [\delta , 1-\theta -\delta ]\backslash (f_1-\epsilon ,f_1+\epsilon )\times (f_2-\epsilon ,f_2+\epsilon )\) is on the boundary \(\{(x_1,x_2)\vert |x_1-f_1|=\epsilon ,|x_2-f_2|=\epsilon \}\). Since the boundary is a compact set, the maximum is attainable, denoted by \(F(f_1,f_2)-\eta \), where \(\eta >0\).
Note that
changes slowly when \(x_1\) and \(x_2\) change, compared with \(\exp \left( nF(x_1,x_2)\right) \). We have
Therefore,
It converges to 0 when \(n\rightarrow \infty \).
When \(K=k>0\), and \(n\rightarrow \infty \), similarly,
We can use a similar two-step argument to show that \((I_1/n,\,I_2/n)\) converges to \((f_1,\,f_2)\) in probability given \(K=k\). \(\square \)
Proof of Theorem 5
To obtain the asymptotic distribution of \(I_1,I_2\) as \(n\rightarrow \infty \), we need to consider, by Theorem 4, only values \(i_1,i_2\) for which \((i_1 - m_1)/n \rightarrow 0\) and \((i_2-m_2)/n\rightarrow 0\). We write \(i_1=m_1+z_1\sqrt{n},\,i_2=m_2+z_2\sqrt{n}\), with \(z_1/ \sqrt{n} \rightarrow 0\), \(z_2/ \sqrt{n} \rightarrow 0\). Note that \(m_1,\,m_2,\,n_1-m_1,\,n_2-m_2\) are of the same order of magnitude as \(n,n_1,n_2\), and we only consider \(i_1,i_2\) of the same order of magnitude.
where the use of Stirling’s approximation is justified for large n. Here \(B_2=B_1/P(K=0)\) and \(B_3=B_2n_1!n_2!(2\pi )^{-\frac{3}{2}}\hbox {e}^{n}\).
We clearly have
so we can treat that part as a constant. Consider
Then from the Taylor expansion of the logarithm function, we have
Therefore,
Similar expansions are valid for \(i_2,\,n_1-i_1,\,n_2-i_2\) and \(i_1+i_2\):
We now use the calculations in Sect. 3 to evaluate all the \(\sqrt{n}\) coefficients. By (4) we have
Therefore, we have
where \(B_4= \log \left( B_3 \left( \frac{m_1}{(m_1+m_2) m_2 (n_1-m_1)(n_2-m_2)}\right) ^{1/2}\right) -n_1\log (n_1-m_1)-n_2\log (n_2-m_2)-m_1-m_2\). Define
We have
Therefore, \((\frac{I_1-m_1}{\sqrt{n}},\frac{I_2-m_2}{\sqrt{n}})\) given \(K=0\) converges in distribution as \(n\rightarrow \infty \) to the bivariate normal distribution as stated in (10).
When \(K=k>0\), and \(n\rightarrow \infty \), similarly,
We again write \(i_1=m_1+z_1\sqrt{n},\,i_2=m_2+z_2\sqrt{n}\), with \(z_1/ \sqrt{n} \rightarrow 0\), \(z_2/ \sqrt{n} \rightarrow 0\). We then have
We can now use the same approximation as for \(k=0\) to show that \(\left( \frac{I_1-m_1}{\sqrt{n}},\,\frac{I_2-m_2}{\sqrt{n}}\right) \) converges to the same bivariate normal distribution. \(\square \)
C Appendix: Proofs for Sect. 4.3
Proof of Theorem 6
From (4),
Take a fixed arbitrary \(\epsilon \in \left( 0,\min \{f_1,f_2\}\right) \). Fix \(k>0\). For any \(i_1,i_2\) satisfying \(|i_1/n-f_1|<\epsilon \), \(|i_2/n-f_2|<\epsilon \) and \(i_1\ge 1\), from (6), noting \(\frac{a+c}{b+c}\ge \frac{a}{b}\) for any \(0<a\le b\) and \(c>0\), we have
Therefore,
For fixed \(k_0>0\),
Note the above inequality is valid for any \(i_1,i_2\) satisfying \(|i_1/n-f_1|<\epsilon \), \(|i_2/n-f_2|<\epsilon \). We have
From Theorem 4, there exists an \(N_1\) such that, when \(n>N_1\),
Then we have,
This upper bound can be made arbitrarily close to 0 by a suitable choice of \(\epsilon \), \(n>N_1\), and \(k_0\). Therefore, we have shown the tightness of K; that is,
Using
for fixed \(k>0\), when \(n>N_1\), the ratio \(\frac{P(K=k)}{P(K=k-1)}\) is lower bounded by
and upper bounded by
For any \(i_1,i_2\) satisfying \(|i_1/n-f_1|<\epsilon \), \(|i_2/n-f_2|<\epsilon \) and \(i_1\ge 1\), in addition to (12), we have the lower bound
Now we have
Therefore,
that is,
For fixed k, as \(n\rightarrow \infty \), the lower bound and the upper bound in (14) both converge to \(\frac{1-\beta }{\alpha }\). Noting that \(\epsilon \) can be arbitrarily close to 0, we have
This, together with the tightness (13), proves (8). \(\square \)
Proof of Theorem 7
When \(\frac{n_1}{\lambda _2} - \frac{1}{\mu _1}>\frac{n_2}{\lambda _1} - \frac{1}{\mu _2}\), the unscaled K converges to a geometric distribution. As we saw in Theorems 4 and 5, as \(n\rightarrow \infty \), the distribution of the scaled deviations of \(I_1,I_2\) conditional on the value of \(K=k\) converges to a normal distribution, with mean and variance that do not depend on k. We can now use the law of total probability and find \(N_0\) large enough so that the unconditional probability distribution of the scaled \(I_1,I_2\) is close to the specified normal distribution when \(n>N_0\). One more step then shows that, as \(n\rightarrow \infty \), the conditional distribution given K is the same, so we have the asymptotic independence. \(\square \)
D Appendix: Proofs for Sect. 4.4
Proof of Proposition 1
Let \(A_1(t)\) be the arrival stream of customers that are served eventually by servers of type \(s_1\), and let \(I_1(t)\) be, as defined above, the number of idle servers of type \(s_1\). We now compare this to an M/M/\(n_1\) system, with type \(s_1\) servers, whose processing times are exponential with rate \(\mu _1\), and with arrival stream \(\tilde{A}_1(t)\) which consists of all the arrivals of the stream \(A_1(t)\) which are customers of type \(c_2\), but excludes arrivals of type \(c_1\). Clearly, \(A_1(t)\ge \tilde{A}_1(t)\) a.s. Denote by \(\tilde{I}_1(t)\) the number of idle servers in the M/M/\(n_1\) system at time t. It then follows directly from Theorem 1 of Shanthikumar and Yao [18] that the stationary distributions of \(I_1\) and \(\tilde{I}_1\) satisfy \(\tilde{I}_1 \ge _{ST} I_1\).
Define similarly an M/M/\(n_2\) system with type \(s_2\) servers, whose processing times are exponential with rate \(\mu _2\) and arrivals \(\tilde{A}_2(t)\) of all the customers of type \(c_1\). Then \(A_2(t)\le \tilde{A}_2(t)\) a.s. and, by the same argument, \(\tilde{I}_2 \le _{ST} I_2\).
As n becomes large, the numbers of idle servers in the two independent M/M/N systems \((\tilde{I}_1(\infty ),\tilde{I}_2(\infty ))\) can be approximated by normal distributions with means
and standard deviations \(\sqrt{\frac{\lambda _2}{\mu _1}}\) and \(\sqrt{ \frac{\lambda _1}{\mu _2}}\), respectively. Since, in Case I,
we have
while the standard deviations are \(O(\sqrt{n})\). Define the middle point \(M=\left( (1-\alpha )\left( n_2 - \frac{\lambda _1}{\mu _2}\right) +\alpha \left( n_1 - \frac{\lambda _2}{\mu _1} \right) \right) \big /2\). As \(n\rightarrow \infty \), we have \(P(\alpha \tilde{I}_1\ge M)=o\left( \frac{1}{\sqrt{n}}\right) \) and \(P((1-\alpha )\tilde{I}_2\le M)=o\left( \frac{1}{\sqrt{n}}\right) \). Therefore,
\(\square \)
Proof of Theorem 8
Given \(i_1\in (0,\theta ), i_2\in (0,1-\theta )\), and \(i_2>\frac{\alpha }{1-\alpha }i_1\), for \(0\le k<i_2\),
where \(B_2=B_1/P(I_1=i_1 n, I_2=i_2 n)\), \(B_3=B_2{n_1\atopwithdelims ()i_1 n}{n_2\atopwithdelims ()i_2 n}i_1 (i_2 n)!\left( \frac{\mu _1}{\lambda }\right) ^{i_1n}\left( \frac{\mu _2}{\lambda }\right) ^{i_2n}\exp (-i_1n)\). Choose k to maximize
The first-order condition is
Therefore, the optimal value is
Given \(K = \bar{k} n + x\sqrt{n}\),
Therefore,
Therefore, \(\left. \frac{K-\bar{k}n}{\sqrt{n}}\right| I_1=i_1 n, I_2=i_2 n\) is asymptotically normally distributed with mean 0 and variance
\(\square \)
Proof of Theorem 9
From Theorem 8, we know that, as \(n\rightarrow \infty \), the percentage of type \(c_1\) customers served by type \(s_1\) servers goes to 0 faster than \(O(\frac{1}{\sqrt{n}})\). Therefore, as \(n\rightarrow \infty \), the two server pools are decoupled in the sense that type \(s_1\) servers serving type \(c_1\) customers do not affect the fluid and diffusion limits of the decoupled systems. From the proof of Proposition 1, we know \(\big (I_1-(n_1-\frac{\lambda _2}{\mu _1})\big )/\sqrt{n}\) converges to a normal distribution with mean 0 and variance \(\frac{\lambda _2}{n\mu _1}\); independently, \(\big (I_2-(n_2-\frac{\lambda _1}{\mu _2})\big )/\sqrt{n}\) converges to a normal distribution with mean 0 and variance \(\frac{\lambda _1}{n\mu _2}\). \(\square \)
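The normal approximation for a decoupled pool can be compared with the exact stationary distribution of an M/M/n queue. The following is a sketch under assumptions: the function name is hypothetical, and the parameter values are those of the \(\alpha =0.4\) example in Sect. 5 (\(\lambda =100\), \(n_1=n_2=100\), \(\mu _1=\mu _2=1\)), treating each pool as a stand-alone queue as in the proof of Proposition 1.

```python
import math


def mmn_idle_moments(arrival_rate, service_rate, num_servers):
    """Exact stationary mean and variance of the number of idle servers
    in a stable M/M/n queue with an infinite waiting room."""
    a = arrival_rate / service_rate        # offered load
    util = a / num_servers
    if not 0.0 < util < 1.0:
        raise ValueError("need 0 < arrival_rate / (n * service_rate) < 1")
    # Unnormalized weights a**j / j! for j = 0..n customers in the system.
    weights = [math.exp(j * math.log(a) - math.lgamma(j + 1))
               for j in range(num_servers + 1)]
    # States with more than n customers have no idle servers.
    tail = weights[-1] * util / (1.0 - util)
    total = sum(weights) + tail
    mean = sum((num_servers - j) * w for j, w in enumerate(weights)) / total
    second = sum((num_servers - j) ** 2 * w
                 for j, w in enumerate(weights)) / total
    return mean, second - mean ** 2


# Pool s1 serves c2 customers at rate (1 - 0.4) * 100 = 60.
mean_s1, var_s1 = mmn_idle_moments(60.0, 1.0, 100)
# Pool s2 serves c1 customers at rate 0.4 * 100 = 40.
mean_s2, var_s2 = mmn_idle_moments(40.0, 1.0, 100)
```

The exact means equal \(n_1-\frac{\lambda _2}{\mu _1}=40\) and \(n_2-\frac{\lambda _1}{\mu _2}=60\), and the variances are very close to \(\frac{\lambda _2}{\mu _1}=60\) and \(\frac{\lambda _1}{\mu _2}=40\), because at these loads the probability that all servers in a pool are busy is tiny.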
E Appendix: Proofs for Sect. 4.5
Proof of Proposition 2
We want to show \(\frac{P(K(\alpha _2)=k)}{P(K(\alpha _1)=k)}\) is decreasing in k. Note that
From Theorem 2, given k, for any \(i_1\in \{1,\ldots ,n_1\}, i_2\in \{k,\ldots , n_2\}\),
which is decreasing in k; for any \(i_2\in \{1,\ldots , k\}\),
which is decreasing in k;
which is decreasing in k. Therefore, \(\frac{P(K(\alpha _2)=k)}{P(K(\alpha _1)=k)}\) is decreasing in k, that is,
meaning \(K(\alpha _1)\) is larger than \(K(\alpha _2)\) in the likelihood ratio order, implying \(K(\alpha _1)\) stochastically dominates \(K(\alpha _2)\). \(\square \)
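The step from a decreasing likelihood ratio to stochastic dominance can be illustrated on a simple family. The sketch below does not use the finite-n distribution of Theorem 2; as an assumed stand-in it takes the geometric distribution with ratio \(\frac{1-\beta }{\alpha }\) that arises as the Case II limit of K, with hypothetical values \(\alpha _1=0.6<\alpha _2=0.8\) and \(\beta =0.6\).

```python
def geometric_pmf(q, k):
    """P(K = k) for a geometric distribution on {0, 1, ...} with ratio q."""
    return (1.0 - q) * q ** k


def geometric_tail(q, k):
    """P(K >= k) for the same distribution."""
    return q ** k


# Assumed illustrative parameters, both with alpha + beta > 1.
alpha_1, alpha_2, beta = 0.6, 0.8, 0.6
q1 = (1.0 - beta) / alpha_1
q2 = (1.0 - beta) / alpha_2

# Likelihood ratio P(K(alpha_2) = k) / P(K(alpha_1) = k) for k = 0..29.
likelihood_ratio = [geometric_pmf(q2, k) / geometric_pmf(q1, k)
                    for k in range(30)]
lr_is_decreasing = all(x > y for x, y in
                       zip(likelihood_ratio, likelihood_ratio[1:]))

# Stochastic dominance: every tail of K(alpha_1) is at least that of K(alpha_2).
alpha_1_dominates = all(geometric_tail(q1, k) >= geometric_tail(q2, k)
                        for k in range(30))
```

Here the likelihood ratio is proportional to \((q_2/q_1)^k\) with \(q_2<q_1\), so it decreases in k, and the tail comparison \(q_1^k\ge q_2^k\) confirms the dominance that the likelihood ratio order implies.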
Proof of Corollary 1
The condition \(\frac{n_2}{\lambda _1} - \frac{1}{ \mu _2} = \frac{n_1}{\lambda _2} -\frac{1}{ \mu _1} \) is equivalent to \(\alpha = 1-\beta \). By Proposition 2, \(K(\alpha _1) \ge _{ST} K(1-\beta ) \ge _{ST} K(\alpha _2)\) whenever \(\alpha _1< 1-\beta < \alpha _2\). But, for all \(1-\beta < \alpha _2\), \(K(\alpha _2)/n \rightarrow 0\), and for \(\alpha _1 < 1-\beta \), \(\lim _{\alpha _1 \rightarrow 1-\beta } \lim _{n\rightarrow \infty } K(\alpha _1)/n =0\), and the corollary follows. \(\square \)
Proof of Theorem 10
We prove this theorem in two steps:
- Prove that the fluid limits are \(\lim _{n\rightarrow \infty }I_1/n=f_{1,k}\), \(\lim _{n\rightarrow \infty }I_2/n=f_{2,k}\).
- Prove the central limit behavior.
When \(i_1\in \left[ m_1-\epsilon n,m_1+\epsilon n\right] \) and \(i_2\in \left[ m_2-\epsilon n,m_2+\epsilon n\right] \), and n grows large, we can use Stirling’s approximation:
where \(B_{2,k}=B_1/P(K=kn)\), \(B_{3,k}=B_{2,k}n_1!n_2!(2\pi )^{-\frac{3}{2}}\hbox {e}^{n}\alpha ^{-kn}\), \(B_{4,k}=B_3 n^{-n-3/2}\left( \frac{x_1}{(x_1+x_2-k)(\theta -x_1)(1-\theta -x_2)(x_2-k)}\right) ^{1/2}\), \(x_1=\frac{i_1}{n}\), \(x_2=\frac{i_2}{n}\). We define
The first-order derivatives on \(x_1\) and \(x_2\) are
We can solve
Look at the second-order derivatives:
The Hessian matrix is negative definite. Therefore, \(F(x_1,x_2)\) is strictly concave on \((0,\theta )\times (0,1-\theta )\) and reaches its unique global maximum at \((f_{1,k},f_{2,k})\). Similar to the proof of Theorem 4, we can show that
To obtain the asymptotic distribution of \(I_1,I_2\) as \(n\rightarrow \infty \), we only need to consider \(i_1,i_2\) for which \(i_1/n \rightarrow f_{1,k}\) and \(i_2/n\rightarrow f_{2,k}\). Similar to the proof of Theorem 5, we write \(i_1= f_{1,k}n+z_1\sqrt{n},\,i_2= f_{2,k}n+z_2\sqrt{n}\), with \(z_1/ \sqrt{n} \rightarrow 0\), \(z_2/ \sqrt{n} \rightarrow 0\).
From the definitions of \(f_{1,k}\) and \(f_{2,k}\) in Theorem 10,
Therefore, similar to the proof of Theorem 5, we can obtain
where \(B_{5,k}= \log \left( B_{3,k} \left( \frac{f_{1,k}}{(f_{1,k}+f_{2,k}-k)(f_{2,k}-k) (\theta -f_{1,k})(1-\theta -f_{2,k})}\right) ^{1/2}\right) -\frac{5}{2}n\log n-n(\theta \log (\theta -f_{1,k})+(1-\theta )\log (1-\theta -f_{2,k})+f_{1,k}+f_{2,k})+kn(\log (f_{2,k}-k)-\log (f_{1,k}+f_{2,k}-k))\). Therefore, organizing the formula, we have
where
\(\square \)
Proof of Theorem 11
The density at the mode of the approximating bivariate normal distribution in Theorem 10 is
For a large n, letting \(\lceil x \rceil \) be the ceiling of a real number x, and from Theorem 10,
Combined with (15), we have
Recall that
Therefore,
where
Define
From Theorem 10, we can express k in terms of T,
Note that T is nonnegative and no larger than the value in (3), denoted by \(\overline{T}\). Note also that \(f_{1,k}=\frac{T \theta }{T+1/\mu _1},\,f_{2,k}=\frac{T (1-\theta -k)}{T+1/\mu _2}+k\). Algebra gives
Solving \(\frac{\hbox {d}G}{\hbox {d}T} = 0\) gives
If
then
and G(T) is minimized at \(T^\star =\overline{T}\); otherwise,
and G(T) is minimized at \(T^\star =\frac{\theta }{(1-\alpha ) r } - \frac{1}{ \mu _1}\). When G(T) is minimized at \(T^\star =\overline{T}\), the corresponding value is \(k^\star =0\) and we recover the pooled system case; when G(T) is minimized at \(T^\star =\frac{\theta }{(1-\alpha ) r } - \frac{1}{ \mu _1}\), the corresponding
By now we have shown that, for any \(\epsilon >0\),
Therefore,
This is consistent with our intuitive calculation in Sect. 3.
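The parametrization of \(f_{1,k}\) and \(f_{2,k}\) by T used in this proof can be evaluated numerically. The following is a minimal sketch, assuming only the two expressions \(f_{1,k}=\frac{T \theta }{T+1/\mu _1}\) and \(f_{2,k}=\frac{T (1-\theta -k)}{T+1/\mu _2}+k\) stated above; the function name `fluid_limits` and the parameter values are hypothetical, and the constraint \(0\le T\le \overline{T}\) is not enforced.

```python
def fluid_limits(T, k, theta, mu1, mu2):
    """Evaluate f_{1,k} and f_{2,k} as functions of T (formulas quoted from the proof)."""
    f1 = T * theta / (T + 1.0 / mu1)
    f2 = T * (1.0 - theta - k) / (T + 1.0 / mu2) + k
    return f1, f2

# Hypothetical parameter values, for illustration only.
theta, mu1, mu2, k = 0.5, 1.0, 1.0, 0.1
for T in (0.0, 0.5, 1.0, 5.0):
    print(T, fluid_limits(T, k, theta, mu1, mu2))
```

At \(T=0\) the sketch returns \((0,k)\), and both components increase with T, approaching \(\theta \) and \(1-\theta \) respectively.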
Suppose T changes from \(T^\star =\frac{\theta }{(1-\alpha ) r } - \frac{1}{ \mu _1}\) to \(T^\star +x/\sqrt{n}\); then \(f_{1,k}\) changes by \(\delta f_1= f^{'}_{1,k}\frac{x}{\sqrt{n}}+ f^{''}_{1,k}\frac{x^2}{2n}+o\left( \frac{1}{n}\right) \), \(f_{2,k}\) changes by \(\delta f_2= f^{'}_{2,k}\frac{x}{\sqrt{n}}+ f^{''}_{2,k}\frac{x^2}{2n}+o\left( \frac{1}{n}\right) \), and k changes by \(\delta k=k^{'}\frac{x}{\sqrt{n}}+k^{''}\frac{x^2}{2n}+o\left( \frac{1}{n}\right) \),
The coefficient of the \(x\sqrt{n}\) term is
which equals 0. The O(1) term is
The above equals
Noting that
the change from \(k^\star \) to \(k^\star +y/\sqrt{n}\) gives
Therefore, we have
Therefore, as \(n\rightarrow \infty \), the variance of \(\frac{K-k^\star n}{\sqrt{n}}\) converges to
This is consistent with our calculation in Sect. 4.4.
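The second-order expansion in \(x/\sqrt{n}\) used in this variance calculation can be checked numerically for \(f_{1,k}\). The sketch below is an illustration under stated assumptions: the derivatives are obtained here by differentiating \(f_{1,k}=\frac{T \theta }{T+1/\mu _1}\) with respect to T, and the function names and parameter values are hypothetical.

```python
import math

def f1(T, theta, mu1):
    """f_{1,k} as a function of T, as given in the proof."""
    return T * theta / (T + 1.0 / mu1)

def f1_derivatives(T, theta, mu1):
    """First and second derivatives of f1 with respect to T (computed by hand)."""
    a = 1.0 / mu1
    d1 = theta * a / (T + a) ** 2
    d2 = -2.0 * theta * a / (T + a) ** 3
    return d1, d2

def taylor_remainder(T, x, n, theta, mu1):
    """Exact change in f1 minus its second-order Taylor approximation at step x/sqrt(n)."""
    h = x / math.sqrt(n)
    d1, d2 = f1_derivatives(T, theta, mu1)
    exact = f1(T + h, theta, mu1) - f1(T, theta, mu1)
    approx = d1 * h + d2 * h * h / 2.0
    return exact - approx

# Hypothetical values; n times the remainder should shrink as n grows.
for n in (10**2, 10**4, 10**6):
    r = taylor_remainder(T=1.0, x=1.0, n=n, theta=0.4, mu1=2.0)
    print(n, r * n)
```

The product of n and the remainder decreases as n grows, which is what the \(o\left( \frac{1}{n}\right) \) term in the expansion requires.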
If
\(k^\star = 0\), the above calculation is valid only for \(x<0\), and \(\frac{K}{\sqrt{n}}\) converges to a truncated normal distribution. The density function is
Note that \(P(K=0)\sim \frac{2}{n\sqrt{2\sigma _K^2\pi }} \rightarrow 0\) as \(n\rightarrow \infty \).
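For illustration, a truncated normal limit of this kind can be sketched in Python. This is a sketch under an assumption made here, not a formula taken from the proof: the limit is taken to be a zero-mean normal with variance \(\sigma _K^2\) restricted to nonnegative values, so the density is \(\frac{2}{\sqrt{2\pi \sigma _K^2}}\hbox {e}^{-y^2/(2\sigma _K^2)}\) for \(y\ge 0\), in line with the factor \(\frac{2}{\sqrt{2\sigma _K^2\pi }}\) appearing in \(P(K=0)\). The function names and the value of `sigma2` are hypothetical.

```python
import math

def half_normal_density(y, sigma2):
    """Density of a zero-mean normal with variance sigma2, truncated to y >= 0 (assumed form)."""
    if y < 0:
        return 0.0
    return 2.0 / math.sqrt(2.0 * math.pi * sigma2) * math.exp(-y * y / (2.0 * sigma2))

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (j + 0.5) * h) for j in range(steps)) * h

sigma2 = 1.5  # hypothetical variance, for illustration only
total = integrate(lambda y: half_normal_density(y, sigma2), 0.0, 20.0)
print(total)                             # numerically close to 1
print(half_normal_density(0.0, sigma2))  # peak value 2 / sqrt(2 * pi * sigma2)
```

Truncation doubles the density on the retained half-line, which is where the factor 2 comes from.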
If
\(k^\star = 0\), the above calculation is no longer valid because the coefficient of the \(x\sqrt{n}\) term is nonzero. From Theorem 6 we know that the distribution of K converges to a geometric distribution as \(n\rightarrow \infty \). \(\square \)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Zhan, D., Weiss, G. Many-server scaling of the N-system under FCFS–ALIS. Queueing Syst 88, 27–71 (2018). https://doi.org/10.1007/s11134-017-9549-7
Keywords
- N-system
- Many-server scaling
- Fluid limits
- Central limits
- First come first served
- Assign to the longest idle server