## A Performance Evaluation of a 3-stage ATM Clos Switch under bursty traffic\*

André-Luc Beylot, Iosefina Kohlenberg and Monique Becker

Département Informatique, Institut National des Télécommunications, 9, rue Charles Fourier, 91011 Evry Cedex - FRANCE
and Laboratoire MASI, 5, place Jussieu, 75230 Paris Cedex - FRANCE

e-mail: andreluc, kohlen@etna.int-evry.fr, mbecker@int-evry.fr

#### **Abstract**

The performance of ATM networks will depend on switch performance and architecture. The main difficulty in designing a switch is that the future traffic is unknown, although it is expected to be bursty. The input processes of a switch are mostly not source processes (voice, data or video traffic) but output processes of other switches. When studying the performance of a switch, it is therefore necessary to verify whether the assumptions made on the input processes still hold for the output processes.

The performance of an ATM switch based on a three-stage Clos network with output buffers is studied under "bursty geometric" arrivals. The aim of the analysis is to dimension the output buffers of each of the three stages of the switch. The output traffic is studied and found to be well approximated by a bursty geometric process. The interstage traffic and the output traffic of the switch can consequently be approximated by such processes, which validates the input assumptions. An approximate model of the switch is presented, and discrete event simulations are used to validate it.

Analysis of the results shows that switch dimensioning is important. The use of non-symmetric switching elements proves efficient for bursty traffic. The burstiness has an influence on the cell loss probability, but it has no influence on the cell delay, nor on the best memory distribution for a given global memory size and a given architecture.
Keyword Codes: C.2.1; I.6.4

**Keywords**: Computer Communication Networks, Network Architecture and Design, ATM (Asynchronous Transfer Mode) networks, Performance evaluation, Simulation and Modeling, Model Validation and Analysis

#### 1. INTRODUCTION

ATM (Asynchronous Transfer Mode) is the technique recommended by the CCITT for future broadband networks [1]. Many ATM switch designs have been proposed over the past few years [2]. Several studies of multi-stage interconnection networks under bursty arrivals (Interrupted Bernoulli Process "IBP" or "bursty geometric" sources) have already been presented [3-6]. In this paper, a model of a three-stage Clos interconnection network [7-8] under "bursty geometric" arrivals is proposed.

<sup>\*</sup> This work was supported by a CNET Grant.

Modeling input traffic is not simple. Bursty sources may be modelled by bursty geometric processes or by a superposition of on-off sources. However, the input traffic of a switch mainly comes from other switches. It is therefore important to study the traffic leaving a switch when its input traffic is bursty. This is the question addressed here.

Before solving the switch model, a switching element with symmetric bursty geometric input traffic has to be solved. The traffic is assumed to have the same parameters on each of the input links. The input traffic of a switching element of the next stage is the output traffic of the preceding stage. It is then necessary to show that the output traffic is reasonably approximated by a bursty geometric process. The whole switch is then studied and the approximations are validated. The model is used to size the buffers of the switch. The performance metrics are the cell loss probability and the mean cell response time for different sizes of the output buffers of the different stages.
It is then possible to answer the question: how important is the burstiness of the input traffic for the performance of an ATM Clos switch under symmetric bursty traffic?

This paper is organized as follows. In section 2, we describe the Clos network and the model of input traffic. Analytical models of a switching element are presented in section 3. The output process is studied in section 4. Approximate models of the switch are described in section 5. Results are given in section 6, where the model is validated by discrete event simulations.

#### 2. DESCRIPTION OF THE NETWORK AND OF INPUT PROCESSES

#### 2.1. Three-stage Clos Interconnection Network

Let us consider a switch based on a Clos network. The switching elements are assumed to be internally non-blocking, and each input cell may be switched in one slot. All the internal links are assumed to have the same throughput. As several cells may be switched to the same output port during the same slot, buffers have to be placed at the output ports of the switching elements. The queues are FIFO queues with finite capacity.

Figure 1. Three-stage Clos network C(N,a,b)

Let us assume that the arrival processes at each input port are "bursty geometric", that each input link is offered the same traffic load, and that the destination addresses of the cells are uniformly distributed over all the output links of the network.

Whereas Banyan networks are single-path, Clos networks are multi-path. We chose the random policy in the present work: the second-stage matrix is chosen uniformly at random. The hypothesis of random choice on the first stage is an approximation. It is only valid when the number of calls is large enough and when the load is equidistributed.

No backpressure signals are exchanged between adjacent stages. Consequently, when a new cell arrives at a full buffer, it is lost.

An analytical model of a Clos network is presented in sections 3 and 4.
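The structure of C(N,a,b) can be made concrete with a short sketch. It assumes the usual symmetric three-stage Clos layout (N/a first-stage elements of size a x b, b second-stage elements of size N/a x N/a, N/a third-stage elements of size b x a), which agrees with the reading of C(128, 4, 8) and with the memory budget formula used in section 6.2. The class and method names are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClosSwitch:
    """Three-stage Clos network C(N, a, b) with output-buffered elements.

    Assumed layout (standard symmetric Clos, not spelled out in the text):
    stage 1: N/a elements of size a x b
    stage 2: b   elements of size (N/a) x (N/a)
    stage 3: N/a elements of size b x a
    """
    n_ports: int  # N: number of inputs (and outputs) of the whole switch
    a: int        # inputs of a first-stage switching element
    b: int        # outputs of a first-stage switching element

    def __post_init__(self):
        if self.n_ports % self.a != 0:
            raise ValueError("N must be a multiple of a")

    def elements_per_stage(self):
        """Number of switching elements in stages 1, 2 and 3."""
        r = self.n_ports // self.a
        return (r, self.b, r)

    def element_shapes(self):
        """(inputs, outputs) of one switching element in stages 1, 2 and 3."""
        r = self.n_ports // self.a
        return ((self.a, self.b), (r, r), (self.b, self.a))

    def output_queues_per_stage(self):
        """Number of output queues in each stage: (N*b/a, N*b/a, N)."""
        counts = self.elements_per_stage()
        shapes = self.element_shapes()
        return tuple(c * outs for c, (_, outs) in zip(counts, shapes))

    def total_memory(self, m1, m2, m3):
        """Global memory (in cells) for per-queue buffer sizes m1, m2, m3."""
        q1, q2, q3 = self.output_queues_per_stage()
        return q1 * m1 + q2 * m2 + q3 * m3


if __name__ == "__main__":
    switch = ClosSwitch(128, 4, 8)
    print(switch.elements_per_stage())       # elements in stages 1, 2, 3
    print(switch.output_queues_per_stage())  # output queues in stages 1, 2, 3
    print(switch.total_memory(10, 10, 32))   # global memory in cells
```

With buffer sizes of 10, 10 and 32 cells, the C(128, 4, 8) switch uses 9216 = 72 x 128 cells of memory, which is the global budget considered in section 6.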
#### 2.2. Model of sources

#### 2.2.1. Characterization of a "bursty geometric" source

"Bursty geometric" processes are discrete-time "ON/OFF" processes with two states. During the "ON" state, a packet is emitted at each time slot. "ON" and "OFF" periods are geometrically distributed. A "bursty geometric" source can be represented by a two-state Markov chain.

Figure 2. "bursty geometric" source

A "bursty geometric" source is defined by the following two parameters:

- the probability that a cell is emitted at time t given that no cell was emitted at time (t-1): (1-q)
- the probability that no cell is emitted at time t given that a cell was emitted at time (t-1): (1-p).

Let us note that when p+q=1, the process is geometric. The rate $\lambda$ of such a source is equal to the steady state probability of the "ON" state. The expected values of the burst length $L_B$ and of the silence length $L_S$ can easily be derived:

$$\lambda = \Pr[\text{"ON"}] = \frac{1 - q}{2 - p - q} \qquad L_B = 1 + \frac{p}{1 - p} \qquad L_S = 1 + \frac{q}{1 - q}$$

#### 2.2.2. Input process into an output queue

Let us consider the traffic going from input link m to output queue n of a first-stage switching element. Only some of the cells of an input process go to output n. If the choice of the output is random and equidistributed, the split process produced by a "bursty geometric" process on input link m is not a "bursty geometric" process [3]. It is in fact an IBP process (a cell is received in the "ON" state according to a Bernoulli process with parameter 1/s, where s is the number of output ports). See the hatched cells in Fig. 3.

Figure 3. Bursty geometric process (hatched cells are destined for output n)

Two solutions are proposed in this paper:

- approximate the split process by a "bursty geometric" process (Model 1).
- use the exact IBP process; the transition matrix is then less sparse (Model 2).
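The source model and the splitting step can be illustrated by a small simulation. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the tagged output is arbitrarily output 0, and the seeded generator and the number of slots are arbitrary choices. It evaluates the closed-form rate and mean period lengths of section 2.2.1, then simulates one source whose cells choose one of s outputs uniformly at random.

```python
import random


def source_stats(p, q):
    """Rate, mean burst length and mean silence length of a bursty
    geometric source (ON->ON with probability p, OFF->OFF with q)."""
    rate = (1 - q) / (2 - p - q)
    mean_burst = 1 / (1 - p)    # equals 1 + p / (1 - p)
    mean_silence = 1 / (1 - q)  # equals 1 + q / (1 - q)
    return rate, mean_burst, mean_silence


def simulate_split(p, q, s, slots, seed=0):
    """Simulate one source over `slots` slots. Each emitted cell picks one
    of s outputs uniformly at random. Returns the fraction of ON slots and
    the fraction of slots in which a cell goes to the tagged output 0."""
    rng = random.Random(seed)
    on = False
    on_slots = 0
    cells_to_tagged = 0
    for _ in range(slots):
        prob_on = p if on else 1 - q  # probability of being ON in this slot
        on = rng.random() < prob_on
        if on:
            on_slots += 1
            if rng.randrange(s) == 0:
                cells_to_tagged += 1
    return on_slots / slots, cells_to_tagged / slots


if __name__ == "__main__":
    p, q, s = 0.9, 0.6, 8
    print(source_stats(p, q))
    print(simulate_split(p, q, s, slots=200_000))
```

For p=0.9 and q=0.6 the closed forms give a rate of 0.8, a mean burst of 10 slots and a mean silence of 2.5 slots; the simulated split stream should carry a rate close to 0.8/s.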
The parameters of the approximate "bursty geometric" split process can be derived as follows. Let us examine output queue **n** of a switching element. The split process can be modelled by a three-state Markov chain. The "ON" state is split into two states: state 1' corresponds to the "ON" state in which the emitted packet is directed to output **n**, and state 1" corresponds to the "ON" state in which the emitted packet is directed to another output.

Figure 4. Split "bursty geometric" process

An approximation of the split process by a "bursty geometric" process can be derived by aggregating states 0 and 1" (this aggregation is exact if the input process is Bernoulli). The parameters p' and q' are defined as follows. Let p' be the probability that a cell is received at time (t+1) given that a cell was received by output $\mathbf{n}$ at time t. Let q' be the probability that no cell is received at time (t+1) given that no cell was received by output $\mathbf{n}$ at time t. Consequently,

$$p' = \frac{p}{s} \qquad q' = 1 - \frac{1-q}{s}\, \frac{1 - \frac{p}{s}}{2-p-q - \frac{1-q}{s}}$$

The rate of this source can easily be obtained (the rate of the approximate model is equal to that of the actual IBP process):

$$\lambda' = \frac{\lambda}{s}$$

Let us examine the interarrival time of cells. Let T be the interarrival time for the exact model (Model 2) and $\Theta$ the interarrival time for the approximate model (Model 1). It can be shown [5] that for an IBP process (parameters p, q, 1/s)

$$E[T] = \frac{s (2-p-q)}{1-q} \qquad C^{2}[T] = 1 + \frac{1}{s} \left[ \frac{(1-p)(3-q)}{(2-p-q)^{2}} - 2 \right] + \frac{1}{s} \frac{(1-q)^{2}}{(2-p-q)^{2}}$$

For $\Theta$ (a cell is emitted with probability 1 in the "ON" state), we obtain

$$E[\Theta] = \frac{2-p'-q'}{1-q'} \qquad C^{2}[\Theta] = \frac{(1-p')(p'+q')}{(2-p'-q')^{2}}$$

The means of the two interarrival times are the same, but the squared coefficients of variation differ.
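The comparison between the two models can be checked numerically. The following sketch simply evaluates the formulas above; the function names are hypothetical and the parameter values are arbitrary examples.

```python
def split_parameters(p, q, s):
    """Parameters (p', q') of the approximate split process (Model 1)."""
    p_split = p / s
    q_split = 1 - ((1 - q) / s) * (1 - p / s) / (2 - p - q - (1 - q) / s)
    return p_split, q_split


def ibp_interarrival(p, q, s):
    """Mean and squared coefficient of variation of T (exact IBP, Model 2)."""
    d = (2 - p - q) ** 2
    mean = s * (2 - p - q) / (1 - q)
    c2 = 1 + ((1 - p) * (3 - q) / d - 2) / s + ((1 - q) ** 2 / d) / s
    return mean, c2


def bursty_interarrival(p, q):
    """Mean and squared coefficient of variation of the interarrival time
    of a bursty geometric process with parameters p and q."""
    mean = (2 - p - q) / (1 - q)
    c2 = (1 - p) * (p + q) / (2 - p - q) ** 2
    return mean, c2


if __name__ == "__main__":
    p, q, s = 0.9, 0.6, 8
    p1, q1 = split_parameters(p, q, s)
    print("Model 2 (IBP):   ", ibp_interarrival(p, q, s))
    print("Model 1 (approx):", bursty_interarrival(p1, q1))
```

For p=0.9, q=0.6 and s=8, both models give a mean interarrival time of 10 slots, while the squared coefficients of variation differ. When p+q=1 (Bernoulli input) the two coefficients coincide, as expected from the exactness of the aggregation in that case.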
Let us now present the solution of a switching element under bursty arrivals.

#### 3. ANALYSIS OF A SWITCHING ELEMENT

Let us consider a switching element with $a$ inputs and $s$ outputs. Finite buffers (capacity M) are placed on the output links. Time is discrete. The throughput is the same on input links and on output links. In one slot, switching proceeds as follows:

- In each queue, the head cell leaves the switching element.
- Cells arrive on the inputs (at most one cell per input) and are put in the output queue they choose. When several cells choose the same output during the same slot, they are buffered in the queue, behind the cells already there, in a random order.

The main hypotheses are:

- Cells are generated on each input link according to independent processes.
- Cells choose each of the output ports with the same probability.
- Departures are treated before arrivals.

Let us assume that $a$ cells may be switched in one slot and that several of them may be switched to the same output. As the choice of the output port is assumed to be random and equidistributed, we shall study only one output queue. We will study two input processes: Bernoulli arrivals and the "bursty geometric" process. For the "bursty geometric" case, two models are proposed: the first one corresponds to the approximation of the split process by a "bursty geometric" process and the second one to the actual IBP process.

#### 3.1 Analysis of a switching element with symmetric geometric inputs

An output queue of a switching element is an n-Geo/D/1/M queue with departures before arrivals. Let L(t) be the number of cells at the beginning of slot t (before departure). Let A(t) be the number of cells that will be placed in the output queue during slot t. L(t) is a Markov process (Fig. 5).

Figure 5. Markov Chain - Finite Buffer (M>a)

We have

$$L(t+1) = Min (M, L(t)+A(t)-\delta_{L(t)})$$

where $\delta_{L(t)}=1$ if $L(t)>0$ and 0 otherwise (if there is a cell at the beginning of slot t, it will have left the switch at time t+1).

The distribution of A(t) does not depend on t and will be noted $b_k$, where p is the probability that an input carries a cell in a slot:

$$b_k = Pr[A(t)=k] = \binom{a}{k} \left(\frac{p}{s}\right)^k \left(1 - \frac{p}{s}\right)^{a-k}$$

Let $\pi_k$ be the steady state probability of having k cells in the buffer. The input rate $\lambda$ is equal to $\frac{ap}{s}$, and the service time $1/\mu$ is constant and equal to 1. The utilization factor $\rho$ is equal to $(1 - \pi_0)$. The output rate $\Lambda$ is equal to $1 - \pi_0$. The cell loss probability and the mean response time can then be computed.

#### 3.2 Analysis of a switching element with bursty geometric input process into output queue n (Model 1)

Let p' and q' be the transition probabilities of the approximate "bursty geometric" process. Let L(t) be the number of customers in the output queue at time t and B(t) the number of "ON" sources at time t. (L(t), B(t)) is a Markov process.

$$L(t+1) = Min(M, L(t)-\delta_{L(t)}+B(t))$$

where ([9])

$$Pr[B(t+1)=k/B(t)=j] = \sum_{i=\max(0,j-k)}^{\min(j,a-k)} \binom{j}{i} (1-p')^i\, p'^{\,j-i} \binom{a-j}{i+k-j} (1-q')^{i+k-j}\, q'^{\,a-i-k}$$

In this approximate model, the transition matrix is very sparse. If the number of customers in the output queue and the number of "ON" sources are known, the number of customers in the queue at time (t+1) can immediately be derived. Let Pr[L=k,B=j] be the steady state probability of state (k,j). Let $\pi_k$ be the steady state probability of having k cells in the buffer. The performance criteria can then easily be computed.

#### 3.3 Analysis of a switching element with IBP input process (Model 2)

Let L(t) be the number of customers in the output queue at time t, C(t) the number of "ON" sources at time t and A(t) the number of cells that want to join output queue n. A(t) depends on C(t). (L(t), C(t)) is a Markov process.
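The Bernoulli-input queue of section 3.1 is small enough to be solved numerically in a few lines. The sketch below is one possible way to do it, not the method used in the paper: it builds the transition matrix of L(t) from the recursion L(t+1) = Min(M, L(t)+A(t)-δ), obtains the stationary distribution by plain power iteration, and derives the loss from flow conservation. The delay is computed with Little's law applied to L(t), which is an assumption about how the response time is counted. All function names are hypothetical.

```python
from math import comb


def arrival_distribution(a, s, p):
    """b_k: probability that k of the a inputs send a cell to the tagged queue."""
    r = p / s
    return [comb(a, k) * r**k * (1 - r) ** (a - k) for k in range(a + 1)]


def transition_matrix(a, s, p, m):
    """P[i][j] = Pr[L(t+1)=j | L(t)=i], departures before arrivals, capacity m."""
    b = arrival_distribution(a, s, p)
    P = [[0.0] * (m + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        after_departure = i - 1 if i > 0 else 0
        for k, bk in enumerate(b):
            # Cells that do not fit in the buffer are lost.
            P[i][min(m, after_departure + k)] += bk
    return P


def stationary(P, tol=1e-14, max_iter=20_000):
    """Stationary distribution by power iteration (adequate for small chains)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(x - y) for x, y in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def queue_metrics(a, s, p, m):
    """Return (pi, cell loss probability, mean response time in slots)."""
    pi = stationary(transition_matrix(a, s, p, m))
    offered = a * p / s           # input rate lambda
    carried = 1 - pi[0]           # output rate Lambda
    loss = 1 - carried / offered  # flow conservation
    mean_len = sum(k * x for k, x in enumerate(pi))
    delay = mean_len / carried    # Little's law (assumed delay convention)
    return pi, loss, delay


if __name__ == "__main__":
    for m in (10, 20):
        _, loss, delay = queue_metrics(a=4, s=4, p=0.8, m=m)
        print(m, loss, delay)
```

The example uses a 4x4 element at a load of 0.8 purely as an illustration; increasing the buffer size m should lower the computed loss.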
We have

$$L(t+1) = Min(M, L(t)-\delta_{L(t)}+A(t))$$

$$Pr[A(t)=k/C(t)=j] = \binom{j}{k}\left(\frac{1}{s}\right)^k\left(1-\frac{1}{s}\right)^{j-k} \quad \text{for } 0 \le k \le j$$

where C(t) evolves as follows

$$Pr[C(t+1) = k/C(t) = j] = \sum_{i=\max(0,j-k)}^{\min(j,a-k)} \binom{j}{i} (1-p)^{i}\, p^{j-i} \binom{a-j}{i+k-j} (1-q)^{i+k-j}\, q^{a-i-k}$$

The transition matrix is less sparse because A(t) belongs to [0, C(t)]. Let Pr[L=k, C=j] be the steady state probability of state (k,j). Let $\pi_k$ be the steady state probability of having k cells in the buffer. The steady state probabilities can be obtained using a matrix-analytic approach [10], since the superposition of n independent IBP processes is a D-BMAP process. The performance criteria can then be derived.

#### 4. OUTPUT TRAFFIC FROM A SWITCHING ELEMENT

The output process of a first-stage switching element is a D-MAP process [10], but the number of states of the Markov chain governing the evolution of this output process is too large for a second-stage switching element to be easily tractable. In order to iterate the solution, let us approximate the output process of a switching element by a simple process: a "bursty geometric" process. This leads to a model of the whole switch.

Let us first compute the interarrival time of cells of a "bursty geometric" source. A study of the interdeparture time of cells of a queue under different input processes (geometric, "bursty geometric", IBP) is then presented.

#### 4.1 Study of interarrival time of cells in a "bursty geometric" process

Let T be the interarrival time of cells of a "bursty geometric" source. Let $T_i$ be the time interval from a slot in state i of the Markov chain to the next arrival. We get $T=T_1$, with

$$T_1 = \begin{cases} 1 & \text{probability } p \\ 1 + T_0 & \text{probability } (1-p) \end{cases} \qquad T_0 = \begin{cases} 1 & \text{probability } (1-q) \\ 1 + T_0 & \text{probability } q \end{cases}$$

The z-transforms of these probability distributions may then easily be obtained.

$$T_0(z) = \frac{(1-q) z}{1-qz} \qquad T(z) = T_1(z) = pz + \frac{(1-p)(1-q) z^2}{1-qz}$$

The derivatives of the z-transform at z=1 may then be computed:

$$T'(1) = \frac{2-p-q}{1-q} \qquad T''(1) = \frac{2(1-p)}{(1-q)^2}$$

#### 4.2 Study of the interdeparture times of cells (input processes are Bernoulli)

When the input process is Bernoulli, the output process will be approximated by a "bursty geometric" process (parameters p and q) as follows:

- 1-p = Pr[L(t+1)=0 / L(t)>0] (empty slot at time (t+1) given that there was a cell at time t)
- q = Pr[L(t+1)=0 / L(t)=0] (probability of two consecutive idle slots).

For the Bernoulli input process, we obtain:

$$1-p = b_0 \frac{\pi_1}{1 - \pi_0} \qquad q = b_0$$

Let $\widetilde{T}$ be the interdeparture time of cells. If the number of customers is greater than 1 (L>1), $\widetilde{T}$ is equal to 1. Let $\widetilde{T}_i$ be the time interval from a slot in state i of the Markov chain to the next departure. We get

$$\widetilde{T} = \begin{cases} \widetilde{T}_1 & \text{probability } \frac{\pi_1}{1 - \pi_0} \\ 1 & \text{probability } 1 - \frac{\pi_1}{1 - \pi_0} \end{cases}$$

$$\widetilde{T}_1 = \begin{cases} 1 & \text{probability } 1 - b_0 \\ 1 + \widetilde{T}_0 & \text{probability } b_0 \end{cases} \qquad \widetilde{T}_0 = \begin{cases} 1 & \text{probability } 1 - b_0 \\ 1 + \widetilde{T}_0 & \text{probability } b_0 \end{cases}$$

Let us note that $\widetilde{T}_0$ and $\widetilde{T}_1$ have the same distribution.
Consequently,

$$\widetilde{T} = \begin{cases} 1 + \widetilde{T}_0 & \text{probability } \frac{\pi_1 b_0}{1 - \pi_0} \\ 1 & \text{probability } 1 - \frac{\pi_1 b_0}{1 - \pi_0} \end{cases}$$

The z-transforms of these probability distributions can then be derived. Since $\widetilde{T} = 1 + \widetilde{T}_0$ in the first case, the corresponding term is $z\,\widetilde{T}_0(z)$:

$$\widetilde{T}_0(z) = \frac{(1-b_0)z}{1-b_0z} \qquad \widetilde{T}(z) = \left(1-\frac{\pi_1b_0}{1-\pi_0}\right) z + \frac{\pi_1b_0}{1-\pi_0}\, \frac{(1-b_0)z^2}{1-b_0z}$$

It follows that

$$\widetilde{T}(z) = pz + \frac{(1-p)(1-q)z^2}{1-qz}$$

Finally, $T(z) = \widetilde{T}(z)$. The interdeparture time of the queue has the same distribution as the interarrival time of the approximation.

#### 4.3 Study of the interdeparture time of cells - Model 1

In this case, we shall approximate the output process of output queue $\mathbf{n}$ by a bursty geometric process of parameters $\mathbf{p}$ and $\mathbf{q}$ with:

$$(1-p) = \frac{\Pr[L=1, \ B=0]}{1-\pi_0} \qquad q = \frac{\Pr[L=0, \ B=0]}{\pi_0}$$

Let us consider the interdeparture time $\widetilde{T}$ of such a queue. Let $\widetilde{T}_{ij}$ be the time interval from a slot in state (L=i, B=j) of the Markov chain to the next departure.

$\widetilde{T}$ is equal to 1 if the number of cells is greater than 1 (L>1), or if there is one cell in the queue and at least one input process in the "ON" state $(L=1, B\ge 1)$. If there is one cell in the queue and no "ON" source at time t, the queue will be empty at time (t+1).
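This derivation depends on how the number of "ON" sources evolves from one slot to the next, which is the transition law given in section 3.2. As a hedged illustration, the sketch below evaluates that law for $a$ independent two-state sources; the function name is hypothetical and the parameter values are arbitrary. For Model 1 the arguments would be the parameters p' and q' of the split process.

```python
from math import comb


def on_count_kernel(a, p, q):
    """K[j][k] = Pr[k sources ON at t+1 | j sources ON at t] for a
    independent sources (ON->ON with probability p, OFF->OFF with q)."""
    K = [[0.0] * (a + 1) for _ in range(a + 1)]
    for j in range(a + 1):
        for k in range(a + 1):
            total = 0.0
            # i = number of ON sources that switch OFF during the slot
            for i in range(max(0, j - k), min(j, a - k) + 1):
                turned_on = i + k - j  # OFF sources that switch ON
                total += (comb(j, i) * (1 - p) ** i * p ** (j - i)
                          * comb(a - j, turned_on)
                          * (1 - q) ** turned_on * q ** (a - i - k))
            K[j][k] = total
    return K


if __name__ == "__main__":
    a, p, q = 4, 0.9, 0.6
    K = on_count_kernel(a, p, q)
    for row in K:
        print([round(x, 4) for x in row])
    # Probability that all sources stay OFF for one more slot: q**a
    print(K[0][0], q ** a)
```

Two sanity checks follow from the independence of the sources: each row sums to 1, and the binomial distribution with parameters $a$ and $\lambda$ is stationary for this kernel.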
Let us note

$$\alpha_{ij} = \Pr\left[B(t+1)=j / B(t)=i\right]$$

Consequently,

$$\widetilde{T} = \begin{cases} 1 & \text{probability } 1 - \frac{\Pr[L=1, B=0]}{1 - \pi_0} \\ \widetilde{T}_{10} & \text{probability } \frac{\Pr[L=1, B=0]}{1 - \pi_0} \end{cases}$$

$$\widetilde{T}_{10} = 1 + \widetilde{T}_{0i} \quad \text{with probability } \alpha_{0i}$$

$$\widetilde{T}_{0i} = 1 \ \text{ if } i \ge 1; \qquad \widetilde{T}_{00} = \widetilde{T}_{10}$$

The z-transforms of these probability distributions can be expressed as follows:

$$\widetilde{T}_{00}(z) = \frac{(1-\alpha_{00})z^2}{1-\alpha_{00}z}$$

$$\widetilde{T}(z) = \left(1 - \frac{\Pr[L=1, B=0]}{1 - \pi_0}\right) z + \frac{\Pr[L=1, B=0]}{1 - \pi_0}\, \frac{(1 - \alpha_{00}) z^2}{1 - \alpha_{00} z}$$

Hence,

$$\widetilde{T}(z) = pz + \frac{(1-p)(1-q)z^2}{1-qz}$$

Consequently, T(z) is equal to $\widetilde{T}(z)$. The interdeparture time of the queue has the same distribution as the interarrival time of the approximation.

#### 4.4 Study of the interdeparture time of cells - Model 2

In Model 2, the input process into output queue n is not approximated. Given that there is at least one customer in the queue at time t, the queue will be empty at time (t+1) if there is only one customer at time t and no "ON" source emits a cell to output n. The probability of two consecutive empty slots is equal to the probability that no cell is received given that there was no customer in the queue. It follows:

$$q = \frac{\sum\limits_{k=0}^{a} \left(1 - \frac{1}{s}\right)^k \Pr[L=0, \ C=k]}{\pi_0} \qquad 1 - p = \frac{\sum\limits_{k=0}^{a} \left(1 - \frac{1}{s}\right)^k \Pr[L=1, \ C=k]}{1 - \pi_0}$$

Let us note:

- $\widetilde{T}_{ij}$ the time interval from a slot in state (i,j) of the Markov chain to the next departure.
- $b_{ij}$ the probability of having j "ON" sources at time (t+1) knowing that there were i "ON" sources at time t: $b_{ij} = Pr [C(t+1)=j / C(t)=i]$
- $\gamma_{ij}$ the probability that j cells are sent to the considered output knowing that i inputs are "ON": $\gamma_{ij} = Pr [A(t)=j / C(t)=i]$

$$\widetilde{T} = \begin{cases} 1 & \text{probability } 1 - \frac{\pi_1}{1 - \pi_0} \\ \widetilde{T}_{1j} & \text{probability } \frac{\Pr[L=1, C=j]}{1 - \pi_0} \end{cases}$$

$$\widetilde{T}(z) = \left(1 - \frac{\pi_1}{1 - \pi_0}\right) z + \sum_{j=0}^{a} \frac{\Pr[L=1, \ C=j]}{1 - \pi_0}\ \widetilde{T}_{1j}(z)$$

$$\widetilde{T}_{1j} = \begin{cases} 1 & \text{probability } 1 - \gamma_{j0} \\ 1 + \widetilde{T}_{1i} & \text{probability } \gamma_{j0}\, b_{ji} \end{cases} \qquad (\widetilde{T}_{1j} = \widetilde{T}_{0j})$$

$$\widetilde{T}_{1j}(z) = (1 - \gamma_{j0})\, z + z\, \gamma_{j0} \sum\limits_{i=0}^{a} b_{ji}\ \widetilde{T}_{1i}(z)$$

In order to obtain the first two derivatives of $\widetilde{T}(z)$ at z=1 ( $\widetilde{T}'(1)$ and $\widetilde{T}''(1)$ ), let us differentiate these z-transforms twice.
It follows that:

$$\begin{cases} \widetilde{T}'_{1j}(1) = 1 + \gamma_{j0} \sum_{i=0}^{a} b_{ji} \ \widetilde{T}'_{1i}(1) \\ \widetilde{T}''_{1j}(1) = \gamma_{j0} \sum_{i=0}^{a} b_{ji} \left[ 2 \ \widetilde{T}'_{1i}(1) + \widetilde{T}''_{1i}(1) \right] \end{cases}$$

$$\begin{cases} \widetilde{T}'(1) = \frac{1}{1 - \pi_0} \\ \widetilde{T}''(1) = \frac{2}{1 - \pi_0} \sum_{j=0}^{a} Pr[L=0, C=j] \ \widetilde{T}'_{1j}(1) \end{cases}$$

(The mean interdeparture time is equal to the inverse of the mean output rate.)

From the approximate output process, it follows that

$$T'(1) = \frac{2-p-q}{1-q} = \frac{1}{1 - \pi_0}$$

But the second moment of the interdeparture time is not the same as the second moment obtained when the output process is approximated:

$$T''(1) = \frac{2(1-p)}{(1-q)^2} \neq \widetilde{T}''(1)$$

Two approximations of the output process by a "bursty geometric" process are consequently considered: the first one with the previous p and q (Model 2) and the second one with the following $\tilde{p}$ and $\tilde{q}$ parameters (Model 3), for which $T''(1)=\widetilde{T}''(1)$ :

$$1 - \tilde{q} = \frac{2(\widetilde{T}'(1) - 1)}{\widetilde{T}''(1)} \qquad 1 - \tilde{p} = \frac{2(\widetilde{T}'(1) - 1)^2}{\widetilde{T}''(1)}$$

#### 5. MODEL OF THE SWITCH

An approximate model of the switch is proposed. We consider that the flows between stages are "bursty geometric" processes. As the behavior of each stage k is independent of the behavior of all stages i with i greater than k, we first study the first stage; its results are used for the study of the second stage, and so on.

The switching elements at any stage k are assumed to be statistically independent [11]. As sources are supposed to be independent, this assumption is exact for the first stage. Consequently, we only study one switching element per stage. As the choice of the output port is random and equidistributed, we only study one queue in each stage.

#### 6. RESULTS

#### 6.1 Introduction

The parameters that have to be studied are: the input load, the transition probabilities p and q (or, equivalently, $L_B$ and $L_S$), the network topology and the memory size of the stages.

Several parameters are fixed. The switch has 128 inputs and 128 outputs. The number of inputs of the switching elements in the first stage is 4. This size seems to be a good choice [Bey93]. Unless stated otherwise, the global memory size of the switch is 128x72.

The overall cell loss probability over the three stages and the cell delay across the whole switch are represented for the two analytical models and for the simulation as a function of the memory size of the first stage, for different values of $L_B$ and $L_S$ and for several switch architectures.

The memory size of the first stage varies, and the best values for the memory sizes of the second and third stages are derived from the analytical model (the best value is the one that leads to the lowest loss probability). The points of the analytical models are validated by discrete event simulations. Confidence intervals are around 10-20%: for a cell loss probability of $10^{-7}$ they are 20%, and for cell loss probabilities higher than $10^{-6}$ they are around 10%.

Let us note: An1, An2 = analytical models 1 and 2; Simul = the simulation results.

We intend to validate the analytical method and then to use it to choose the best architecture and the best memory distribution, for different loads and for different values of the burstiness.

#### 6.2 Switch dimensioning for a 0.8 traffic rate

C(128, 4, 8) means 128 inputs in the switch, and 4 inputs and 8 outputs in a first-stage switching element. Let us compare the performance of the C(128, 4, 8), C(128, 4, 4) and C(128, 4, 16) architectures. This study is based on the following method: the global memory size M of the switch is kept constant. This value is $72 \times 128$ (N=128 is the number of inputs of the switch).
There are three unknowns: the buffer sizes M(i) in each stage i, with

$$M = \frac{Nb}{a}\, M(1) + \frac{Nb}{a}\, M(2) + N\, M(3)$$

(N = total number of inputs, b = number of outputs of the first-stage switching elements, a = number of inputs of the first-stage switching elements).

#### 6.2.1 C(128, 4, 8) architecture

Analytical models 1 and 2 give very good results, but they are slightly optimistic. Model 2 is better than model 1 (Fig. 6 and 7).

Figure 6. Switch cell loss probability and switch cell delay as a function of the memory size of the first stage for C(128, 4, 8), $\lambda=0.8$, $L_B=5$, $L_S=1.25$

Figure 7. Switch cell loss probability and switch cell delay as a function of the memory size of the first stage for C(128, 4, 8), $\lambda=0.8$, $L_B=100$, $L_S=25$

Parameters $L_B$ and $L_S$ have no influence on the performance criteria: cell loss probability and cell delay. (Let us note that they would probably influence the resequencing time.) For the C(128, 4, 8) architecture the best memory placement is: M(1)=10; M(2)=10; M(3)=32, and the best value for the cell loss probability is $10^{-7}$.

#### 6.2.2 C(128, 4, 4) architecture

Figure 8. Switch cell loss probability and switch cell delay as a function of the memory size of the first stage for C(128, 4, 4), $\lambda=0.8$, $L_B=5$, $L_S=1.25$

Figure 9. Switch cell loss probability and switch cell delay as a function of the memory size of the first stage for C(128, 4, 4), $\lambda=0.8$, $L_B=10$, $L_S=2.5$

The same conclusions may be derived for the C(128, 4, 4) architecture: the best memory placement is: M(1)=20; M(2)=24; M(3)=28.

#### 6.2.3 C(128, 4, 16) architecture

Figure 10. Switch cell loss probability and switch cell delay as a function of the memory size of the first stage for C(128, 4, 16), $\lambda=0.8$, $L_B=5$, $L_S=1.25$

Figure 11. Switch cell loss probability and switch cell delay as a function of the memory size of the first stage for C(128, 4, 16), $\lambda=0.8$, $L_B=10$, $L_S=2.5$

For this architecture, the same observations can be made. The best memory placement is: M(1)=5; M(2)=6; M(3)=32. The cell loss is then $10^{-6}$.

#### 6.2.4 Analysis for three stage switch architecture of cell loss probability and cell delay as functions of memory repartition

The performance criteria were studied for different architectures. It was shown that, for a given traffic rate, the influence of the $L_B$ and $L_S$ parameters is not important.

The C(128, 4, 8) architecture gives the best cell loss probability. For a global memory size of 72x128 and a traffic rate of 0.8, it is $10^{-7}$ for C(128,4,8), $10^{-6}$ for C(128,4,16) and $10^{-5}$ for C(128,4,4).

The best cell delay (5.1 slots) is obtained for the C(128, 4, 16) architecture. The cell delay is 5.5 slots for C(128, 4, 8) and 8 slots for C(128, 4, 4).

#### 6.3 Comparison between model 2 and 3

Let us compare the results derived from models 2 and 3 for a C(128, 4, 8) architecture and a load of $\lambda=0.8$. In the previous section it appeared that model 2 was very good for the switch architecture C(128, 4, 8) and $\lambda=0.8$. The two models give approximate results. In this section it appears that models 2 and 3 give the same performance results.

Figure 12. Switch cell loss probability and switch cell delay as a function of the memory size of the first stage for C(128, 4, 8), $\lambda=0.8$, $L_B=5$, $L_S=1.25$

#### 6.4 Better loss rates for larger memory sizes, for a 0.8 traffic load

Let us consider that the analytical methods were validated in the preceding sections. They can be applied with larger global memory sizes to get lower cell loss probabilities. We obtained the following results with a 108x128 global memory size:

Figure 13. Switch cell loss probability as a function of the memory size of the first stage for Global memory=108x128, C(128, 4, 8), $\lambda=0.8$

Figure 14. Switch cell delay as a function of the memory size of the first stage for Global memory=108x128, C(128, 4, 8), $\lambda=0.8$

These best memory configurations do not change as a function of the traffic parameters $L_B$ and $L_S$. Table 1 summarizes the best performance results and the corresponding configurations for different traffic loads.

#### 6.5 Best performance results of a C(128,4,8) switch for different loads

The preceding performance results were obtained for a traffic load of 0.8. For low loads, a global memory of 72x128 is enough to obtain a $10^{-12}$ cell loss probability. See Table 1.

| λ | M(1) | M(2) | M(3) | Global memory | Switch cell loss | Switch delay (slots) |
|-----|------|------|------|---------|-------------|--------------|
| 0.6 | 10 | 14 | 24 | 72x128 | < E-12 | 4.06 |
| 0.7 | 10 | 12 | 28 | 72x128 | < E-12 | 4.54 |
| 0.8 | 10 | 10 | 30 | 72x128 | 1.71 E-7 | 5.49 |
| 0.8 | 14 | 16 | 48 | 108x128 | 8.26 E-11 | 5.44 |

Table 1. Summary of the best performance results

#### 7. CONCLUSION

When studying the switch performance, the importance of memory placement among the 3 stages has been emphasized. The burstiness of sources has no influence on the optimal memory placement. It does not have much influence on the transmission delay of the switch either. It has an influence on the cell loss. It was shown that it is better to place most of the buffers in the last stage of the switch and to choose relatively small memory sizes for the first two stages.

It was important to prove that when the input processes are "bursty geometric", the output process on each output link of a switching element is not far from a "bursty geometric" process. This process will be the input process of the next switching element.
So the choice of "bursty geometric" input processes is partly validated.

Loss rates and transmission delays derived from simulations are close to the results obtained from the approximate analytical models. This holds at least for the parameter values for which simulations can be performed. The different analytical methods give results that are not far from one another. This is not surprising, because the main assumption is independence, and it was made for all the analytical methods.

Future work will address:

- the study of the burst length distribution for the output process,
- the correlations between the choices of the outputs for the cells of the same burst,
- a way to take into account asymmetry [12] and other bursty traffic models (IBP, MMBP...),
- validation of the solutions for lower rates, leading to rare event simulations [13].

A lot of work still has to be done, but a first step has been taken.

#### REFERENCES

- [1] CCITT, Draft Recom. I.121, "On the broadband aspect of ISDN", Seoul, February 1988.
- [2] F. Tobagi, "Fast Packet Switch Architectures for Broadband Integrated Service Digital Networks", Proceedings of the IEEE, vol. 78, n°1, January 1990.
- [3] D. Meliksetian, C.R. Chen, "A MMBP Approximation for the Analysis of Banyan Networks", ACM Sigmetrics Conference, Santa Clara, California, May 1993.
- [4] T. Morris, H. Perros, "Performance Modelling of a multi-buffered Banyan Switch under Bursty Traffic", IEEE Infocom'92, paper 3D.2, Florence, 1992.
- [5] A. Nilsson, F. Lai, H. Perros, "An approximate analysis of a bufferless NxN Synchronous Clos ATM Switch", ITC'13, Copenhagen, June 1991.
- [6] H. Yamashita, H. Perros, S. Hong, "Performance analysis of a shared buffer ATM switch under bursty arrivals", ITC'13, Copenhagen, June 1991.
- [7] C. Clos, "A study of non-blocking switching networks", Bell Syst. Tech. J., vol. 32, pp. 406-424, 1953.
- [8] A.-L. Beylot, M. Becker, "Performance Analysis of an ATM Clos Switch with nonsymmetric Switching elements and Output Buffers", ITC Seminar, Bangalore, Nov. 1993.
- [9] F. Bonomi, S. Montagna, R. Paglino, "Busy period for an ATM switching element output line", IEEE Infocom'92, paper 4C.2, Florence, 1992.
- [10] C. Blondia, O. Casals, "Statistical Multiplexing of VBR Sources: A Matrix-Analytic Approach", Performance Evaluation, vol. 16, n°1-3, pp. 5-20, Nov. 1992.
- [11] Y. Jenq, "Performance analysis of packet switch based on single-buffered Banyan network", IEEE JSAC, vol. SAC-1, n°6, December 1983.
- [12] A.-L. Beylot, I. Kohlenberg, M. Becker, "Performance Analysis of three-stage Clos Interconnection Network under non-uniform Traffic Patterns", IFIP Conference Broadband Communications'94, Paris, March 1994.
- [13] M. Becker, P. Douillet, "Hierarchical Importance Sampling a Self-Adjusting Tool for the Simulation of Rare Events", ORSA/TIMS/SMAI Conference on Applied Probabilities in Engineering, Paris, June 1993.