
1 Introduction

Up until recently, residential broadband has been the dominant service delivered over Passive Optical Networks (PONs). However, because PONs are an economically efficient means of providing high-capacity bandwidth across wide rural and urban geographical areas [1], they are becoming increasingly attractive for the delivery of non-retail services such as LTE [2], where dedicated fibre capacity would not be economically feasible. PON sharing across multiple Virtual Network Operators (VNOs) is also becoming increasingly important, in order to foster competition without replicating costly hardware infrastructure. Thus, in a multi-tenant scenario, each VNO should have the ability to assign its portion of assured and best-effort traffic to its customers, without affecting or being affected by other VNOs. Given that LTE, when implemented as a Cloud Radio Access Network (C-RAN), has a low tolerance to excessive packet jitter and latency, a VNO needs granular slicing and resource control in order to share PON infrastructure with non-critical services such as residential broadband.

Achieving separation between VNOs in a PON environment is complicated by the fact that upstream and downstream transmissions function in different manners. Unfortunately, granular resource control is only defined for upstream traffic in the PON standards, with the downstream left to the devices of operators and whatever capabilities their vendor-supplied equipment provides. By its nature, upstream transmission in a PON is a Time Division Multiplexed (TDM) channel where there is inherent control over how much and when each Traffic Container (TCont) of each Optical Network Termination (ONT) can transmit, down to the granularity of each 32-bit word [3]. In this context a TCont can be thought of as a dedicated queue at the ONT to which a specific class of service can be assigned. The scheduling of the upstream transmission is regulated by the Dynamic Bandwidth Allocation (DBA) mechanism [4], which allocates capacity to the ONTs located at the end user side [5]. One solution providing full control to the VNOs over their upstream capacity allocation, by virtualizing and slicing the DBA mechanism, was recently proposed in [6], prototyped in [7] and standardized in [8].

Of most pertinence to this paper is the downstream channel of the PON, which functions as a statistically multiplexed packetized channel, or shared bus. The packetization of data followed by statistical multiplexing through a shared transmission channel is an essential means of increasing the utilization of the channel, when compared to time or frequency division multiplexing over the same channel. This technological trend is similar to how synchronous SDH and cell-based ATM in core and metro networks are being replaced by multi-gigabit Ethernet [9] networks running over Dense Wavelength Division Multiplexing (DWDM) fiber [10], at times over an Optical Transport Network (OTN) sublayer [11]. However, a drawback for real-time services that use a statistically multiplexed channel is the lack of a guarantee over the data information rate, packet loss and latency, which together define Quality of Service (QoS). As central office infrastructure moves towards virtualization and slicing [12], the ability to assure QoS is becoming increasingly important in providing slice independence across multiple tenants, as though each were served by dedicated infrastructure. This is typically achieved through appropriate queue management, separating traffic with different QoS requirements, and through policing the rate of incoming traffic. However, such a methodology is not defined in the PON standards, where decisions on how to implement downstream QoS are left to vendors and operators. Typically, the options vary between over-dimensioning capacity on each PON, executing vendor-specific scheduling and policing, or shaping traffic prior to entering the Optical Line Termination (OLT) [2].

In this paper, we first review the current practices used by operators for the scheduling of downstream traffic in a PON, typically divided into one-stage and two-stage scheduler architectures. We show that while these methods can be effective in single-operator scenarios, they fail to provide the necessary support for VNOs sharing the same infrastructure. We thus propose a novel three-stage scheduler that includes a QoS stage providing additional separation between VNOs. In order to make the case for the scheduler, we build the architectures of the one-, two- and three-stage schedulers in a simulator, and exercise them using a common set of scenarios. The performance of the three simulated schedulers is then compared using typical metrics such as packet loss. Throughout the experiment, we are careful to distinguish between the packet handling capabilities for both High Priority (HP) and Low Priority (LP) traffic. Examples of HP traffic include signaling data and traffic related to services that deteriorate significantly due to packet loss, such as Voice over IP (VoIP) [13]. Examples of LP traffic include streaming video and raw data, both of which can be buffered and replayed in the event of packet loss.

2 Downstream Scheduling

Typically, downstream PON QoS is implemented through queue management functions defined by Internet Engineering Task Force (IETF) Request For Comments (RFCs). These functions include Strict Priority (SP), Weighted Round Robin (WRR), Weighted Fair Queuing (WFQ) and the two rate three color marker (trTCM), which operate on network packet flows and which we distinguish as follows. An SP queue function has N input ports (numbered from 0 to N-1) and 1 output port. It will serve packets from port 0, if they are present, ahead of the lower priority (1 to N-1) ports. Likewise, it will serve packets from port 1, if they are present, ahead of the lower priority (2 to N-1) ports. While SP can guarantee that packets from higher priority ports are delivered to the output port, there is no rate-limiting function that prevents the output port from being overloaded. A WRR queue function also has N input ports (numbered from 0 to N-1) and 1 output port. Packets are served from the input ports according to a weighted distribution function, typically denoted by φ (phi), and averaged over a unit of time. The downside of the WRR queue function is that it only distinguishes between packet counts and not packet sizes: a small number of very large packets served to a low-weighted port can skew the queue output in its favor. This is rectified in WFQ, whose weighted distribution function φ is defined in a similar way to that of WRR, except that it operates on the bytes offered to each port, as opposed to packets. The trTCM is an example of a policer that can implement rate-limiting functionality. It accepts flows of data, typically on a single physical input port, which are then sent to a single output port. These flows can be passed through (logically tagged green), dropped (red) or remarked and passed (yellow), depending on predefined data rate rules. Examples of the specific operation of the trTCM are given below.

The mechanism by which these queue management functions can be composed into typical downstream QoS schedulers is shown in Fig. 1. Firstly, we see a single-stage QoS mechanism in Fig. 1(a), where QoS is only implemented at the ONT level, with WRR scheduling traffic within the same priority class, and SP operating across the classes to provide, for example, strict higher priority to voice and signaling with respect to video and other data types.

Secondly, Fig. 1(b) shows an alternative two-stage vendor solution. The first stage operates at the ONT level and is composed of four queues, an SP scheduler and a trTCM policer. The SP scheduler serves traffic in decreasing priority order: signaling, then voice, then a weighted-fair-queued balance of video and data. The SP assigns a color to the traffic based on the source. Both signaling and voice are HP and thus marked as green. Video and data are instead LP and marked as yellow. Because the color of the traffic is already configured by SP, the trTCM is configured in color-blind mode, according to RFC 4115 [14], and passes the colored traffic directly to the egress Committed Information Rate (CIR) and Excess Information Rate (EIR) ports respectively. The trTCM policers enforce proportionality between CIR and EIR by marking the packets according to the RFC 2698 (trTCM) color definitions [15].
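
To make the trTCM operation concrete before the scheduler descriptions, the following is a minimal Python sketch of a two-rate three-color marker in color-blind mode. It is a simplified illustration of the RFC 2698/4115 mechanisms, not a full implementation; the class name, parameters and default burst sizes (64 and 128 Kbytes, the values used later in Sect. 4) are our own.

```python
import time

class TrTCM:
    """Minimal sketch of a two-rate three-color marker in color-blind
    mode. A simplification of RFC 2698 / RFC 4115, not a full
    implementation: two token buckets, refilled at the CIR and EIR
    rates, decide the color of each arriving packet."""

    def __init__(self, cir_bps, eir_bps, cbs_bytes=64_000, ebs_bytes=128_000):
        self.cir = cir_bps / 8.0                   # committed rate in bytes/s
        self.eir = eir_bps / 8.0                   # excess rate in bytes/s
        self.cbs, self.ebs = cbs_bytes, ebs_bytes  # burst sizes (bucket depths)
        self.tc, self.te = cbs_bytes, ebs_bytes    # current token counts
        self.last = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        dt, self.last = now - self.last, now
        self.tc = min(self.cbs, self.tc + self.cir * dt)
        self.te = min(self.ebs, self.te + self.eir * dt)

    def mark(self, size_bytes):
        """Return 'green' if the packet fits within the CIR bucket,
        'yellow' if it fits within the EIR bucket, else 'red'."""
        self._refill()
        if self.tc >= size_bytes:
            self.tc -= size_bytes
            return "green"
        if self.te >= size_bytes:
            self.te -= size_bytes
            return "yellow"
        return "red"
```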

Fig. 1. Architecture of single-stage (a) and two-stage (b) reference schedulers (Color figure online)

Packets that are marked as red are dropped; green packets are passed through; yellow packets are remarked and may be dropped at a later stage, or passed if there is sufficient available capacity. At the OLT port level, a weighted round robin mechanism is applied to schedule packets across all ONTs, respecting aggregate levels of CIR and EIR. The second stage is the PON port module, composed of two WRR queues operating on the CIR and EIR queues. The weighting of the WRR queues is based on the total service profile, CIR and EIR respectively, that each ONT should carry as a proportion of the total capacity allowed in the PON port. For example, the weightings given to the packets from the ith ONT are

$$ w_{i}^{c} = \frac{CIR_{i}}{\sum_{j=1}^{k} CIR_{j}}, \quad w_{i}^{e} = \frac{EIR_{i}}{\sum_{j=1}^{k} EIR_{j}} $$

where \( w_{i}^{c} \) and \( w_{i}^{e} \) are the fractions of configured CIR and EIR for that ONT over the total CIR and EIR, respectively, for all \( k \) ONTs. The total weightings for each WRR then sum to 1.

$$ \sum_{i=1}^{k} w_{i}^{c} = 1, \quad \sum_{i=1}^{k} w_{i}^{e} = 1 $$
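
As an illustration, the weight computation reduces to a simple normalization. A minimal Python sketch (with hypothetical names) for the CIR weights of four ONTs:

```python
def wrr_weights(rates):
    """Normalize per-ONT CIRs (or EIRs) into WRR weights that sum to 1."""
    total = sum(rates)
    return [r / total for r in rates]

# Example: four ONTs, two with a 10 Mbps CIR and two with a 100 Mbps CIR
print(wrr_weights([10, 10, 100, 100]))  # [0.0454..., 0.0454..., 0.4545..., 0.4545...]
```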

It should be noted that QoS scheduling becomes especially important in multi-tenant scenarios, where different VNOs can have different QoS contracts with their own customers and with the infrastructure provider. However, current scheduling mechanisms, such as those described above, do not take into consideration the existence of multiple VNOs, and thus there is no QoS differentiation at the VNO level. In this paper we attempt to tackle this issue by proposing a novel three-stage scheduler that adds a VNO-oriented scheduling layer, and by comparing its performance against the baseline mechanisms shown in Fig. 1.

Finally, we can define the behavior of the ideal scheduler, against which current and proposed architectures can be compared. To do this, we calculate \( X_{i}^{HP} \) and \( X_{i}^{LP} \), the expected HP and LP traffic egressing from the PON for the ith ONT, given offered traffic \( Z_{i}^{HP} \) and \( Z_{i}^{LP} \). The HP traffic marked as green should not exceed the CIR allocated to that ONT. In the event that HP traffic exceeds the CIR, the excess is recolored as yellow.

$$ G_{i}^{HP} = \min\left( Z_{i}^{HP}, CIR_{i} \right), \quad Y_{i}^{HP} = Z_{i}^{HP} - G_{i}^{HP} $$

Any excess CIR may be used to allow LP traffic to be colored as green. The excess LP traffic is marked yellow.

$$ G_{i}^{LP} = \min\left( Z_{i}^{LP}, CIR_{i} - G_{i}^{HP} \right), \quad Y_{i}^{LP} = Z_{i}^{LP} - G_{i}^{LP} $$

The expected HP traffic egressing from the PON for the ith ONT is the sum of the HP traffic marked as green and the HP traffic marked as yellow contending for the total EIR, \( EIR_{T} \). Here \( w_{i}^{c} \) and \( w_{i}^{e} \) are as defined above.

$$ X_{i}^{HP} = G_{i}^{HP} + w_{i}^{c} \cdot EIR_{T} \cdot \frac{Y_{i}^{HP}}{Y_{i}^{HP} + Y_{i}^{LP}} $$

Similarly, the expected LP traffic egressing from the PON for the ith ONT is the sum of the LP traffic marked as green and the LP traffic marked as yellow contending for the total EIR, \( EIR_{T} \).

$$ X_{i}^{LP} = G_{i}^{LP} + w_{i}^{e} \cdot EIR_{T} \cdot \frac{Y_{i}^{LP}}{Y_{i}^{HP} + Y_{i}^{LP}} $$
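
Because the ideal scheduler's output is computed mathematically rather than simulated, it follows directly from the equations above. The following Python sketch, with hypothetical argument names, computes the expected per-ONT egress rates:

```python
def ideal_egress(z_hp, z_lp, cir, w_c, w_e, eir_total):
    """Expected per-ONT egress (X_hp, X_lp) for the ideal scheduler,
    following the equations above. All rates in the same unit (e.g. Mbps).
    z_hp, z_lp: offered HP/LP traffic per ONT; cir: per-ONT CIR;
    w_c, w_e: CIR and EIR WRR weights; eir_total: total EIR of the PON."""
    x_hp, x_lp = [], []
    for i in range(len(z_hp)):
        g_hp = min(z_hp[i], cir[i])                # green HP, capped at CIR
        y_hp = z_hp[i] - g_hp                      # excess HP marked yellow
        g_lp = min(z_lp[i], cir[i] - g_hp)         # LP may use leftover CIR
        y_lp = z_lp[i] - g_lp                      # excess LP marked yellow
        y_tot = y_hp + y_lp
        share_hp = y_hp / y_tot if y_tot else 0.0  # yellow contention shares
        share_lp = y_lp / y_tot if y_tot else 0.0
        x_hp.append(g_hp + w_c[i] * eir_total * share_hp)
        x_lp.append(g_lp + w_e[i] * eir_total * share_lp)
    return x_hp, x_lp
```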

In the next section we explain the architecture of our proposed three-stage QoS scheduler and the implementation of the simulation environment. This allows us to compare the results of our scheduler with the one-stage and two-stage schedulers described above, showing the advantages brought by our three-stage design for multi-tenant operations.

3 Three-Stage Scheduler Description and Architecture

The architecture of our hierarchical three-stage scheduler is shown in Fig. 2, and uses the same array of functions (SP, WRR, WFQ and trTCM) as the one- and two-stage schedulers. We note that in recent years the IETF has developed queue management and scheduling techniques, such as Controlled Delay (CoDel) and Active Queue Management (AQM), that deal with the anomalous treatment of TCP/IP transport flows due to excess buffering of packets. This anomalous behavior is the result of a phenomenon known as Bufferbloat [16], that is, the use of excessively large buffers in routers, switches and modems, which disrupts the normal functioning of the TCP/IP flow control mechanism. The IETF has combined CoDel and AQM into a single RFC (8290), known as the Flow Queue CoDel Packet Scheduler and Active Queue Management Algorithm (FQ-CoDel) [17].

With respect to the reference two-stage scheduler shown in Fig. 1(b), we have added a further WRR followed by a trTCM function, both operating at the individual VNO level. The purpose of this additional stage is to further partition the overall capacity, so that any excess capacity from a VNO is preferentially re-distributed across the ONTs belonging to the same VNO. The hierarchical scheduler is located as a single functional block at the head end of the PON, that is, at the OLT. It functions at three levels: the ONT, the operator and the PON, where WRR, SP and WFQ assure proportionality between contracted bandwidths. Downstream from the ONT and operator schedulers, trTCM policers enforce proportionality between CIR and EIR by marking the packets according to the RFC 2698 (trTCM) color definitions [15]. Packets that are marked as red are dropped; green packets are passed through; yellow packets are remarked and may be dropped at a later stage, or passed if there is enough bandwidth. In an ideal scenario, though not implemented here, a feedback path from the PON functional block, similar to Call Admission Control (CAC), would direct specific ONTs to block new flow requests where the aggregate CIR exceeds a specific percentage (typically 50%) of the total capacity of the PON, or indeed should the CIR of an operator exceed a specific threshold agreed with the PON infrastructure provider.
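
The wiring of the three stages can be summarized structurally as in the following Python sketch. This is a hypothetical skeleton for illustration only: the Element class and stage functions are our own shorthand, not the simulator's actual API, and the operator-level policer mode is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    """Hypothetical placeholder for one scheduler/policer element
    (SP, WRR, WFQ or trTCM); the real simulator uses SimPy processes."""
    kind: str
    params: dict = field(default_factory=dict)
    inputs: list = field(default_factory=list)

def ont_stage(ont_id):
    # Stage 1: SP over signaling/voice and a WFQ-balanced video+data pair,
    # followed by a color-blind trTCM policer.
    sp = Element("SP", {"order": ["signaling", "voice", "video+data (WFQ)"]})
    return Element("trTCM", {"mode": "color-blind", "ont": ont_id}, [sp])

def operator_stage(vno, ont_stages):
    # Stage 2 (our addition): a CIR/EIR WRR pair per VNO, then a trTCM policer.
    wrr_c = Element("WRR", {"queue": "CIR", "vno": vno}, ont_stages)
    wrr_e = Element("WRR", {"queue": "EIR", "vno": vno}, ont_stages)
    return Element("trTCM", {"vno": vno}, [wrr_c, wrr_e])

def pon_stage(operator_stages, capacity_bps=2.488e9):
    # Stage 3: CIR/EIR WRR pair feeding the downstream link; no policer.
    wrr_c = Element("WRR", {"queue": "CIR"}, operator_stages)
    wrr_e = Element("WRR", {"queue": "EIR"}, operator_stages)
    return Element("link", {"capacity_bps": capacity_bps}, [wrr_c, wrr_e])

# Operator A with 24 ONTs and Operator B with 8, as in Sect. 5
pon = pon_stage([operator_stage(v, [ont_stage(i) for i in range(n)])
                 for v, n in {"A": 24, "B": 8}.items()])
```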

Fig. 2. Architecture of the proposed three-stage QoS scheduler (Color figure online)

4 Simulator

In order to compare the QoS performance of the three-stage scheduler in PON multi-tenant scenarios, we have developed a QoS discrete event simulator based on the Python SimPy [18] package. We use several fundamental QoS modules (trTCM, WRR, WFQ and SP) to construct the downstream PON schedulers operating in the OLT. The trTCM object class marks packets with the three colors, green, yellow and red, depending both on any original marking and on the CIR and EIR parameters. It should be noted that the decision on whether incoming data is above or below the CIR and EIR thresholds is computed from average values across a given burst data size. In the simulation we used burst size values for the CIR and the EIR of 64 Kbytes and 128 Kbytes respectively, as typically configured in Cisco and Juniper routers. The WRR module accepts packets from several input streams and assures fairness based upon a predefined distribution over the number of packets in the input buffer of each stream. WFQ behaves similarly to WRR but uses the number of bytes in the input buffer of each stream as the means of determining fairness. Finally, SP selects packets from different queues in strict priority order: it will only select packets from a lower priority queue once the higher priority ones are empty.
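
As an indication of how such fundamental modules can be built on SimPy, the following is a minimal sketch of a WRR element serving two queues by packet count. The names, rates and polling behavior are illustrative, not the simulator's actual code:

```python
import simpy

def wrr_server(env, queues, weights, line_rate_bps, served_log):
    """Weighted round robin: in each round, serve up to `weight`
    packets from every queue that has packets waiting."""
    while True:
        served = 0
        for q, w in zip(queues, weights):
            for _ in range(w):
                if not q.items:              # queue empty, move on
                    break
                pkt = yield q.get()
                # serialization delay on the shared downstream link
                yield env.timeout(pkt["bytes"] * 8 / line_rate_bps)
                served_log.append((env.now, pkt["src"]))
                served += 1
        if served == 0:
            yield env.timeout(1e-6)          # idle: poll again shortly

def source(env, queue, name, size_bytes, interval_s):
    """Constant-rate packet source feeding one WRR input queue."""
    while True:
        yield env.timeout(interval_s)
        yield queue.put({"src": name, "bytes": size_bytes})

env = simpy.Environment()
q_cir, q_eir = simpy.Store(env), simpy.Store(env)
log = []
env.process(source(env, q_cir, "cir", 1250, 1e-4))  # ~100 Mbps of 1250 B packets
env.process(source(env, q_eir, "eir", 1250, 1e-4))
env.process(wrr_server(env, [q_cir, q_eir], [3, 1], 2.488e9, log))
env.run(until=0.01)
```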

The ONT module is made up of four queues, each with a buffer size of 128 Kbytes, an SP scheduler and a trTCM policer. The SP scheduler serves traffic in decreasing priority order: signaling, then voice, then a weighted-fair-queued balance of video and data. The SP assigns a color to the traffic based on the source. Both signaling and voice are HP and thus marked as green. Video and data are instead LP and marked as yellow. Because the color of the traffic is already configured by SP, the trTCM is configured in color-blind mode, according to RFC 4115 [14], and passes the colored traffic to the egress CIR and EIR ports respectively. The Operator QoS module is constructed similarly to the ONT QoS module; however, there is a pair (that is, for both CIR and EIR) of WRR schedulers for each VNO, whose weightings are calculated from the total capacity assigned to the specific VNO. Thereafter a trTCM policer colors the aggregate traffic as passed (green), remarked (yellow) or dropped (red). The PON port QoS module is composed of two WRR queues whose weightings are based on the aggregate service profiles, CIR and EIR respectively, that each operator carries. The PON QoS module does not have a policer, but instead must aggregate both CIR and EIR onto the common downstream physical link.

To assess the performance of the Hierarchical QoS (H-QoS), we also emulate the one- and two-stage QoS architectures. The one-stage architecture is composed of just the PON module from the three-stage architecture, extended to have four WRR queues terminating the signaling, voice, video and data links. The weighting of the WRR queues is based on the total service profile, CIR and EIR respectively, that each ONT carries as a proportion of the total for the PON.

The two-stage architecture (Fig. 1(b)) is composed of the ONT and PON modules from the three-stage architecture (Fig. 2), omitting the Operator QoS modules; the ONTs connect directly to the PON module. The weighting of the PON WRR queues is based on the total service profile, CIR and EIR respectively, that each ONT carries as a proportion of the total for the PON.

5 Simulation Scenarios and Results

In our simulation, we assume that there are two VNOs which share a common PON downstream capacity of 2.488 Gigabits per second (Gbps) [3], of which 1.760 Gbps is allocated to CIR. We define a service profile as a fixed mix of committed and excess information rates that may be assigned by any VNO to one of its ONTs. We denote this mix as a tuple (CIR, EIR). In our simulations, we define two service profiles, Profile-1 (10, 100) and Profile-2 (100, 1000), measured in Megabits per second (Mbps). In order to gauge the ability of the QoS mechanisms described to schedule capacity appropriately across VNOs, we introduce an imbalance between their user bases. Operator A has 24 ONTs, of which half are Profile-1 and half are Profile-2. Operator B has 4 ONTs with Profile-1 and 4 with Profile-2. The net result is that Operator A carries three quarters of the total traffic and Operator B carries one quarter. We are especially interested in what happens to bandwidth offered to an ONT that is either in excess or in deficit of its service profile's CIR and EIR, which we call the nominal information rates. For this purpose, traffic offered at each ONT is transmitted at either +20% or −20% of the nominal CIR or EIR for the configured service profile. For instance, ONTs with service profile Profile-1 may be offered (8, 80), (8, 120), (12, 80) or (12, 120) Mbps. Finally, in addition to providing a comparison between our three-stage QoS mechanism and the two baseline algorithms described in Sect. 2, we also provide the output of the ideal scheduler described previously, under the same conditions; that is, its output is computed mathematically rather than simulated. This allows us to assess the three QoS architectures against an ideal scheduler.
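
The scenario configuration can be expressed compactly. The following Python sketch (with hypothetical names) enumerates the ONTs of both operators and their randomized offered loads at ±20% of the nominal rates:

```python
import random

PROFILES = {"Profile-1": (10, 100), "Profile-2": (100, 1000)}  # (CIR, EIR) in Mbps
ONTS_PER_PROFILE = {"A": 12, "B": 4}  # Operator A: 24 ONTs in total, B: 8

def build_scenario(seed=0):
    random.seed(seed)
    onts = []
    for op, count in ONTS_PER_PROFILE.items():
        for profile, (cir, eir) in PROFILES.items():
            for n in range(count):
                onts.append({
                    "operator": op, "profile": profile, "id": n,
                    # offered traffic is +/-20% of the nominal CIR and EIR
                    "z_hp": cir * random.choice([0.8, 1.2]),
                    "z_lp": eir * random.choice([0.8, 1.2]),
                })
    return onts
```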

The results of our simulations are reported in Fig. 3, showing the throughput of the different schedulers for HP and LP traffic. The x axis represents the different ONTs, which are identified by the triplet (x:y:z), where x identifies the VNO (A or B), y is the (CIR/EIR) profile (i.e., Profile-1 or Profile-2, as described in the previous section) and z is the ONT identifier. To simplify the plot, we only show the results for a subset of ONTs, which is representative of the performance provided by the three schedulers. In the Fig. 3(a) plot, we see that the proposed three-stage (in orange), the baseline two-stage (in gray) and the baseline one-stage (in yellow) algorithms all follow the ideal output (shown in blue) quite closely. This is to be expected, since the purpose of CIR is to strictly cap the capacity at the agreed levels, which can easily be enforced already at the ONT level.

Fig. 3. Comparison of (a) High Priority (HP) and (b) Low Priority (LP) traffic (Color figure online)

Fig. 4. Variance of (a) HP data and (b) LP data against expected ideal behavior (Color figure online)

Where the three-stage mechanism provides considerable improvements over the two-stage and the one-stage, as we see in Fig. 3(b), is in the handling of LP traffic. Since the EIR traffic is only scheduled if there is spare capacity, our additional VNO-level QoS can reassign such capacity appropriately within each VNO.

The three-stage algorithm provides a much closer output to the ideal scheduler, both across ONTs and across VNOs, showing a maximum deviation of ±5%, compared to the two-stage scheme, which deviates by up to ±25%. This accuracy is highly desirable in a multi-tenant Fixed Access Network Sharing (FANS) scenario [19], where each VNO pays the infrastructure provider for assuring a given level of aggregate CIR and EIR. The one-stage shows a ±26% deviation; however, this is on the basis that video and data share the same WRR queue. If this were not the case, then the video queue would always be served at the expense of the data (lowest priority) queue in this scenario. Figure 4 shows the capacity variance for HP traffic (top graph) and for LP traffic (bottom graph) with respect to the ideal case, for the three different scheduling algorithms (labeled three-stage, two-stage and one-stage).

The different bars group the results for all ONTs that offer, respectively, 8, 12, 80 and 120 Mbps of HP traffic (the first two rates are for ONTs in Profile-1, with a 10 Mbps CIR, and the second two for ONTs in Profile-2, with a 100 Mbps CIR). We see that the proposed three-stage architecture provides the lowest variance for all traffic levels, especially for the allocation of capacity to LP traffic. Finally, the one-stage architecture performs poorest, with a deviation from the desired output of between +80% and +200%.

6 Conclusions

In this paper, we have presented a three-stage scheduling architecture that can be implemented at the head end of the PON, that is, at the OLT. This has several benefits. Firstly, the OLT has full knowledge of the traffic that is being offered to the PON and provides real-time insight to network planners as to how the network is complying with contracted QoS metrics; indeed, the three-stage scheduler is in the best position to react to changing network demands. Secondly, while the data path of the three-stage scheduler may be implemented in a high-speed Application Specific Integrated Circuit (ASIC), the parameters of the WRR, WFQ and trTCM can be dynamically tuned at the discretion of a centralized Software Defined Network (SDN) control plane or orchestrator. Thirdly, the centralization of the scheduler simplifies the functionality of the ONT, allowing for easier functional upgrades and virtualization of the PON.

We have assumed that the CIR for each operator is equal to the sum of the CIRs of all its customers. However, in our future work, we will analyze scenarios where the operator's CIR is less than the sum of the CIRs of all its customers (over-subscription), as well as greater than that sum (under-subscription).