1 Introduction

In recent years, smartphones and other mobile devices have emerged as a dominant technology for daily access to Internet services [6]. This, combined with the ever-increasing broadband access supplied by operators, has triggered pervasive demand for mobile video streaming services. In turn, this calls for novel approaches to video content delivery. To offer video streaming services at sustainable costs, the idea of adjusting the bitrate of video traffic to the (time-varying) available bandwidth has been actively investigated in recent years. This technique is commonly referred to as adaptive streaming. At the industrial level, many adaptive video streaming solutions exist. They are now undergoing a standardization process under the Dynamic Adaptive Streaming over HTTP (DASH) initiative. DASH will include existing solutions such as Microsoft’s Smooth Streaming, Adobe’s HTTP Dynamic Streaming and Apple’s HTTP Live Streaming [7]. In order to fully exploit the potential of DASH, though, new challenges arise for content providers, operators and device manufacturers. One such challenge is the need to accurately assess users’ Quality of Experience (QoE) in order to enhance service provisioning and optimize adaptation to network conditions.

The key concept in DASH is to dynamically adapt the video quality to the network bandwidth. This avoids the repeated playback interruptions that are likely to occur when the video quality is kept the same during the whole video session irrespective of possibly highly variable network conditions, e.g., those typical of mobile wireless connections. In DASH, a single video file is divided into smaller chunks of fixed playback duration called segments. Each segment is encoded at various bitrate levels (called representations) using a specific compression algorithm or codec (e.g., H.264/AVC). Then, given the available network bandwidth, the representation of the segment with the appropriate bitrate is selected. With the Scalable Video Coding (SVC) compression algorithm (an extension of Advanced Video Coding), the video source is encoded in one base layer (BL) and one or more optional enhancement layers (ELs), as depicted in Fig. 1. The base layer is always provided. Then, given the available network bandwidth, the client adaptation engine adds the appropriate number of enhancement layers in order to improve the video quality (SNR), resolution and frame rate.

Fig. 1. Segment encoding with SVC

The design goal of DASH is to simultaneously obtain high performance over different key metrics including buffering delay, playback interruptions, average bitrate (video quality) and temporal variability of streaming quality. However, in an environment subject to highly variable throughput, attaining high performance across all these metrics is still considered a great challenge. In this paper, we propose a novel bitrate adaptation scheme which is based on our Backward Shifted Coding (BSC) introduced in [12]. This BSC system makes HTTP Adaptive Streaming (HAS) more robust to rapid fluctuations of the network capacity and provides more flexibility in increasing the quality of video without playback interruptions. The basic idea of BSC is to shift the base layer and its enhancement layers so that when an interruption of playback buffer occurs, the base layer can still be played.

In this paper, we incorporate a new version of BSC in HTTP adaptive streaming. Doing so, we are able to strike a balance between responsiveness and smoothness in DASH. In more detail, this new version of BSC contains two layers: the low layer segment, which delivers only the base layer or the base layer with a minimal set of enhancement layers, and the top layer segment, which contains only enhancement layers. During the video transmission, the two segments are shifted in time. Hence, our main focus in the following is the adaptation problem, i.e., how to jointly match the video quality of each layer (low layer and top layer) of the two shifted segments to the network conditions. Our proposed adaptation methods select the appropriate bitrates for both segments by adding the appropriate number of enhancement layers. Through extensive simulations we show that this BSC system performs remarkably well even under high throughput variability. This is due to a key property of the scheme: the DASH protocol can leverage the time difference between the two BSC layers, which increases diversity. In turn, this mitigates the impact of inaccurate capacity estimation on HAS. In [13], we ran simulations using throughput traces generated in Matlab and HSDPA throughput traces. In this paper, we use an MPEG/DASH client-server application on the ns-3 simulator over an LTE (Long Term Evolution) network.

The outline of the paper is as follows. In Sect. 2 we describe the Backward Shifted Coding system and its mapping to the DASH system. In Sect. 3, we give the details of the bitrate adaptation in Backward Shifted Coding, including the pseudo-code of our proposed adaptation algorithms. Section 4 presents the simulation framework and the numerical results. Finally, Sect. 5 concludes the paper.

2 System Model

2.1 Backward-Shifted Coding

We provide an overview of the Backward-Shifted Coding system and briefly describe its integration in the Dynamic Adaptive Streaming over HTTP standard. The BSC scheme is fully client driven. The main idea of the scheme is to send a complete segment (base layer and possibly some enhancement layers) together with the enhancement layers of another segment. We call the first one the lower layer segment and the second one the top layer segment. During the video transmission, the two segments are shifted in time by a constant offset. We denote by \(\phi \) the offset between the two segments. Thus, the enhancement layers of each segment k are transmitted together with the lower layer segment \(k+\phi -1\) (Fig. 2). We call block k the combination of segment \(k+\phi -1\) (lower layer) and of the enhancement layers of segment k (top layer). Therefore, should the enhancement layers be missed, the player can still play out the lower layer segment, which was sent in advance at low quality. The advantage of the BSC scheme is apparent if we consider the decoding operations at the user side, i.e., when incoming bits are reassembled into video frames by the decoder. In the plain SVC scheme of Fig. 1, when lower layer segment k is transmitted, it is decoded to render the segment with a given quality. Later, if other enhancement layers of this segment are received, the segment is decoded again to increase its quality. BSC does not need to perform repeated decoding, since each segment is decoded only once, i.e., when both its base layer and the related enhancement layers are available.

Fig. 2. Segment transmission with Backward-Shifted Coding: the lower layer segments contain the base layer (and possibly some enhancement layers) and are transmitted before the corresponding top layer segments, which follow after \(\phi -1\) blocks; the initial \(\phi \) blocks carry only lower layer segments; the notation \(BL \rightarrow EL_j\) indicates all layers \(BL,EL_1,EL_2,\ldots ,EL_j\) and \(EL_i \rightarrow EL_j\) indicates \(EL_i,EL_{i+1},\ldots ,EL_j\)

The BSC scheme can be naturally adapted to DASH: under DASH/SVC, video servers store each tagged video as a set of segments. For multi-layer codecs, such segments consist of a base layer and multiple enhancement layers. BSC only requires compounding layers and deferring the transmission of top layer segments. Bitrate adaptation algorithms, on the other hand, have not been standardized yet in DASH. Their aim is to choose a bitrate that ensures good video quality and prevents video playback interruptions. They fall into two categories: the throughput-based approaches and the buffer-based approaches. Some schemes [10, 14] may actually fall in both categories, since they combine an estimation of the network throughput with buffer-based mechanisms.

The main idea behind throughput-based schemes is that the MPEG-DASH client estimates the available bandwidth for the requested segments [1, 9]. Then, based on the network throughput and the playout buffer occupancy level, an adaptation engine chooses the highest possible bitrate compatible with the available throughput in order to avoid possible playback interruptions. The simplest way to estimate the available throughput is to compute the throughput of a segment after it is completely downloaded. This is a standard throughput measure called instant throughput [10]. This method is simple and reacts quickly to throughput variations, but it is not accurate. Conversely, buffer-based methods base the bitrate selection on the occupancy of the buffer rather than on the estimated throughput, with the aim of keeping the buffer at a given nominal level. In this context, the adaptation engine for BSC sets two segments at different bitrates, i.e., one bitrate for the low layer segment (base and enhancement layers) and one bitrate for the top layer segment, which contains only enhancement layers.
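As a minimal sketch of the throughput-based selection described above, the following Python fragment computes the instant throughput of a completely downloaded segment and picks the highest bitrate that fits under it. The function names and the fallback to the lowest bitrate when nothing fits are assumptions made for illustration; the bitrate ladder is the one used in Sect. 4.2.

```python
from typing import Sequence


def instant_throughput_kbps(segment_bits: float, download_seconds: float) -> float:
    """Throughput of one fully downloaded segment, in Kbps."""
    if download_seconds <= 0:
        raise ValueError("download time must be positive")
    return segment_bits / download_seconds / 1000.0


def highest_compatible_bitrate(bitrates: Sequence[float], throughput: float) -> float:
    """Highest available bitrate not exceeding the throughput.

    Assumption: when even the lowest bitrate exceeds the throughput,
    the lowest bitrate is returned.
    """
    candidates = [r for r in bitrates if r <= throughput]
    return max(candidates) if candidates else min(bitrates)


# Bitrate ladder from Sect. 4.2 (Kbps).
BITRATES_KBPS = [140, 250, 420, 760, 1000, 1500, 2100, 2900]

# Hypothetical measurement: a 1.8 Mbit segment downloaded in 1.5 s.
estimate = instant_throughput_kbps(segment_bits=1_800_000, download_seconds=1.5)
choice = highest_compatible_bitrate(BITRATES_KBPS, estimate)
```

Because the estimate relies on a single segment, one slow or fast download changes the next decision immediately, which is the lack of accuracy noted in the text.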

2.2 Video Bitrate Adaptation

In this section, we develop a video rate adaptation algorithm suitable for the Backward-Shifted Coding scheme. In Backward-Shifted Coding, the media segments are encoded using H.264/SVC (or an equivalent multi-layer codec). As shown in Fig. 3, block k contains segment \(k+\phi -1\) (lower layer segment) and the enhancement layers of segment k (top layer segment). Each time a user requests the video, an HTTP connection is established with the server. The video blocks are downloaded into a playback buffer, which holds the segments that have been downloaded but not yet displayed by the player. As shown in Fig. 3, after block k is downloaded, segment k can be decoded using the lower layer segment from block \(k-\phi +1\) and the enhancement layers from block k.

Fig. 3. Decoding of segment k: uses lower layer segment of block \(k-\phi +1\) (containing base layer and possibly some enhancement layers) and top layer segment of block k (containing enhancement layers only).
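The index arithmetic behind blocks and decoding can be sketched as follows. This is an illustrative Python fragment for the steady state only (blocks that carry both a lower layer and a top layer segment); the names `Block`, `make_block` and `blocks_needed_to_decode` are hypothetical and do not come from the paper.

```python
from typing import NamedTuple, Tuple


class Block(NamedTuple):
    index: int                # block index k
    lower_layer_segment: int  # segment whose lower layer travels in block k
    top_layer_segment: int    # segment whose enhancement layers travel in block k


def make_block(k: int, phi: int) -> Block:
    """Block k carries lower layer segment k + phi - 1 and the top layer of segment k."""
    return Block(index=k, lower_layer_segment=k + phi - 1, top_layer_segment=k)


def blocks_needed_to_decode(k: int, phi: int) -> Tuple[int, int]:
    """Segment k needs block k - phi + 1 (lower layer) and block k (top layer)."""
    return (k - phi + 1, k)


# With phi = 4, segment 8 is decoded from blocks 5 and 8.
lower_block, top_block = blocks_needed_to_decode(8, phi=4)
```

The two functions are inverse views of the same shift: the lower layer of segment 8 is found in block 5 because `make_block(5, 4).lower_layer_segment` equals 8.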

Let N be the number of segments contained in the video file. Each segment contains L seconds of video and it is encoded at different bitrates.

In standard SVC playout, choosing a level in the set of available bitrate levels per segment \(\mathcal {R}\) corresponds to selecting the base layer and a certain number of enhancement layers. In the BSC system, the player downloads the BSC block k with the bitrates \((R_{k,E}, R_{k,B}) \in \mathcal {R}^2\). In particular:

  • \(R_{k,E}\) is the bitrate of segment k once the top layer segment of block k is added to its lower layer segment, which was received through block \(k-\phi +1\);

  • \(R_{k,B}\) is the bitrate of the lower layer segment \(k+\phi -1\) (which contains the base layer and some enhancement layers).

Note that, with this notation, when we refer to the condition \(R_{k,E} = R_{k-\phi +1,B}\), we mean that no enhancement layers are transmitted in block k.

3 Adaptation Methods in BSC

The goal of the bitrate adaptation is to maximize the quality of experience of the video streaming user, which depends on four key parameters: the startup delay, the playback interruptions, the mean video bitrate and the bitrate switching.

We propose bitrate adaptation methods to choose the suitable bitrates for block k. We denote by \(R_{min}\) and \(R_{max}\) the smallest and the highest bitrate respectively in the set of available bitrates \(\mathcal {R}\). We let \(\mathcal {B}_k\) be the current playout buffer occupancy measured in seconds of video content.

In order to select the bitrates \(R_{k,B}\) and \(R_{k,E}\) and to evaluate the performance of the BSC scheme, we draw on the two approaches described in Sect. 2, namely the buffer-based and the throughput-based approach. This results in two distinct algorithms: the throughput-based BSC algorithm (TB-BSC) and the buffer-based BSC algorithm (BB-BSC).

3.1 The Throughput Based Approach

We distinguish two cases based on the block index: \(k < \phi \) and \(k \ge \phi \).

Case \(0\le k \le \phi - 1 \). For the first \(\phi -1\) blocks, each block contains (1) the whole (lower layer) segment k and (2) the lower layer segment \(k+\phi -1\), the latter at minimum bitrate \(R_{k,B}=R_{min}\). Thus, for the first \(\phi -1\) blocks, the bitrate adaptation concerns only the whole segment k and must be performed in such a way that \(R_k + R_{min} \le \hat{A}_t\), where \(R_k\) is the bitrate of the whole segment k and \(\hat{A}_t\) is the throughput estimated at time t.

By assigning the minimum bitrate \(R_{min}\) to the lower layer segment \(k+\phi -1\), the startup delay is not greatly affected by the BSC scheme. Doing so, we immediately maximize the bitrate of the segments \(1\le k \le \phi -1\), for which no enhancement layers are expected later on, and we defer the bitrate enhancement of the lower layer segments \(k+\phi -1\) (i.e., segments \(\phi \) to \(2\phi -2\)) to the top layer segments carried by the blocks with the same index, \(k+\phi -1\).
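The startup constraint \(R_k + R_{min} \le \hat{A}_t\) can be illustrated with a short Python sketch. The function name `startup_bitrates` and the fallback to \(R_{min}\) when no bitrate satisfies the constraint are assumptions made for this example; the bitrate ladder is the one from Sect. 4.2.

```python
from typing import Sequence, Tuple


def startup_bitrates(bitrates: Sequence[float], est_throughput: float) -> Tuple[float, float]:
    """Return (R_k, R_kB) for one of the first blocks.

    The future lower layer segment is pinned to R_min, and the whole
    segment k gets the highest bitrate with R_k + R_min <= est_throughput.
    Assumption: if no bitrate fits, R_k falls back to R_min.
    """
    r_min = min(bitrates)
    budget = est_throughput - r_min
    feasible = [r for r in bitrates if r <= budget]
    r_k = max(feasible) if feasible else r_min
    return r_k, r_min


BITRATES_KBPS = [140, 250, 420, 760, 1000, 1500, 2100, 2900]

# With an estimate of 1200 Kbps the budget for segment k is 1060 Kbps.
r_k, r_kb = startup_bitrates(BITRATES_KBPS, est_throughput=1200)
```

Pinning the shifted lower layer to the smallest bitrate keeps the extra load on the first blocks small, which is why the startup delay is only mildly affected.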

Case \(k \ge \phi \). Note that, although TB-BSC is a throughput-based scheme, it also uses information on the buffer occupancy level.

Let \(\phi _{t}=\phi \cdot L\) denote the offset in seconds between a lower layer segment and its enhancement layers. When the buffer occupancy (in seconds) is not larger than \(\phi _{t}\), we no longer need to send the enhancement layers segments, because their corresponding segments have already been played by the player. In that case, the bitrate selection is equivalent to DASH/SVC. When \(\mathcal {B}_k > \phi _{t}\) (Line 3), the adaptation is done on both the lower layer segments and the enhancement layers segments. The pseudo-code for this part of the TB-BSC adaptation algorithm is provided in Algorithm 1. We assume that it is invoked each time t a block is downloaded; it starts immediately after the download of BSC block \(k-1\) is completed.

In the worst case, i.e. Line 4, when the estimated throughput is lower than \(R_{min}\), the selected bitrate for the lower layer segment in block k is \(R_{min}\) and no enhancement layers are sent, i.e., \(R_{k,E}=R_{k-\phi +1,B}\).

We denote by \(R_{t-}\) the highest available bitrate compatible with the estimated throughput, i.e., the highest available bitrate that does not exceed it. In the same way, \(R_{t+}\) is the smallest available bitrate above the estimated throughput.

When the estimated throughput is lower than the bitrate of the lower layer segment in the previous block \(k-1\), the bitrate of the lower layer segment in the next block k is set to \(R_{t-}\), and the bitrate of the enhancement layers segment in the next block k is the maximum between \(R_{t+}\) and \(R_{k-\phi +1,B}\) (Lines 8 and 9, respectively). It is worth remarking that in this case the selected bitrate for the lower layer segment is not larger than the estimated throughput, in order to prevent playback interruptions. The bitrate of the enhancement layers segment, however, is larger than the estimated throughput. This does not put playback at risk, because the buffer level is large enough (\(\mathcal {B}_k > \phi _{t}\)).

Algorithm 1. TB-BSC bitrate adaptation for \(k \ge \phi \) (pseudo-code)

When the available throughput increases compared with the previous block (Line 10), we increase the bitrate in a smooth manner in order to avoid sudden video quality transitions [2]. In practice, when the estimated throughput is higher than the bitrate of the lower layer segment in block \(k-1\), the selected bitrate of the lower layer segment in block k is increased to the next higher bitrate, i.e., \(R_{k,B}=R_{k-1,B}^{\uparrow }\) (Line 11). The bitrate of the enhancement layers of block k is increased to a higher bitrate as well (Line 12).
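The rules described in this subsection for \(k \ge \phi \) can be put together in a Python sketch. It is a reading of the prose, not a transcription of Algorithm 1. Three points are assumptions made for illustration: the plain DASH/SVC fallback uses \(R_{t-}\), the top layer is stepped up from \(R_{k-1,E}\) when the throughput increases, and both bitrates are kept when the estimate equals \(R_{k-1,B}\). All function and parameter names are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def r_t_minus(ladder: Sequence[float], a: float) -> float:
    """Highest bitrate <= a (lowest bitrate if none fits)."""
    i = bisect_right(ladder, a)
    return ladder[i - 1] if i > 0 else ladder[0]


def r_t_plus(ladder: Sequence[float], a: float) -> float:
    """Smallest bitrate > a (highest bitrate if none exists)."""
    i = bisect_right(ladder, a)
    return ladder[i] if i < len(ladder) else ladder[-1]


def step_up(ladder: Sequence[float], r: float) -> float:
    """Next higher bitrate after r, capped at the top of the ladder."""
    return ladder[min(bisect_right(ladder, r), len(ladder) - 1)]


def tb_bsc_select(bitrates: Sequence[float], est: float, buffer_s: float,
                  phi: int, seg_len: float, prev_lower: float,
                  prev_top: float, companion_lower: float) -> Tuple[float, float]:
    """Return (R_kB, R_kE) for block k >= phi.

    prev_lower      = R_{k-1,B}
    prev_top        = R_{k-1,E}
    companion_lower = R_{k-phi+1,B}, the lower layer already received for segment k
    R_kE == companion_lower means that no enhancement layers are sent.
    """
    ladder = sorted(bitrates)
    r_min = ladder[0]
    phi_t = phi * seg_len
    if buffer_s <= phi_t:
        # Buffer too short for the shift: behave like DASH/SVC (assumed R_t-).
        return r_t_minus(ladder, est), companion_lower
    if est < r_min:
        # Worst case: minimum lower layer, no enhancement layers.
        return r_min, companion_lower
    if est < prev_lower:
        # Throughput dropped: safe lower layer, bolder top layer.
        return r_t_minus(ladder, est), max(r_t_plus(ladder, est), companion_lower)
    if est > prev_lower:
        # Throughput increased: move up one step at a time.
        return step_up(ladder, prev_lower), max(step_up(ladder, prev_top), companion_lower)
    return prev_lower, max(prev_top, companion_lower)


BITRATES_KBPS = [140, 250, 420, 760, 1000, 1500, 2100, 2900]
decision = tb_bsc_select(BITRATES_KBPS, est=900, buffer_s=20, phi=4, seg_len=2,
                         prev_lower=1000, prev_top=1500, companion_lower=420)
```

In this example the throughput dropped below the previous lower layer bitrate, so the lower layer falls back to 760 Kbps while the top layer targets 1000 Kbps, slightly above the estimate, which is acceptable only because the buffer exceeds \(\phi _{t}\).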

3.2 The Buffer Based Approach

The use of buffer occupancy to select the segments’ bitrate is a technique used by several schemes in the literature [8]. Typically, two or three buffer thresholds are set, and decisions on the bitrate are taken according to the level of current buffer occupancy with respect to such thresholds. Some of these methods also use the estimated throughput to smooth bitrate variations. Let us call BBA-0 this group of bitrate adaptation methods.

A second group of buffer-based algorithms employs an adjustment function in order to pick the appropriate bitrate [4, 5]. Let us call them BBA-1: compared to BBA-0, they do not perform throughput estimation, thus avoiding the related estimation errors. This method for bitrate selection is the basis of our BB-BSC algorithm. We first describe the application to BSC of the template algorithm introduced in [5], BBA-1 for short. Then we specialize it to match the specific features of BSC and derive BB-BSC. BB-BSC is finally composed of two procedures, one for the lower layer segments and one for the top layer segments. The lower layer segment algorithm is reported in Algorithm 2.

We use two buffer thresholds r and c, where r is the reservoir and c is the cushion, both in seconds of video content. The bitrate selection is based on an adjustment function F [14] where \(F(\mathcal {B}_k)=R_{min}\) for \(\mathcal {B}_k \le r\), \(F(\mathcal {B}_k)=R_{max}\) for \(\mathcal {B}_k \ge r+c\) and \(F(\mathcal {B}_k)=R_{min}+\frac{\mathcal {B}_k-r}{c}(R_{max}-R_{min})\) otherwise. Then, given the current buffer occupancy \(\mathcal {B}_k\), \(F(\mathcal {B}_k)\) is computed to select the bitrate of the next segment. Our purpose is to increase the video quality and limit the number of quality variations. We do this in two steps. First, we remark that when the BBA-1 algorithm is applied to the lower layer segments of the BSC system with the adjustment function F, there is still a margin that can be used to add enhancement layers segments while avoiding playback interruptions. Therefore we define two adjustment functions \(F_1\) and \(F_2\). The two functions have the same formula as function F but differ in the value of c, i.e., \(F_1\) uses \(c_1\) and \(F_2\) uses \(c_2\) (Fig. 4). By tuning the values of \(c_1\) and \(c_2\), we can widen or narrow the margin between the two curves and thus adjust the amount of enhancement layers segments added to the lower layer ones. Then, we compute the bounds of the previous bitrate (\(R_{+}\) and \(R_{-}\)) and the value of the adjustment function \(F_1\) at the buffer occupancy \(\mathcal {B}_k\). The bitrate of the next lower layer segment is selected according to \(F_1(\mathcal {B}_k)\) and the buffer occupancy.

Fig. 4. The adjustment functions for the lower layer segments and the top layer segments: rates above the curve are risky for buffer depletion, rates below the curve are safer but correspond to lower quality.
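The piecewise-linear adjustment function F defined above, and the way \(F_1\) and \(F_2\) differ only in the cushion, can be written as a short Python sketch. The function name `adjustment` is an assumption; the thresholds \(r=20\), \(c_1=70\), \(c_2=50\) and the bitrate bounds are the values quoted in Sect. 4.2.

```python
def adjustment(buffer_s: float, reservoir: float, cushion: float,
               r_min: float, r_max: float) -> float:
    """Adjustment function F: R_min below the reservoir, R_max above
    reservoir + cushion, linear in between."""
    if buffer_s <= reservoir:
        return r_min
    if buffer_s >= reservoir + cushion:
        return r_max
    return r_min + (buffer_s - reservoir) / cushion * (r_max - r_min)


R_MIN, R_MAX = 140.0, 2900.0   # Kbps, bounds of the ladder in Sect. 4.2
R, C1, C2 = 20.0, 70.0, 50.0   # seconds, thresholds quoted for Fig. 8


def f1(buffer_s: float) -> float:
    """Curve used for the lower layer segments."""
    return adjustment(buffer_s, R, C1, R_MIN, R_MAX)


def f2(buffer_s: float) -> float:
    """Curve used for the top layer segments (shorter cushion)."""
    return adjustment(buffer_s, R, C2, R_MIN, R_MAX)


# Margin available for enhancement layers at a buffer level of 55 s.
margin = f2(55.0) - f1(55.0)
```

With \(c_2 < c_1\), the curve \(F_2\) reaches \(R_{max}\) earlier than \(F_1\), so \(F_2 \ge F_1\) at every buffer level and the gap between the two is the room left for enhancement layers.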

Algorithm 2. BB-BSC bitrate adaptation for the lower layer segments (pseudo-code)
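How the bounds \(R_{+}\) and \(R_{-}\) and the value of the adjustment function can drive the choice of the next bitrate is sketched below in Python. The rule follows a commonly used buffer-based selection scheme and is only an assumption about the structure of Algorithm 2, which may differ in its details; the function name and parameters are hypothetical.

```python
from typing import Sequence


def buffer_based_select(bitrates: Sequence[float], prev: float, f_value: float,
                        buffer_s: float, reservoir: float, cushion: float) -> float:
    """Pick the next bitrate from the previous one and F(B_k).

    The bitrate changes only when F(B_k) crosses the next higher rung (R+)
    or the next lower rung (R-) of the previous bitrate.
    """
    ladder = sorted(bitrates)
    r_min, r_max = ladder[0], ladder[-1]
    higher = [r for r in ladder if r > prev]
    lower = [r for r in ladder if r < prev]
    r_plus = min(higher) if higher else r_max
    r_minus = max(lower) if lower else r_min

    if buffer_s <= reservoir:
        return r_min
    if buffer_s >= reservoir + cushion:
        return r_max
    if f_value >= r_plus:
        # Move up to the highest rung strictly below the curve.
        return max(r for r in ladder if r < f_value)
    if f_value <= r_minus:
        # Move down to the lowest rung strictly above the curve.
        return min(r for r in ladder if r > f_value)
    return prev


BITRATES_KBPS = [140, 250, 420, 760, 1000, 1500, 2100, 2900]

# Previous bitrate 760 Kbps, curve value 1520 Kbps at a buffer of 55 s.
next_rate = buffer_based_select(BITRATES_KBPS, prev=760, f_value=1520,
                                buffer_s=55, reservoir=20, cushion=70)
```

The band between \(R_{-}\) and \(R_{+}\) acts as a hysteresis region: small movements of the buffer leave the bitrate unchanged, which limits the number of quality switches.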

Smoothing the bitrate variability. The main purpose of enhancement layers segments is to improve the quality of the video. They do not increase the video content in the buffer in terms of playout time. We use the adjustment function \(F_2\) to select the bitrate of the enhancement layers segments as we did with the lower layer segments. Since \(F_2 \ge F_1\), we will increase the video quality. But we also want to reduce the quality variations. For this purpose, we will apply the algorithm not on a single enhancement layer segment, but on a set of blocks of enhancement layers segments of length \(\phi -1\).

An example of this smoothing procedure is reported in Fig. 5 for \(\phi =4\). The algorithm is applied on blocks of 3 consecutive enhancement layers segments. The red part represents the lower layer segments. After the download of top layer segment 3, the output of the algorithm is \(R_i\) (the green bar). This means that we have to download the enhancement layers of segments 4, 5 and 6 needed to reach \(R_i\). These enhancement layers will be downloaded together with lower layer segments 7, 8 and 9, respectively.

The algorithm is invoked once per set of \(\phi -1\) blocks. Thus, when the algorithm is invoked after the download of block \(k-1\), its output remains the same for the next \(\phi -1\) BSC blocks. In the algorithm for the lower layer segments, we compare \(F_1(\mathcal {B}_k)\) to the bounds of the bitrate of the previous lower layer segment. Here, we compare \(F_2(\mathcal {B}_k)\) to \(r_{avg}^{+}\) and \(r_{avg}^{-}\), where \(r_{avg}^{+}\) (\(r_{avg}^{-}\)) is the bitrate that each segment in the previous set of \(\phi -1\) blocks needs in order to reach \(R_+\) (\(R_-\)), and \(R_+\) (\(R_-\)) is the upper (lower) bound of the previous bitrate \(R_{k-1,E}\). In other words, \(r_{avg}^{+}=R_+\) and \(r_{avg}^{-}=R_-\). Then, we compute the bounds of the previous bitrate and the value of the adjustment function \(F_2\) at the buffer occupancy \(\mathcal {B}_k\). The bitrate of the next enhancement layers segment is selected according to \(F_2(\mathcal {B}_k)\) and the buffer occupancy.

Fig. 5. (a) Example of application of the top layer algorithm: applied on several segments simultaneously it smooths the bitrate variability (b) Effect at the decoder side; \(\phi =4\). (Color figure online)
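The batching used by the smoothing procedure can be reproduced with a few lines of Python. The sketch only computes which segments share one target bitrate and which lower layer segments their enhancement layers travel with; the helper name `top_layer_batch` and the placeholder target value are hypothetical.

```python
from typing import Dict, List, Tuple


def top_layer_batch(last_done_segment: int, phi: int) -> List[Tuple[int, int]]:
    """Segments covered by one invocation of the top layer algorithm.

    Returns (segment, carrier) pairs, where carrier is the lower layer
    segment transmitted in the same block as the segment's enhancement layers.
    """
    first = last_done_segment + 1
    return [(s, s + phi - 1) for s in range(first, first + phi - 1)]


# Example of Fig. 5: phi = 4, algorithm invoked after top layer segment 3.
batch = top_layer_batch(last_done_segment=3, phi=4)

# One target bitrate is held for the whole batch (placeholder value in Kbps).
target_kbps = 1000
targets: Dict[int, int] = {segment: target_kbps for segment, _ in batch}
```

Holding a single target for \(\phi -1\) consecutive segments is what reduces the number of quality switches: the top layer bitrate can change at most once per batch.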

4 Simulations

We evaluate the bitrate adaptation in TB-BSC and BB-BSC using ns-3 and Matlab. We compare the BSC bitrate adaptation algorithms with their equivalent DASH/SVC algorithms. The network capacity is given by LTE (Long Term Evolution) simulations. The number of segments in the video file is N, the segment duration is L (seconds), the BSC scheme offset is \(\phi \) and the set of available bitrates is \(\mathcal {R}\).

4.1 Simulation Setup

We have performed LTE network simulations using ns-3 [3]. The network topology consists of four macro cells with cell radius \(\rho =1500\) m. The cell deployment is shown in Fig. 6. We consider multiple cells so as to account for inter-cell interference. The eNodeB transmission power is 46 dBm. The signal propagation path loss model is COST231 with a pedestrian fading model. The MAC scheduler is the well-known Proportional Fair (PF) scheduler. The full simulation parameters are described in Table 1. The MPEG/DASH client-server application from [11] is installed on the UEs and the remote DASH video server.

Fig. 6. DASH system architecture in LTE network

Table 1. LTE simulation parameters

4.2 Numerical Results

The set of experiments compares the requested bitrate with TB-BSC, BB-BSC, BBA-0 and BBA-1 algorithms. The file size in the experiments is up to 250 s of video while the playback frequency is 25 frames per second (fps). We consider the following set of available bitrates \(\{\)140, 250, 420, 760, 1000, 1500, 2100, 2900\(\}\) (Kbps). The video segment duration is set to 2 s.

Fig. 7. Requested bitrates for TB-BSC and TB-SVC for smooth throughput estimation method

Fig. 8. Requested bitrates for BB-BSC and BBA-1, \(\phi =10\)

Fig. 9. Requested bitrates for BB-BSC and BBA-1, \(\phi =2\)

Fig. 10. Comparison of BBA-0, BBA-1 and BB-BSC algorithms

In Fig. 7, we plot the requested bitrates for TB-BSC and TB-SVC, i.e., the throughput-based algorithms, for \(\phi =4\). The throughput is estimated in a smooth manner to select the next bitrate. We observe that BSC slightly outperforms SVC in terms of video quality, but with too many bitrate variations. We therefore resort to the buffer-based method to stabilize the bitrate. In Fig. 8, we plot the requested bitrates for BB-BSC and BBA-1 for \(\phi =10\) and the following buffer thresholds: \(r=20\), \(c_1=70\) and \(c_2=50\). The video bitrate is 716.32 Kbps for BB-BSC and 667.04 Kbps for BBA-1. Further, BB-BSC is more robust in terms of bitrate variability: the number of bitrate variations is 8 for BB-BSC against 18 for BBA-1. The same experiment for \(\phi =2\) is shown in Fig. 9. We observe that the number of bitrate variations increases to 20. This shows the importance of the offset \(\phi \) in the BSC system.

The first observation is that it is difficult to find a value of \(\phi \) which optimizes all the metrics of the quality of experience, so we must find a tradeoff. The risk of video playback interruption is really high for \(\phi \) between 25 and 55, which corresponds to an offset of 50 s and 110 s, respectively. We must also avoid values of \(\phi \) such that \(\phi \ge \frac{N}{2}\) (62 in this example). Indeed, for these values of \(\phi \), the video quality decreases and the number of quality switches increases. In this experiment, a good tradeoff is achieved for \(\phi ~\in ~\{10,\ldots ,25\}\). This range corresponds to \(\{10,\ldots ,50\}\) s of video duration. The bounds of the range are exactly the buffer thresholds: the reservoir \(r=10\) s and the cushion for the top layer segment \(c_2=50\) s. Therefore the offset \(\phi \) depends on the buffer thresholds. In Fig. 10, we compare BB-BSC, BBA-0 and BBA-1. The results show that BB-BSC outperforms BBA-0 and BBA-1 in terms of video bitrate and bitrate variability. The video bitrate is, respectively, 727.84 Kbps, 650 Kbps and 692.56 Kbps for BB-BSC, BBA-0 and BBA-1. The number of bitrate variations is, respectively, 8, 13 and 10 for BB-BSC, BBA-0 and BBA-1.

The results of the comparison are shown in Table 2 for 50 simulation runs. We compute the following metrics: the average video bitrate (in Kbps), the average number of bitrate variations, the average number of playback interruptions and the variance of the quality. The last metric measures how far the bitrate requested over time deviates from the average bitrate. Since users prefer gradual quality variation, small values of the variance are better for the quality of experience. The buffer-based method of BSC outperforms BBA-0 and BBA-1 in terms of video bitrate and bitrate variability, with only a small risk of video playback interruption; these metrics are considered the most important ones for video quality of experience.

Table 2. Average of QoE metrics: average quality, quality variability, number of switches and number of playback interruptions.
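For one simulation run, the bitrate-related metrics listed above can be computed from the sequence of requested bitrates as in the following Python sketch. The function name `qoe_metrics` and the sample trace are hypothetical, and the use of the population variance is an assumption, since the text does not state which variance estimator is used.

```python
from statistics import mean, pvariance
from typing import Dict, Sequence


def qoe_metrics(requested_kbps: Sequence[float]) -> Dict[str, float]:
    """Average bitrate, number of bitrate variations and variance of the
    requested bitrates for a single run."""
    switches = sum(1 for a, b in zip(requested_kbps, requested_kbps[1:]) if a != b)
    return {
        "avg_bitrate": mean(requested_kbps),
        "switches": switches,
        "variance": pvariance(requested_kbps),  # assumed population variance
    }


# Hypothetical trace of requested bitrates (Kbps), one value per segment.
trace = [420.0, 420.0, 760.0, 760.0, 1000.0]
metrics = qoe_metrics(trace)
```

Averaging each entry over the runs gives per-algorithm figures of the kind reported in the table; the number of playback interruptions would come from the player's buffer log rather than from the bitrate trace.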

The Backward-Shifted Coding system (for both throughput and buffer based methods) outperforms classic DASH algorithms in terms of video quality. But in order to reduce the number of quality variations, we need to adopt a buffer based approach.

5 Conclusion

In this paper, we studied the bitrate adaptation in the Backward-Shifted Coding (BSC) scheme and compared it with DASH-based SVC solutions. Since BSC splits the segments into low layer segments and top layer segments and sends them independently in two distinct blocks, the main challenge is how to choose the bitrates of those segments given a network capacity which tends to fluctuate strongly. Furthermore, thanks to this time redundancy property of BSC, we are able to transmit segments and improve their quality later by sending only the appropriate number of enhancement layers.

We have proposed two bitrate adaptation algorithms, namely TB-BSC and BB-BSC, which have been designed on top of BSC. They are based on network throughput measurements and playback buffer occupancy level, respectively. We have shown that the BSC system (like HTTP adaptive video streaming systems in general) may suffer from throughput estimation errors, which degrade the resulting QoE through a high number of quality variations. The limitations of the throughput-based methods are overcome by the buffer-based methods, which strike a good tradeoff between the video quality and the quality variations.

We further performed simulations to compare the efficiency of the BSC adaptation methods with existing DASH-based SVC solutions. The results show that the BSC adaptation methods achieve better video quality under the same network conditions, providing a DASH-compliant solution that renders high quality video in HTTP adaptive streaming.