# Adaptive Modulation with Smoothed Flow Utility

## Abstract

We consider the problem of choosing the data flow rate on a wireless link with randomly varying channel gain, to optimally trade off average transmit power and the average utility of the smoothed data flow rate. The smoothing allows us to model the demands of an application that can tolerate variations in flow over a certain time interval; we will see that this smoothing leads to a substantially different optimal data flow rate policy than without smoothing. We pose the problem as a convex stochastic control problem. For the case of a single flow, the optimal data flow rate policy can be numerically computed using stochastic dynamic programming. For the case of multiple flows on a single link, we propose an approximate dynamic programming approach to obtain suboptimal data flow rate policies. We illustrate, through numerical examples, that these approximate policies can perform very well.

## Keywords

Optimal Policy, Channel Gain, Adaptive Modulation, Average Utility, Stochastic Dynamic Programming

## 1. Introduction

We consider the flow rate assignment problem on a wireless link with randomly varying channel gain, to optimally trade off average transmit power and the average utility of the smoothed flow data rate. We pose the multiperiod problem as an infinite-horizon stochastic control problem with linear dynamics and convex objective. For the case of a single flow, the optimal policy is easily found using stochastic dynamic programming (DP) and gridding. For the case of multiple flows, DP becomes intractable, and we propose instead an approximate dynamic programming approach using suboptimal policies developed in the single-flow case. Simulations show that these suboptimal policies perform very well.

In the wireless communications literature, varying a link's transmit rate (and power) depending on channel conditions is called *adaptive modulation* (AM); see, for example, [1, 2, 3, 4, 5]. One drawback of AM is that it is a physical layer optimization technique with no knowledge of upper layer optimization protocols. Maximizing a total utility function is also very common in various communications and networking problem formulations, where it is referred to as network utility maximization (NUM); see, for example, [6, 7, 8, 9, 10]. In the NUM framework, performance of an upper layer protocol (e.g., TCP) is determined by *utility* of flow attributes, for example, utility of link flow rate.

Our setup involves both adaptive modulation and utility maximization but is nonstandard in several respects. We consider the utility of the smoothed flows, and we consider multiple flows over the same wireless link [11].

## 2. Problem Setup

### 2.1. Average Smoothed Flow Utility

A wireless communication link supports $n$ data flows in a channel that varies with time, which we model using discrete-time intervals $t = 0, 1, 2, \ldots$. We let $f_t \in \mathbf{R}_+^n$ be the data flow rate vector on the link, where $(f_t)_j$, $j = 1, \ldots, n$, is the $j$th flow's data rate at time $t$ and $\mathbf{R}_+$ denotes the set of nonnegative numbers. We let $F_t = \mathbf{1}^T f_t$ denote the total flow rate over all flows, where $\mathbf{1}$ is the vector with all entries one. The flows, and the total flow rate, will depend on the random channel gain (through the flow policy, described below) and so are random variables.

We work with a *smoothed* version of the flow rates, which is meant to capture the tolerance of the applications using the data flows to time variations in data rate. This was introduced in [12] using delivery contracts, in which the utility is a function of the total flow over a given time interval; here, we use instead a very simple first-order linear smoothing. At each time $t$, the smoothed data flow rate vector $s_t \in \mathbf{R}_+^n$ is given by

$$s_{t+1} = \Theta s_t + (I - \Theta) f_t, \qquad \Theta = \operatorname{diag}(\theta_1, \ldots, \theta_n),$$

where at time $t$, each smoothed flow rate $(s_t)_j$ is the exponentially weighted average of previous flow rates.

The smoothing parameter $\theta_j \in [0, 1)$ determines the level of smoothing on flow $j$. Small smoothing parameter values ($\theta_j$ close to zero) correspond to light smoothing; large values ($\theta_j$ close to one) correspond to heavy smoothing. (Note that $\theta_j = 0$ means that flow $j$ is not smoothed; we have $(s_{t+1})_j = (f_t)_j$.) The level of smoothing can be related to the time scale over which the smoothing occurs. We define $T_j = 1/\log(1/\theta_j)$ to be the *smoothing time* associated with flow $j$. Roughly speaking, the smoothing time is the time interval over which the effect of a flow on the smoothed flow decays by a factor $e$. Light smoothing corresponds to short smoothing times, while heavy smoothing corresponds to longer smoothing times.
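The relation between the smoothing parameter and the smoothing time can be checked numerically. The following is a minimal sketch, assuming the first-order recursion $s_{t+1} = \theta s_t + (1 - \theta) f_t$ for a single flow and the smoothing time $T = 1/\log(1/\theta)$; the function names and the value $\theta = 0.9$ are illustrative choices, not taken from the paper.

```python
import math

def smoothing_time(theta):
    """Smoothing time T = 1 / log(1 / theta), for 0 < theta < 1."""
    return 1.0 / math.log(1.0 / theta)

def smooth(flows, theta, s0=0.0):
    """First-order smoothing: s[t+1] = theta * s[t] + (1 - theta) * f[t]."""
    s = [s0]
    for f in flows:
        s.append(theta * s[-1] + (1.0 - theta) * f)
    return s

# A unit burst at t = 0 followed by silence: after the burst, the
# smoothed flow decays geometrically, by a factor theta per step.
theta = 0.9  # illustrative value
trace = smooth([1.0] + [0.0] * 30, theta)
T = smoothing_time(theta)
```

After $T$ steps the weight $\theta^T$ on a past flow has fallen to $1/e$, which is the sense in which $T$ measures the memory of the smoothed flow.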

where $U$ is the flow utility function. Here, the expectation is over the smoothed flows $s_t$, and we are assuming that the expectations and limit above exist.

parameterized by $\alpha_j$ and $\beta_j$. The parameter $\alpha_j$ sets the curvature (or risk aversion), while $\beta_j$ sets the overall weight of the utility. (For small values of $\alpha_j$, $U_j$ approaches a log utility.)

So the time smoothing step does affect our average utility; we will see later that it has a dramatic effect on the optimal flow policy.

### 2.2. Average Power

where $\phi : \mathbf{R}_+ \times \mathbf{R}_{++} \to \mathbf{R}_+$ is increasing and strictly convex in $F_t$ for each value of $g_t$ ($\mathbf{R}_{++}$ is the set of positive numbers).

where, again, we are assuming that the expectations and limit exist.

### 2.3. Flow Rate Control Problem

where $\lambda > 0$ is used to trade off average utility and power.

where $\psi : \mathbf{R}_+^n \times \mathbf{R}_{++} \to \mathbf{R}_+^n$. In other words, the policy depends only on the current smoothed flows and the current channel gain value.

The flow rate control problem is to choose the flow rate policy $\psi$ to maximize the overall objective in (9). This is a standard convex stochastic control problem, with linear dynamics.

### 2.4. Our Results

We let $J^\star$ be the optimal overall objective value and let $\psi^\star$ be an optimal policy. We will show that in the general (multiple-flow) case, the optimal policy includes a "no-transmit" zone, that is, a region in the $(s_t, g_t)$ space in which the optimal flow rate is zero. Not surprisingly, the optimal flow policy can be roughly described as waiting until the channel gain is large, or until the smoothed flow has fallen to a low level, at which point we transmit (i.e., choose nonzero $f_t$). Roughly speaking, the higher the level of smoothing, the longer we can afford to wait for a large channel gain before transmitting. The average power required to support a given utility level decreases, sometimes dramatically, as the level of smoothing increases.

We show that the optimal policy for the case of a single flow is readily computed numerically, working from Bellman's characterization of the optimal policy, and is not particularly sensitive to the details of the utility functions, smoothing levels, or power functions.

For the case of multiple flows, we cannot easily compute (or even represent) the optimal policy. For this case we propose an approximate policy, based on approximate dynamic programming [18, 19]. We compute an upper bound on $J^\star$ by allowing the flow control policy to use future values of channel gain (i.e., relaxing the causality requirement [20]); comparing against this bound, we show in numerical experiments that such policies are nearly optimal.

## 3. Optimal Policy Characterization

### 3.1. No Smoothing

which does not depend on $s_t$. A simple and effective approach is to presolve this problem for a suitably large set of values of the channel gain $g_t$ and store the resulting tables of individual flow rates $(f_t)_j$ versus $g_t$; online we can interpolate between points in the table to find the (nearly) optimal policy. Another option is to fit a simple function to the optimal flow rate data and use this function as our (nearly) optimal policy.

(Each of these can be expressed in terms of conjugate functions; see, e.g., [21].) We then adjust the dual variable $\nu$ (say, using bisection) so that $\mathbf{1}^T f = F$. An alternative is to carry out bisection on $\nu$, defining $f_j$ in terms of $\nu$ as above, until $\lambda \phi'(\mathbf{1}^T f, g) = \nu$, where $\phi'$ refers to the derivative with respect to $F$.

where the flow values come from the equation above. (The left-hand side is decreasing in $\nu$, while the right-hand side is increasing.)
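The presolve-and-interpolate approach for the unsmoothed case can be sketched for a single flow as follows. This is only an illustration under stated assumptions: the utility is taken to be the power law $U(f) = \beta f^{\alpha}$ with $0 < \alpha < 1$, and the power function is assumed to be $\phi(F, g) = (e^{F} - 1)/g$, which is a hypothetical stand-in that is increasing and strictly convex in $F$. The function names and parameter values are likewise illustrative.

```python
import bisect
import math

def no_smoothing_rate(g, lam=1.0, alpha=0.5, beta=1.0, iters=60):
    """Maximize beta*f**alpha - lam*(exp(f) - 1)/g over f >= 0 by bisection
    on the (decreasing) derivative of the objective."""
    def slope(f):
        return alpha * beta * f ** (alpha - 1.0) - lam * math.exp(f) / g

    lo, hi = 1e-12, 1.0
    while slope(hi) > 0:          # grow the bracket until the slope is negative
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def interp_rate(g, gs, fs):
    """Linear interpolation in a presolved table (gs sorted ascending)."""
    i = bisect.bisect_left(gs, g)
    if i == 0:
        return fs[0]
    if i == len(gs):
        return fs[-1]
    w = (g - gs[i - 1]) / (gs[i] - gs[i - 1])
    return (1.0 - w) * fs[i - 1] + w * fs[i]

# Presolve on a small grid of channel gains, then look up online.
table = [(g, no_smoothing_rate(g)) for g in (0.25, 0.5, 1.0, 2.0, 4.0)]
```

A finer grid of gains would be used in practice; five points are shown only to keep the sketch short.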

### 3.2. General Case

where the expectation is over $g_t$. The fixed point equation and Bellman operator are invariant under adding a constant; that is, we have $\mathcal{T}(V + a) = \mathcal{T}V + a$, for any constant (function) $a$, and, similarly, $V$ satisfies the fixed point equation if and only if $V + a$ does. So without loss of generality we can assume that $V(0) = 0$.

- (1) $\tilde V^{(k)} := \mathcal{T} V^{(k)}$ (apply Bellman operator).

- (2) $J^{(k)} := \tilde V^{(k)}(0)$ (estimate optimal value).

- (3) $V^{(k+1)} := \tilde V^{(k)} - J^{(k)}$ (normalize).
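The three steps above can be sketched for a single flow on a grid. This is a rough illustration, not the authors' implementation. It assumes a Bellman operator of the form $(\mathcal{T}V)(s) = U(s) + \mathbf{E}\,\max_{f \ge 0}\,[V(\theta s + (1-\theta) f) - \lambda \phi(f, g)]$, which matches the structure described in the text (partial maximization, expectation over the gain, addition of $U$). The square-root utility, the power function $(e^{f} - 1)/g$, the exponential gain with mean one, the grids, and all parameter values are assumptions made for the example.

```python
import math

THETA, LAM = 0.9, 1.0            # hypothetical smoothing parameter and trade-off weight
S_MAX, N = 3.0, 31               # one grid on [0, S_MAX], used for both s and f
GRID = [S_MAX * i / (N - 1) for i in range(N)]
K = 6                            # number of equally likely gain values
# Quantile midpoints of an exponential gain with mean 1 (assumed distribution)
GAINS = [-math.log(1.0 - (k + 0.5) / K) for k in range(K)]

def utility(s):                  # assumed power-law utility (alpha = 1/2, beta = 1)
    return math.sqrt(s)

def power(f, g):                 # assumed power function, not taken from the source
    return (math.exp(f) - 1.0) / g

def interp(V, x):
    """Piecewise-linear interpolation of grid values V at x, clamped to the grid."""
    pos = min(max(x, 0.0), S_MAX) / S_MAX * (N - 1)
    i = min(int(pos), N - 2)
    w = pos - i
    return (1.0 - w) * V[i] + w * V[i + 1]

def bellman(V):
    """Apply the assumed Bellman operator on the grid."""
    out = []
    for s in GRID:
        total = 0.0
        for g in GAINS:
            total += max(interp(V, THETA * s + (1.0 - THETA) * f) - LAM * power(f, g)
                         for f in GRID)
        out.append(utility(s) + total / K)
    return out

def value_iteration(iters=100):
    V = [0.0] * N
    J = 0.0
    for _ in range(iters):
        TV = bellman(V)              # (1) apply Bellman operator
        J = TV[0]                    # (2) estimate optimal value
        V = [v - J for v in TV]      # (3) normalize so that V(0) = 0
    return V, J

V, J = value_iteration()
```

The grid maximization over $f$ is crude; a finer grid or a one-dimensional search would give a more accurate maximizer at higher cost.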

For technical conditions under which the value function exists and can be obtained via value iteration, see, for example, [27, 28, 29]. We will simply assume here that the value function exists, and $J^{(k)}$ and $V^{(k)}$ converge to $J^\star$ and $V$, respectively.

The iterations above preserve several attributes of the iterates, which we can then conclude hold for $V$. First of all, concavity of $V^{(k)}$ is preserved; that is, if $V^{(k)}$ is concave, so is $V^{(k+1)}$. It is clear that normalization does not affect concavity, since we simply add a constant to the function. The Bellman operator $\mathcal{T}$ preserves concavity since partial maximization of a function concave in two sets of variables results in a concave function (see, e.g., [21]) and expectation over a family of concave functions yields a concave function; finally, addition (of $U$) preserves concavity. So we can conclude that $V$ is concave.

Another attribute that is preserved in value iteration is monotonicity; if $V^{(k)}$ is monotone increasing (in each component of its argument), then so is $V^{(k+1)}$. We conclude that $V$ is monotone increasing.

### 3.3. No-Transmit Region

as the necessary and sufficient condition under which $\psi^\star(s, g) = 0$. Since $\nabla V$ is decreasing (by concavity of $V$), we can interpret (24) roughly as follows: do not transmit if the channel is bad ($g$ small) *or* if the smoothed flows are large ($s$ large).
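A condition of this kind can be sketched from first-order optimality. The derivation below rests on assumptions that the surviving text does not confirm in detail: that $V$ is differentiable, that the optimal flow maximizes $V(\Theta s + (I - \Theta) f) - \lambda \phi(\mathbf{1}^T f, g)$ over $f \succeq 0$, and that $\phi'$ denotes the derivative of the power function with respect to total flow. Because this objective is concave in $f$, the point $f = 0$ is optimal exactly when no coordinate direction is an ascent direction there:

```latex
\left.\frac{\partial}{\partial f_j}
  \Bigl[ V\bigl(\Theta s + (I - \Theta) f\bigr) - \lambda\,\phi(\mathbf{1}^T f, g) \Bigr]
\right|_{f = 0}
= (1 - \theta_j)\,\frac{\partial V}{\partial s_j}(\Theta s) - \lambda\,\phi'(0, g) \;\le\; 0,
\qquad j = 1, \ldots, n.
```

Under these assumptions the first term shrinks as $s$ grows (by concavity of $V$), so large smoothed flows push the condition toward holding, which agrees with the interpretation given above.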

## 4. Single-Flow Case

### 4.1. Optimal Policy

In the case of a single flow (i.e., $n = 1$) we can easily carry out value iteration numerically, by discretizing the argument $s$ and values of $g$ and computing the expectation and maximization numerically. For the single-flow case, we can then compute the optimal policy and optimal performance (up to small numerical integration errors).

### 4.2. Power Law Suboptimal Policy

where $\hat V$ is an approximation of the value function.

where Open image in new window are the discretized values of Open image in new window , with associated value function values Open image in new window . We do this by bisection on Open image in new window .

Experiments show that these power law approximate functions are, in general, reasonable approximations for the value function. For our power law utilities, these approximations yield very good matches to the true value function. For other concave utilities, the approximation is not as accurate, but experiments show that the associated approximate policies still yield nearly optimal performance.

and $W$ is the Lambert function; that is, $W(a)$ is the solution of $w e^{w} = a$ [30].
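For readers who want to evaluate the Lambert function without a special-function library, the following is a minimal sketch using Newton's method on $w e^{w} = z$. It handles only nonnegative arguments (the principal branch), and the function name, tolerance, and starting guess are illustrative choices rather than anything specified in the paper.

```python
import math

def lambert_w(z, tol=1e-12, max_iter=50):
    """Principal branch W(z) for z >= 0: the solution w of w * exp(w) = z."""
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)                 # starting guess; exact at z = 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))   # Newton step for w*exp(w) - z
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w
```

The starting guess $\log(1 + z)$ lies at or to the right of the root, and $w e^{w}$ is convex and increasing for $w \ge 0$, so the Newton iterates decrease monotonically toward the solution.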

Note that this suboptimal policy is not needed in the single-flow case since we can obtain the optimal policy numerically. However, we found that the difference between our power law policy and the optimal policy (see the example of value functions below) is small enough that in practice they are virtually the same. This approximate policy is needed in the case of multiple flows.

### 4.3. Numerical Example

In this section we give simple numerical examples to illustrate the effect of smoothing on the resulting flow rate policy in the single-flow case. We consider two examples, with different levels of smoothing. The first flow is lightly smoothed ( Open image in new window ; Open image in new window ), while the second flow is heavily smoothed ( Open image in new window ; Open image in new window ). We use utility function Open image in new window , that is, Open image in new window , Open image in new window in our utility (4). The channel gains Open image in new window are IID exponential variables with mean Open image in new window . We use the power function (7), with Open image in new window .

Average Power versus Average Utility.

Comparing Average Power.

Utility Curvature.

Average power required for a target average utility, for the lightly smoothed flow and the heavily smoothed flow.

| Lightly smoothed flow | Heavily smoothed flow |
| --- | --- |
| 0.032 | 0.013 |
| 0.59 | 0.39 |
| 0.93 | 0.70 |
| 1.15 | 0.97 |
| 1.22 | 1.08 |

## 5. A Suboptimal Policy for the Multiple-Flow Case

### 5.1. Approximate Dynamic Programming (ADP) Policy

A policy obtained by replacing $V$ with an approximation is called an approximate dynamic programming (ADP) policy [18, 19, 31]. (Note that by this definition (25) is an ADP policy for $n = 1$.)

This approximate value function is separable, that is, a sum of functions of the individual flows, whereas the exact value function is (in general) not. The approximate policy, however, is not separable; the optimization problem solved to assign flow rates couples the different flow rates.

In the literature on approximate dynamic programming, the functions $\hat V_j$ would be considered basis functions [32, 33, 34]; however, we fix the coefficients of the basis functions as one. (We have found that very little improvement in the policy is obtained by optimizing over the coefficients.)

with optimization variables Open image in new window , Open image in new window . This is a convex optimization problem; its special structure allows it to be solved extremely efficiently, via waterfilling.

### 5.2. Solution via Waterfilling

Since our surrogate value function is only approximate, there is no reason to solve this to great accuracy; experiments show that around 5–10 bisection iterations are more than enough.

Each iteration of the waterfilling algorithm has a cost that is $O(n)$, which means that we can solve (31) very fast. An interior point method that exploits the structure would also yield a very efficient method; see, for example, [35].
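A waterfilling step of this kind can be sketched as a bisection on a single dual variable. The code below is an illustration under explicit assumptions, not the authors' algorithm: the separable surrogate value function is taken to be $\hat V_j(x) = \beta_j x^{\alpha_j}$ with $0 < \alpha_j < 1$, the power function is assumed to be $\phi(F, g) = (e^{F} - 1)/g$, and the dual variable `nu` prices the coupling constraint $F = \mathbf{1}^T f$. All names and parameter values are hypothetical.

```python
import math

def waterfill(s, g, theta, alpha, beta, lam=1.0, iters=40):
    """Maximize sum_j beta_j*(theta_j*s_j + (1-theta_j)*f_j)**alpha_j
    - lam*(exp(sum f) - 1)/g over f >= 0, by bisection on the dual variable nu."""
    def flows(nu):
        f = []
        for sj, th, a, b in zip(s, theta, alpha, beta):
            # smoothed level at which the marginal surrogate value equals nu
            x = (nu / ((1.0 - th) * a * b)) ** (1.0 / (a - 1.0))
            f.append(max(0.0, (x - th * sj) / (1.0 - th)))
        return f

    def total(nu):
        # total flow F at which the marginal power cost lam*exp(F)/g equals nu
        return math.log(nu * g / lam)

    lo = lam / g                       # marginal power cost at zero total flow
    if sum(flows(lo)) <= 0.0:
        return [0.0] * len(s)          # no-transmit case
    hi = 2.0 * lo
    while sum(flows(hi)) > total(hi):  # demand falls and supply rises in nu
        hi *= 2.0
    for _ in range(iters):
        nu = 0.5 * (lo + hi)
        if sum(flows(nu)) > total(nu):
            lo = nu
        else:
            hi = nu
    return flows(0.5 * (lo + hi))

# Example: two flows starting from empty smoothed state on an average channel.
f = waterfill([0.0, 0.0], 1.0, [0.5, 0.9], [0.5, 0.5], [1.0, 1.0])
```

Each bisection step evaluates one closed-form expression per flow, so its cost grows linearly with the number of flows.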

### 5.3. Upper-Bound Policies

In this section we describe two heuristic data flow rate policies: a steady-state flow policy and a prescient flow policy. We show that both policies result in upper bounds on $J^\star$ (the optimal objective value). These upper bounds give us a way to measure the performance of our suboptimal flow policy $\psi^{\mathrm{adp}}$: if we obtain a $J^{\mathrm{adp}}$ from $\psi^{\mathrm{adp}}$ that is close to an upper bound, then we know that our suboptimal flow policy is nearly optimal.

#### 5.3.1. Steady-State Policy

*steady-state* flow rate vector (independent of time) obtained by solving the optimization problem

with optimization variable Open image in new window , and Open image in new window being known. Let Open image in new window be our steady-state upper bound on Open image in new window obtained using the policy (35) to solve (9). Note that in the above optimization problem, we ignore time (and hence, smoothing) and variations in channel gains, and so, for each Open image in new window , Open image in new window is the optimal (steady-state) flow vector. (This is sometimes called the certainty equivalent problem associated with the stochastic programming problem [36, 37].)

By Jensen's inequality (and convexity of the max) it is easy to see that Open image in new window is an upper bound on Open image in new window . Note that once Open image in new window is determined, we can evaluate (35) using the waterfilling algorithm described earlier.

#### 5.3.2. Prescient Policy

where the optimization variables are the flow rates Open image in new window , Open image in new window , Open image in new window and smoothed flow rates Open image in new window , Open image in new window , Open image in new window . (The problem data are Open image in new window and Open image in new window , Open image in new window , Open image in new window .) The optimal value of (37) is a random variable parameterized by Open image in new window . Let Open image in new window denote our prescient upper bound on Open image in new window . We obtain Open image in new window by using Monte Carlo simulation: we take Open image in new window large and solve (37) for independent realizations of the channel gains. The mean is our prescient upper bound.

### 5.4. Numerical Example

In this section we compare the performance of our ADP policy to the above prescient policy using a numerical example.

(Note that this is easily extended to a problem with more than two flows.)

Let $J^{\mathrm{adp}}$ denote the objective obtained using our ADP policy. Each value of $\lambda$ yields an ADP controller, a point $(\bar P^{\mathrm{adp}}, \bar U^{\mathrm{adp}})$ in the $(\bar P, \bar U)$ plane. Using the same $\lambda$, we can compute the corresponding prescient bound giving the point $(\bar P^{\mathrm{pre}}, \bar U^{\mathrm{pre}})$. (Every feasible controller must lie on or below the line, with slope $\lambda$, that passes through $(\bar P^{\mathrm{pre}}, \bar U^{\mathrm{pre}})$.)

We carried out Monte Carlo simulation (100 realizations, each with Open image in new window time steps) for several values of Open image in new window , computing Open image in new window as described in Section 5.2 and our prescient upper bound as described above.
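A Monte Carlo evaluation of a given flow policy can be sketched as follows. This is a simplified illustration with several assumptions that are not taken from the paper: square-root utilities for every flow, the power function $(e^{F} - 1)/g$, IID exponential gains with mean one, smoothed flows starting at zero, and a toy threshold policy used only to exercise the simulator. The function names, run lengths, and seed are arbitrary.

```python
import math
import random

def simulate(policy, theta, lam=1.0, n_steps=2000, n_runs=20, seed=0):
    """Estimate average utility minus lam times average power for a policy
    f = policy(s, g), averaged over independent channel realizations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        s = [0.0] * len(theta)
        acc = 0.0
        for _ in range(n_steps):
            g = max(rng.expovariate(1.0), 1e-12)   # assumed gain model, kept positive
            f = policy(s, g)
            # first-order smoothing of each flow
            s = [th * sj + (1.0 - th) * fj for th, sj, fj in zip(theta, s, f)]
            utility = sum(math.sqrt(sj) for sj in s)          # assumed utility
            power = (math.exp(sum(f)) - 1.0) / g              # assumed power function
            acc += utility - lam * power
        total += acc / n_steps
    return total / n_runs

def threshold_policy(s, g):
    """Toy policy: send a fixed rate on every flow only when the channel is good."""
    return [0.5 if g > 1.0 else 0.0 for _ in s]

estimate = simulate(threshold_policy, theta=[0.5, 0.9])
```

Replacing `threshold_policy` with an ADP policy and sweeping `lam` would trace out estimated points in the power-utility plane for comparison against a bound.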

## 6. Conclusion

In this paper we pose a variation on a multiperiod stochastic network utility maximization problem as a constrained convex stochastic control problem. We show that judging flow utilities dynamically, that is, with a utility function and a smoothing time scale, is a good way to account for network applications with heterogeneous rate demands.

For the case of a single flow, our numerically computed value functions yield flow policies that optimally trade off average utility and average power. We show that simple power law functions are reasonable approximations of the optimal value functions and that these simple functions achieve near-optimal performance.

For the case of multiple flows on a single link (where the value function is not practically computable using dynamic programming), we approximate the value function with a combination of the simple one-dimensional power law functions. Simulations, and comparison with upper bounds on the optimal value, show that the resulting ADP policy can obtain very good performance.

## Notes

### Acknowledgments

This material is based upon work supported by AFOSR Grant FA9550-09-0130 and by Army contract W911NF-07-1-0029. The authors thank Yang Wang and Dan O'Neill for helpful discussions.

## References

1. Hayes J: Adaptive feedback communications. *IEEE Transactions on Communication Technology* 1968, 16(1):29-34. 10.1109/TCOM.1968.1089811
2. Cavers J: Variable-rate transmission for Rayleigh fading channels. *IEEE Transactions on Communications* 1972, 20(1):15-22. 10.1109/TCOM.1972.1091106
3. Hentinen VO: Error performance for adaptive transmission on fading channels. *IEEE Transactions on Communications* 1974, 22(9):1331-1337. 10.1109/TCOM.1974.1092383
4. Webb WT, Steele R: Variable rate QAM for mobile radio. *IEEE Transactions on Communications* 1995, 43(7):2223-2230. 10.1109/26.392965
5. Soon-Ghee C, Goldsmith AJ: Variable-rate variable-power MQAM for fading channels. *IEEE Transactions on Communications* 1997, 45(10):1218-1230. 10.1109/26.634685
6. Kelly FP, Maulloo AK, Tan D: Rate control for communication networks: shadow prices, proportional fairness and stability. *Journal of the Operational Research Society* 1997, 49(3):237-252.
7. Low SH, Lapsley DE: Optimization flow control—I: basic algorithm and convergence. *IEEE/ACM Transactions on Networking* 1999, 7(6):861-874. 10.1109/90.811451
8. Chiang M, Low SH, Calderbank AR, Doyle JC: Layering as optimization decomposition: a mathematical theory of network architectures. *Proceedings of the IEEE* 2007, 95(1):255-312.
9. Neely MJ, Modiano E, Li C-P: Fairness and optimal stochastic control for heterogeneous networks. *IEEE/ACM Transactions on Networking* 2008, 16(2):396-409.
10. Chen J, Xu W, He S, Sun Y, Thulasiraman P, Shen X: Utility-based asynchronous flow control algorithm for wireless sensor networks. *IEEE Journal on Selected Areas in Communications* 2010, 28(7):1116-1126.
11. O'Neill D, Akuiyibo E, Boyd S, Goldsmith AJ: Optimizing adaptive modulation in wireless networks via multi-period network utility maximization. *Proceedings of the IEEE International Conference on Communications, 2010*
12. Trichakis N, Zymnis A, Boyd S: Dynamic network utility maximization with delivery contracts. *Proceedings of the IFAC World Congress, 2008* 2907-2912.
13. Bertsekas D: *Dynamic Programming and Optimal Control: Volume 1*. Athena Scientific; 2005.
14. Bertsekas D: *Dynamic Programming and Optimal Control: Volume 2*. Athena Scientific; 2007.
15. Åström K: *Introduction to Stochastic Control Theory*. Dover, New York, NY, USA; 1970.
16. Whittle P: *Optimization Over Time: Dynamic Programming and Stochastic Control*. John Wiley & Sons, New York, NY, USA; 1982.
17. Bertsekas D, Shreve S: *Stochastic Optimal Control: The Discrete-Time Case*. Athena Scientific; 1996.
18. Bertsekas D, Tsitsiklis J: *Neuro-Dynamic Programming*. Athena Scientific; 1996.
19. Powell W: *Approximate Dynamic Programming: Solving the Curses of Dimensionality*. John Wiley & Sons, New York, NY, USA; 2007.
20. Brown DB, Smith JE, Sun P: Information relaxations and duality in stochastic dynamic programs. *Operations Research* 2010, 58(4):785-801. 10.1287/opre.1090.0796
21. Boyd S, Vandenberghe L: *Convex Optimization*. Cambridge University Press, Cambridge, UK; 2004.
22. Puterman M: *Markov Decision Processes: Discrete Stochastic Dynamic Programming*. John Wiley & Sons, New York, NY, USA; 1994.
23. Ross S: *Introduction to Stochastic Dynamic Programming: Probability and Mathematical*. Academic Press; 1983.
24. Denardo E: *Dynamic Programming: Models and Applications*. Prentice-Hall, New York, NY, USA; 1982.
25. Wang Y, Boyd S: Performance bounds for linear stochastic control. *Systems and Control Letters* 2009, 58(3):178-182. 10.1016/j.sysconle.2008.10.004
26. Bellman R: *Dynamic Programming*. Courier Dover, New York, NY, USA; 1957.
27. Derman C: *Finite State Markovian Decision Processes*. Academic Press; 1970.
28. Blackwell D: Discrete dynamic programming. *The Annals of Mathematical Statistics* 1962, 33:719-726. 10.1214/aoms/1177704593
29. Arapostathis A, Borkar V, Fernández-Gaucherand E, Ghosh MK, Marcus SI: Discrete-time controlled Markov processes with average cost criterion: a survey. *SIAM Journal on Control and Optimization* 1993, 31(2):282-344. 10.1137/0331018
30. Corless RM, Gonnet GH, Hare DEG, Jeffrey DJ, Knuth DE: On the Lambert W function. *Advances in Computational Mathematics* 1996, 5(4):329-359.
31. Manne A: Linear programming and sequential decisions. *Management Science* 1960, 6(3):259-267. 10.1287/mnsc.6.3.259
32. Schweitzer PJ, Seidmann A: Generalized polynomial approximations in Markovian decision processes. *Journal of Mathematical Analysis and Applications* 1985, 110(2):568-582. 10.1016/0022-247X(85)90317-8
33. Trick MA, Zin SE: Spline approximations to value functions: linear programming approach. *Macroeconomic Dynamics* 1997, 1(1):255-277.
34. De Farias DP, Van Roy B: The linear programming approach to approximate dynamic programming. *Operations Research* 2003, 51(6):850-865. 10.1287/opre.51.6.850.24925
35. Madan R, Boyd SP, Lall S: Fast algorithms for resource allocation in wireless cellular networks. *IEEE/ACM Transactions on Networking* 2010, 18(3):973-984.
36. Birge J, Louveaux F: *Introduction to Stochastic Programming*. Springer, New York, NY, USA; 1997.
37. Prekopa A: *Stochastic Programming*. Kluwer Academic Publishers, New York, NY, USA; 1995.

## Copyright information

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.