Learning and Optimization with Perturbation Analysis

  • Xi-Ren Cao


As shown in Chapter 2, performance derivatives for Markov systems depend heavily on performance potentials. In this chapter, we first discuss the numerical methods and sample-path-based algorithms for estimating performance potentials, and we then derive the sample-path-based algorithms for estimating performance derivatives. In performance optimization, the process of estimating the potentials and performance derivatives from a sample path is called learning.
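The two estimation tasks named above can be illustrated with a minimal sketch. It assumes the standard definitions, which this page does not spell out: for an ergodic chain with transition matrix P, reward vector f, stationary distribution π, and average reward η = πf, the potentials g solve the Poisson equation (I - P)g + ηe = f (unique up to an additive constant), and the derivative of η along a direction ΔP is πΔPg. The two-state chain, the truncation length L, and all function names are hypothetical choices made for illustration, not the chapter's own algorithms.

```python
import random

def mat_vec(P, v):
    return [sum(p * x for p, x in zip(row, v)) for row in P]

def stationary(P, iters=200):
    """Stationary distribution by power iteration (assumes an ergodic chain)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def potentials(P, f, terms=200):
    """Numerical potentials as the truncated sum g = sum_l (P^l f - eta e)."""
    pi = stationary(P)
    eta = sum(p * x for p, x in zip(pi, f))
    g, v = [0.0] * len(f), list(f)          # v holds P^l f
    for _ in range(terms):
        g = [gi + vi - eta for gi, vi in zip(g, v)]
        v = mat_vec(P, v)
    return g, eta, pi

def derivative(P, P_new, f):
    """Directional derivative of eta along dP = P_new - P, i.e. pi dP g."""
    g, _, pi = potentials(P, f)
    dP = [[b - a for a, b in zip(ra, rb)] for ra, rb in zip(P, P_new)]
    return sum(p * x for p, x in zip(pi, mat_vec(dP, g)))

def simulate(P, n_steps, x0=0, seed=0):
    """Generate one sample path of the chain."""
    rng = random.Random(seed)
    states = list(range(len(P)))
    path, x = [], x0
    for _ in range(n_steps):
        path.append(x)
        x = rng.choices(states, weights=P[x])[0]
    return path

def estimate_potentials(path, f, n_states, L=10):
    """Sample-path estimate: average, over visits to state i, of the
    L-step sum of (f(X) - eta_hat) that starts at the visit."""
    eta_hat = sum(f[x] for x in path) / len(path)
    prefix = [0.0]                          # prefix[k] = sum over m < k
    for x in path:
        prefix.append(prefix[-1] + f[x] - eta_hat)
    totals, visits = [0.0] * n_states, [0] * n_states
    for n in range(len(path) - L):
        totals[path[n]] += prefix[n + L] - prefix[n]
        visits[path[n]] += 1
    return [t / max(v, 1) for t, v in zip(totals, visits)], eta_hat

# Hypothetical two-state example.
P = [[0.7, 0.3], [0.4, 0.6]]
P_new = [[0.6, 0.4], [0.4, 0.6]]
f = [1.0, 0.0]
```

Calling `derivative(P, P_new, f)` gives the numerical derivative, and `estimate_potentials(simulate(P, 200000), f, 2)` gives the sample-path estimate of g. Since potentials are determined only up to a constant, differences such as g(0) - g(1) are the meaningful quantities to compare between the two methods.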


Keywords: Markov Chain, Service Time, Sample Path, Perturbation Analysis, Stochastic Approximation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. Hong Kong University of Science and Technology, Kowloon, Hong Kong
