Adaptive Linear Prediction and Process Order Identification

  • T. C. Butash
  • L. D. Davisson
Part of the International Centre for Mechanical Sciences book series (CISM, volume 324)


Adaptive linear predictors are employed to provide solutions to problems ranging from adaptive source coding to autoregressive (AR) spectral estimation. In such applications, an adaptive linear predictor is realized by a linear combination of a finite number, M, of the observations immediately preceding each sample to be predicted, where the coefficients defining the predictor are “adapted to”, or estimated on the basis of, the preceding N + M observations in an attempt to continually optimize the predictor’s performance. This performance is thus inevitably dictated by the predictor's order, M, and the length of its learning period, N.
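To make the estimation step concrete, here is a minimal sketch of such an order-M least-squares predictor in the dependent case, where the coefficients are fit from the same realization that is then predicted. The function names and the AR(2) test signal are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def fit_predictor(x, M):
    """Least-squares estimate of the order-M linear predictor
    coefficients from the observation sequence x (dependent case:
    the same realization serves for learning and prediction)."""
    N = len(x) - M                       # number of learning equations
    # Row k holds the M observations immediately preceding sample x[k + M],
    # most recent observation first.
    A = np.array([x[k:k + M][::-1] for k in range(N)])
    b = x[M:]                            # the samples to be predicted
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs

def predict_next(x, coeffs):
    """One-step prediction from the last M observations."""
    M = len(coeffs)
    return coeffs @ x[-1:-M - 1:-1]      # last M samples, most recent first

# Illustrative use on a synthetic AR(2) realization.
rng = np.random.default_rng(0)
x = np.zeros(500)
for n in range(2, 500):
    x[n] = 0.6 * x[n - 1] - 0.2 * x[n - 2] + rng.standard_normal()

a_hat = fit_predictor(x, M=2)
print("coefficients:", a_hat, "prediction:", predict_next(x, a_hat))
```

In this setting the learning record and the prediction record coincide, which is precisely the statistical dependence that distinguishes the dependent case analyzed here from the independent case.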

We formulate the adaptive linear predictor’s MSE performance in a series of theorems, with and without the Gaussian assumption, under the hypotheses that its coefficients are derived from either the (single) observation sequence to be predicted (the dependent case) or a second, statistically independent realization (the independent case). The established theory on adaptive linear predictor performance and order selection is reviewed, including the works of Davisson (Gaussian, dependent case) and Akaike (AR, independent case). Results predicated on the independent case hypothesis (e.g., Akaike’s FPE procedure) are shown to incur substantial error under the dependent case conditions prevalent in typical adaptive prediction environments. Similarly, theories based on the Gaussian assumption are found to suffer a loss in accuracy proportional to the deviation of the probability law governing the process from the normal distribution.
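For reference, Akaike's FPE criterion selects the order minimizing an inflated estimate of the residual variance; in one common form, FPE(M) = s²_M (N + M)/(N − M), where s²_M is the residual variance of the order-M least-squares fit on N observations. The sketch below implements this selection rule under a zero-mean assumption; the exact penalty factor varies slightly across the literature, and the helper is illustrative rather than a reproduction of the chapter's analysis.

```python
import numpy as np

def fpe_order(x, M_max):
    """Select an AR order by minimizing Akaike's FPE,
    FPE(M) = s2_M * (N + M) / (N - M), with s2_M the residual
    variance of the order-M least-squares fit (zero-mean data
    assumed; one common form of the criterion)."""
    N = len(x)
    best_M, best_fpe = 0, np.mean(x ** 2)      # order 0: predict zero
    for M in range(1, M_max + 1):
        # Design matrix: each row is the M preceding observations,
        # most recent first (same convention as fit_predictor above).
        A = np.array([x[k:k + M][::-1] for k in range(N - M)])
        b = x[M:]
        coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
        s2 = np.mean((b - A @ coeffs) ** 2)    # residual variance
        fpe = s2 * (N + M) / (N - M)
        if fpe < best_fpe:
            best_M, best_fpe = M, fpe
    return best_M
```

Because the residuals here are computed on the same realization used to fit the coefficients, this is exactly the dependent-case regime in which, as noted above, the independent-case analysis underlying the FPE can incur substantial error.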

We develop a theory on the performance of, and an optimal order selection criterion for, an adaptive linear predictor that is applicable in the dependent case and in environments where the Gaussian assumption is not necessarily justified.






References

[1] L. D. Davisson, “Theory of Adaptive Data Compression,” Ph.D. Dissertation, University of California at Los Angeles, 1964.
[2] J. Makhoul, “Linear Prediction: A Tutorial Review,” Proc. IEEE, vol. 63, no. 4, pp. 561–580, April 1975.
[3] B. Widrow et al., “Adaptive Noise Cancelling: Principles and Applications,” Proc. IEEE, vol. 63, no. 12, pp. 1692–1716, Dec. 1975.
[4] F. W. Symons, “Narrow-Band Interference Rejection Using the Complex Linear Prediction Filter,” IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-26, pp. 94–98, Feb. 1978.
[5] S. M. Kay and S. L. Marple, Jr., “Spectrum Analysis – A Modern Perspective,” Proc. IEEE, vol. 69, no. 11, pp. 1380–1419, Nov. 1981.
[6] L. B. Milstein, “Recent Developments in Interference Suppression Techniques in Spread Spectrum Communications,” in Proc. IEEE 1988 Annual Workshop on Information Theory, pp. 6–16, April 1988.
[7] L. D. Davisson, “The Prediction Error of Stationary Gaussian Time Series of Unknown Covariance,” IEEE Trans. Inform. Theory, vol. IT-11, no. 4, pp. 527–532, Oct. 1965.
[8] L. D. Davisson, “The Adaptive Prediction of Time Series,” in Proc. Nat. Electronics Conf., vol. 22, pp. 557–561, 1966.
[9] H. Akaike, “Fitting Autoregressive Models for Prediction,” Ann. Inst. Statist. Math., vol. 21, no. 2, pp. 243–247, 1969.
[10] H. Akaike, “Statistical Predictor Identification,” Ann. Inst. Statist. Math., vol. 22, no. 2, pp. 203–217, 1970.
[11] H. Akaike, “Information Theory and an Extension of the Maximum Likelihood Principle,” in Proc. 2nd Int. Symp. Information Theory, pp. 267–281, 1972.
[12] H. Akaike, “Use of an Information Theoretic Quantity for Statistical Model Identification,” in Proc. 5th Hawaii Int. Conf. System Sciences, pp. 249–250, 1972.
[13] H. Akaike, “A New Look at the Statistical Model Identification,” IEEE Trans. Automat. Contr., vol. AC-19, no. 6, pp. 716–723, Dec. 1974.
[14] H. Akaike, “A Bayesian Analysis of the Minimum AIC Procedure,” Ann. Inst. Statist. Math., vol. 30, part A, pp. 9–14, 1978.
[15] E. Parzen, “Some Recent Advances in Time Series Modeling,” IEEE Trans. Automat. Contr., vol. AC-19, no. 6, pp. 723–730, Dec. 1974.
[16] E. Parzen, “Multiple Time Series: Determining the Order of Approximating Autoregressive Schemes,” in Multivariate Analysis IV, P. Krishnaiah, Ed., North-Holland, Amsterdam, pp. 283–295, 1977.
[17] R. H. Jones, “Identification and Autoregressive Spectrum Estimation,” IEEE Trans. Automat. Contr., vol. AC-19, no. 6, pp. 894–898, Dec. 1974.
[18] R. H. Jones, “Autoregression Order Selection,” Geophysics, vol. 41, pp. 771–773, Aug. 1976.
[19] G. Schwarz, “Estimating the Dimension of a Model,” Ann. Statist., vol. 6, pp. 461–464, 1978.
[20] J. Rissanen, “Modeling by Shortest Data Description,” Automatica, vol. 14, pp. 465–471, 1978.
[21] J. Rissanen, “A Universal Prior for Integers and Estimation by Minimum Description Length,” Ann. Statist., vol. 11, no. 2, pp. 416–431, June 1983.
[22] J. Rissanen, “Universal Coding, Information, Prediction, and Estimation,” IEEE Trans. Inform. Theory, vol. IT-30, no. 4, pp. 629–636, July 1984.
[23] E. J. Hannan and B. G. Quinn, “The Determination of the Order of an Autoregression,” Jour. Roy. Statist. Soc., Ser. B, vol. 41, no. 2, pp. 190–195, 1979.
[24] E. J. Hannan, “The Estimation of the Order of an ARMA Process,” Ann. Statist., vol. 8, no. 5, pp. 1071–1081, 1980.
[25] W. A. Fuller and D. P. Hasza, “Properties of Predictors for Autoregressive Time Series,” Jour. Am. Statist. Assoc., vol. 76, no. 373, pp. 155–161, Mar. 1981.
[26] R. J. Bhansali and D. Y. Downham, “Some Properties of an Autoregressive Model Selected by a Generalization of Akaike’s FPE Criterion,” Biometrika, vol. 64, no. 3, pp. 547–551, 1977.
[27] R. J. Bhansali, “Effects of Not Knowing the Order of an Autoregressive Process on the Mean Squared Error of Prediction – I,” Jour. Am. Statist. Assoc., vol. 76, no. 375, pp. 588–597, Sept. 1981.
[28] R. R. Bitmead, “Convergence in Distribution of LMS-Type Adaptive Parameter Estimates,” IEEE Trans. Automat. Contr., vol. AC-28, no. 1, Jan. 1983.
[29] R. R. Bitmead, “Convergence Properties of LMS Adaptive Estimators with Unbounded Dependent Inputs,” IEEE Trans. Automat. Contr., vol. AC-29, no. 5, May 1984.
[30] L. Györfi, “Adaptive Linear Procedures Under General Conditions,” IEEE Trans. Inform. Theory, vol. IT-30, no. 2, pp. 262–267, Mar. 1984.
[31] N. Kunitomo and T. Yamamoto, “Properties of Predictors in Misspecified Autoregressive Time Series Models,” Jour. Am. Statist. Assoc., vol. 80, no. 392, pp. 941–950, Dec. 1985.
[32] T. L. Lai and C. Z. Wei, “Extended Least Squares and Their Applications to Adaptive Control and Prediction in Linear Systems,” IEEE Trans. Automat. Contr., vol. AC-31, no. 10, pp. 898–906, Oct. 1986.
[33] T. C. Butash and L. D. Davisson, “An Overview of Adaptive Linear Minimum Mean Square Error Predictor Performance,” in Proc. 25th IEEE Conf. Decision and Control, pp. 1472–1476, Dec. 1986.
[34] C. Z. Wei, “Adaptive Prediction by Least Squares Predictors in Stochastic Regression Models with Applications to Time Series,” Ann. Statist., vol. 15, no. 4, pp. 1667–1682, 1987.
[35] M. Wax, “Order Selection for AR Models by Predictive Least Squares,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, no. 4, pp. 581–588, April 1988.
[36] L. D. Davisson and T. C. Butash, “Adaptive Linear Prediction and Process Order Identification,” in Proc. IEEE 1988 Annual Workshop on Information Theory, pp. 20–32, April 1988.
[37] T. C. Butash and L. D. Davisson, “On the Design and Performance of Adaptive LMMSE Predictors,” in Proc. 1988 IEEE Int. Symp. Information Theory, June 1988.
[38] A. Krieger and E. Masry, “Convergence Analysis of Adaptive Linear Estimation for Dependent Stationary Processes,” IEEE Trans. Inform. Theory, vol. IT-34, no. 4, pp. 642–654, July 1988.
[39] E. J. Hannan, “Rational Transfer Function Approximation,” Statist. Science, vol. 2, no. 2, pp. 135–161, 1987.
[40] J. L. Doob, Stochastic Processes, John Wiley, New York, 1953.
[41] P. H. Diananda, “Some Probability Limit Theorems with Statistical Applications,” Proc. Cambridge Philos. Soc., vol. 49, pp. 239–246, Oct. 1952.
[42] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, Princeton, New Jersey, 1946.
[43] M. Rosenblatt, “A Central Limit Theorem and a Strong Mixing Condition,” Proc. Nat. Acad. Sci., vol. 42, pp. 43–47, 1956.
[44] P. Hall and C. C. Heyde, Martingale Limit Theory and Its Application, Academic Press, New York, 1980.
[45] R. M. Gray and L. D. Davisson, Random Processes: A Mathematical Approach for Engineers, Prentice-Hall, Englewood Cliffs, New Jersey, 1986.
[46] P. Billingsley, Convergence of Probability Measures, John Wiley & Sons, New York, 1968.
[47] M. Iosifescu and R. Theodorescu, Random Processes and Learning, Springer-Verlag, New York, 1969.
[48] A. N. Kolmogorov and Y. A. Rozanov, “On Strong Mixing Conditions for Stationary Gaussian Processes,” Theory Prob. Appl., vol. 5, pp. 204–208, 1960.
[49] Y. A. Rozanov, Stationary Random Processes, Holden-Day, San Francisco, California, 1967.
[50] I. A. Ibragimov and Yu. V. Linnik, Independent and Stationary Sequences of Random Variables, Wolters-Noordhoff, Groningen, Netherlands, 1971.
[51] M. Rosenblatt, Random Processes, Springer-Verlag, New York, 1974.
[52] T. C. Butash, “Adaptive Linear Prediction and Process Order Identification,” Ph.D. Dissertation, University of Maryland at College Park, 1990.

Copyright information

© Springer-Verlag Wien 1991

Authors and Affiliations

  • T. C. Butash, University of Maryland, College Park, USA
  • L. D. Davisson, University of Maryland, College Park, USA
