Encyclopedia of Systems and Control

Living Edition
| Editors: John Baillieul, Tariq Samad

System Identification Techniques: Convexification, Regularization, and Relaxation

Living reference work entry
DOI: https://doi.org/10.1007/978-1-4471-5102-9_101-1


Abstract

System identification has, by and large, been developed following the classical parametric approach. In this entry we discuss how regularization theory can be employed to tackle the system identification problem from a nonparametric (or semi-parametric) point of view. Both regularization for smoothness and regularization for sparseness are discussed, as flexible means of facing the bias/variance dilemma and performing model selection. These techniques also have advantages from a computational point of view, sometimes leading to convex optimization problems.
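As a hypothetical illustration of the two regularization styles the abstract contrasts (not taken from the entry itself), the sketch below estimates the impulse response of a linear FIR system from input/output data, once with an ℓ2 penalty (ridge, favoring smooth/small coefficients) and once with an ℓ1 penalty (LASSO, favoring sparsity). Both are convex problems; the LASSO is solved here with a plain ISTA loop. All names, orders, and regularization weights are assumptions chosen for the demo.

```python
# Hypothetical demo: l2 vs l1 regularization for FIR system identification.
# Model: y(t) = sum_{k=0}^{p-1} g[k] u(t-k) + noise, with a sparse true g.
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 30                           # number of samples, FIR order
g_true = np.zeros(p)
g_true[[2, 5, 11]] = [1.0, -0.6, 0.3]    # sparse true impulse response
u = rng.standard_normal(n + p)           # input signal
# Regressor matrix: column k holds u(t-k) for t = p, ..., n+p-1
Phi = np.column_stack([u[p - k : n + p - k] for k in range(p)])
y = Phi @ g_true + 0.05 * rng.standard_normal(n)

# Ridge ("smoothness"-type penalty): convex with a closed-form solution of
#   min_g ||y - Phi g||^2 + lam * ||g||^2
lam_ridge = 1.0
g_ridge = np.linalg.solve(Phi.T @ Phi + lam_ridge * np.eye(p), Phi.T @ y)

# LASSO ("sparsity"-type penalty): convex but nonsmooth,
#   min_g 0.5 * ||y - Phi g||^2 + lam * ||g||_1
# solved with ISTA (gradient step + soft-thresholding).
def ista(Phi, y, lam, iters=2000):
    L = np.linalg.norm(Phi, 2) ** 2      # Lipschitz constant of the gradient
    g = np.zeros(Phi.shape[1])
    for _ in range(iters):
        z = g - (Phi.T @ (Phi @ g - y)) / L
        g = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return g

g_lasso = ista(Phi, y, lam=5.0)

print("nonzero coefficients (ridge):", int(np.sum(np.abs(g_ridge) > 1e-3)))
print("nonzero coefficients (lasso):", int(np.sum(np.abs(g_lasso) > 1e-3)))
```

The ridge estimate shrinks all coefficients but leaves essentially every entry nonzero, while the ℓ1 penalty drives most entries exactly to zero, performing model-order/structure selection as a by-product of a convex program.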


Keywords: Nonparametric methods · Sparsity · Kernel methods · Sparse Bayesian learning · Optimization



Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  1. Department of Information Engineering, University of Padova, Padova, Italy