Encyclopedia of Systems and Control

Living Edition
Editors: John Baillieul, Tariq Samad

System Identification Techniques: Convexification, Regularization, Relaxation

  • Alessandro Chiuso
Living reference work entry

DOI: https://doi.org/10.1007/978-1-4471-5102-9_101-3


System identification has developed, by and large, along the classical parametric approach. This entry discusses how regularization theory can be employed to tackle the system identification problem from a nonparametric (or semi-parametric) point of view. Regularization for smoothness and regularization for sparseness are both discussed as flexible means of addressing the bias/variance dilemma and performing model selection. These techniques also have computational advantages, in some cases leading to convex optimization problems.
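
As a rough illustration of how a quadratic penalty and a sparsity-promoting penalty differ, the following minimal sketch fits a toy linear regression with each of them. It is not taken from the entry: the regressor matrix `Phi`, the data `y`, the weight `lam`, and the function names are hypothetical, and a plain ridge penalty is used here as the simplest stand-in for the quadratic (smoothness-type) regularizers. Both problems are convex and are solved by cyclic coordinate descent.

```python
def soft_threshold(x, t):
    """Proximal operator of t*|x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def coordinate_descent(Phi, y, lam, penalty, n_sweeps=50):
    """Minimize 0.5*||y - Phi g||^2 + lam*pen(g) by cyclic coordinate descent.

    penalty == "l2": pen(g) = 0.5 * sum(g_j^2)   (ridge, quadratic)
    penalty == "l1": pen(g) = sum(|g_j|)         (LASSO-type, sparsity-promoting)
    Assumes every column of Phi is nonzero.
    """
    n, p = len(Phi), len(Phi[0])
    g = [0.0] * p
    r = list(y)  # running residual y - Phi g (g starts at zero)
    z = [sum(Phi[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the residual that excludes g_j.
            rho = sum(Phi[i][j] * r[i] for i in range(n)) + z[j] * g[j]
            if penalty == "l1":
                new = soft_threshold(rho, lam) / z[j]
            else:
                new = rho / (z[j] + lam)
            delta = new - g[j]
            for i in range(n):
                r[i] -= Phi[i][j] * delta
            g[j] = new
    return g


# Hypothetical toy data: three mutually orthogonal regressors, one dominant effect.
Phi = [[1, 1, 1],
       [1, -1, 1],
       [1, 1, -1],
       [1, -1, -1]]
y = [2.3, 2.0, 1.9, 1.8]
lam = 1.0

g_ridge = coordinate_descent(Phi, y, lam, "l2")
g_lasso = coordinate_descent(Phi, y, lam, "l1")
print("ridge:", g_ridge)
print("lasso:", g_lasso)
```

On this toy problem the l1 penalty sets the two small coefficients exactly to zero, which amounts to a form of model selection, whereas the quadratic penalty only shrinks all three coefficients toward zero.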


Keywords: Kernel methods; Nonparametric methods; Optimization; Sparse Bayesian learning; Sparsity


Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Department of Information Engineering, University of Padova, Padova, Italy

Section editors and affiliations

  • Lennart Ljung (1)
  1. Division of Automatic Control, Department of Electrical Engineering, Linköping University, Linköping, Sweden