# Experiment Design and Identification for Control


**DOI:**https://doi.org/10.1007/978-1-4471-5102-9_103-2


## Abstract

The experimental conditions have a major impact on the estimation result. The available degrees of freedom in this respect should therefore be used wisely by the user. This entry provides the fundamentals of the available techniques. We also briefly discuss the particulars of identification for model-based control, one of the main applications of system identification.

## Keywords

Adaptive experiment design · Application-oriented experiment design · Cramér-Rao lower bound · Crest factor · Experiment design · Fisher information matrix · Identification for control · Least-costly identification · Multisine · Pseudorandom binary signal (PRBS) · Robust experiment design

## Introduction

The accuracy of an estimated model is determined by two factors:

- (i) The information content in the data used for estimation
- (ii) The complexity of the model structure

For a model structure with *n* parameters modeling the dynamics, it follows from the invariance result in Rojas et al. (2009) that to obtain a model for which the variance of the frequency function estimate is less than 1∕*γ* over all frequencies, the signal-to-noise ratio, as measured by input energy over noise variance, must be at least *nγ*. With energy being power × time, and as input power is limited in physical systems, this indicates that the experiment time grows at least linearly with the number of model parameters. When the input energy budget is limited, the only way around this problem is to sacrifice accuracy over certain frequency intervals. The methodology to achieve this in a systematic way is known as experiment design.

## Model Quality Measures

Consider an estimator \(\hat{\theta}_N\) of the model parameters (based on *N* input–output samples) and let *θ*_{o} denote the true parameters. The Cramér–Rao inequality states that, for an unbiased estimator, the covariance of the parameter estimate is bounded from below by the inverse of the Fisher information matrix:

$$\displaystyle \begin{aligned} \mathrm{Cov}(\hat{\theta}_N) \succeq I_F^{-1}(\theta_o,N) \end{aligned} $$(1)

For estimators that are unbiased only asymptotically as *N* →*∞*, the inequality (1) typically holds asymptotically as the sample size *N* grows to infinity. The right-hand side in (1) is then replaced by the inverse of the per sample Fisher information \(I_F(\theta_o) := \lim_{N\rightarrow\infty} I_F(\theta_o,N)/N\). An estimator is said to be asymptotically efficient if equality is reached in (1) as *N* →*∞*.

Even though it is possible to reduce the mean-square error by constraining the model flexibility appropriately, it is customary to use consistent estimators since the theory for biased estimators is still not well understood. For such estimators, using some function of the Fisher information as performance measure is natural.

### General-Purpose Quality Measures

Over the years a number of “general-purpose” quality measures have been proposed. Perhaps the most frequently used is the determinant of the inverse Fisher information. This represents the volume of confidence ellipsoids for the parameter estimates and minimizing this measure is known as D-optimal design. Two other criteria relating to confidence ellipsoids are E-optimal design, which uses the length of the longest principal axis (the minimum eigenvalue of *I*_{F}) as quality measure, and A-optimal design, which uses the sum of the squared lengths of the principal axes (the trace of \(I_F^{-1}\)).
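As an illustration, the three criteria can be evaluated from a given Fisher information matrix; the 2 × 2 matrix below is hypothetical and chosen only to make the computation concrete:

```python
import numpy as np

# Hypothetical per sample Fisher information matrix (in practice it
# depends on the experiment design variables).
I_F = np.array([[4.0, 1.0],
                [1.0, 2.0]])

cov = np.linalg.inv(I_F)  # asymptotic parameter covariance

# D-optimality: volume of the confidence ellipsoid ~ det(inv(I_F));
# minimizing this is equivalent to maximizing det(I_F).
d_criterion = np.linalg.det(cov)

# E-optimality: squared length of the longest principal axis,
# i.e., the largest eigenvalue of cov (smallest eigenvalue of I_F).
e_criterion = np.linalg.eigvalsh(cov).max()

# A-optimality: sum of squared axis lengths, i.e., the trace of cov.
a_criterion = np.trace(cov)

print(d_criterion, e_criterion, a_criterion)
```

Note that the three criteria generally rank candidate experiments differently; which one is appropriate depends on how parameter errors propagate to the application.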

### Application-Oriented Quality Measures

When demands are high and/or experimentation resources are limited, it is necessary to tailor the experiment carefully according to the intended use of the model. Below we will discuss a couple of closely related application-oriented measures.

#### Average Performance Degradation

Let *V*_{app}(*θ*) ≥ 0 be a measure of how well the model corresponding to parameter *θ* performs when used in the application. In finance, *V*_{app} can, e.g., represent the ability to predict the stock market. In process industry, *V*_{app} can represent the profit gained using a feedback controller based on the model corresponding to *θ*. Let us assume that *V*_{app} is normalized such that \(\min_\theta V_{\mathrm{app}}(\theta) = V_{\mathrm{app}}(\theta_o) = 0\). That *V*_{app} has its minimum at the parameters of the true system is quite natural. We will call *V*_{app} the application cost. Assuming that the estimator is asymptotically efficient, a second-order Taylor approximation gives that the average application cost can be expressed as (the first-order term vanishes since *θ*_{o} is the minimizer of *V*_{app})

$$\displaystyle \begin{aligned} \mathrm{E}[V_{\mathrm{app}}(\hat{\theta}_N)] \approx \frac{1}{2}\,\mathrm{tr}\left( V^{\prime\prime}_{\mathrm{app}}(\theta_o)\, I_F^{-1}(\theta_o,N)\right) \end{aligned} $$(2)
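A numerical sketch of this approximation, with a hypothetical quadratic application cost (for which the second-order Taylor expansion is exact) and an assumed Fisher information matrix: a Monte Carlo average over the asymptotic distribution \(\hat\theta_N \sim \mathcal{N}(\theta_o, I_F^{-1})\) of an efficient estimator matches the trace formula:

```python
import numpy as np

rng = np.random.default_rng(0)

theta_o = np.array([1.0, -0.5])               # "true" parameters (illustrative)
H = np.array([[2.0, 0.0], [0.0, 0.5]])        # Hessian of V_app at theta_o (assumed)
I_F = np.array([[50.0, 10.0], [10.0, 40.0]])  # Fisher information of the experiment

def V_app(theta):
    """Hypothetical quadratic application cost, minimized at theta_o."""
    d = theta - theta_o
    return 0.5 * d @ H @ d

# Second-order Taylor approximation of the average application cost:
# E[V_app(theta_hat)] ~ 0.5 * tr(H @ inv(I_F))
cov = np.linalg.inv(I_F)
taylor = 0.5 * np.trace(H @ cov)

# Monte Carlo check: an efficient estimator has theta_hat ~ N(theta_o, inv(I_F))
samples = rng.multivariate_normal(theta_o, cov, size=100_000)
mc = np.mean([V_app(th) for th in samples])

print(taylor, mc)
```

The agreement is exact in expectation here because the assumed cost is quadratic; for a general smooth *V*_{app} the formula holds only to second order near *θ*_{o}.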

#### Acceptable Performance

An alternative is to specify a set of models that give acceptable performance in the application. In terms of *V*_{app} above, this would be a level set \(\mathcal{E}_{\mathrm{app}} = \{\theta: V_{\mathrm{app}}(\theta) \leq 1/\gamma\}\) for some accuracy level *γ* > 0. The objective of the experiment design is then to ensure that the resulting estimate ends up in \({{\mathcal {E}_{\mathrm {app}}}}\) with high probability.

## Design Variables

In an identification experiment, there are a number of design variables at the user’s disposal. Below we discuss three of the most important ones.

### Sampling Interval

For the sampling interval, the general advice from an information theoretic point of view is to sample as fast as possible (Ljung, 1999). However, sampling much faster than the time constants of the system may lead to numerical issues when estimating discrete time models as there will be poles close to the unit circle. Downsampling may thus be required.

### Feedback

Whether the experiment is performed in open or closed loop affects the identification result in several ways:

- (i) Not all the power in the input can be used to estimate the system dynamics when a noise model is estimated, as part of the input signal has to be used for the latter task; see Section 8.1 in Forssell and Ljung (1999). When a very flexible noise model is used, the estimate of the system dynamics has to rely almost entirely on external excitation.

- (ii) Feedback can reduce the effect of disturbances and noise at the output. When there are constraints on the outputs, this allows for larger (input) excitation and therefore more informative experiments.
- (iii) The cross-correlation between input and noise/disturbances requires good noise models to avoid biased estimates (Ljung, 1999).

### External Excitation Signals

The design of the external excitation signal is typically separated into two steps:

- (i) First, optimization of the probability density function of the excitation
- (ii) Generation of the actual sequence from the obtained density function through a stochastic simulation procedure

## Experimental Constraints

Constraints on the experiment typically concern one or more of the following:

- (i) *Variability.* For example, too high a level of excitation may cause the end product to go off-spec, resulting in product waste and associated high costs.
- (ii) *Frequency content.* Often, too harsh movements of the inputs may damage equipment.
- (iii) *Amplitudes.* For example, actuators have limited range, restricting input amplitudes.
- (iv) *Waveforms.* In process industry, it is not uncommon that control equipment limits the type of signals that can be applied. In other applications, it may be physically possible to realize only certain types of excitation. See section “Waveform Generation” for further discussion.

## Experiment Design Criteria

Two common ways to formulate the experiment design problem are:

- (i) *Best effort.* Here the best quality, e.g., as given by one of the quality measures in section “Model Quality Measures,” is sought under constraints on the experimental effort and cost. This is the classical problem formulation.
- (ii) *Least-costly.* The cheapest experiment is sought that results in a predefined model quality. Thus, compared to best effort design, the optimization criterion and constraint are interchanged. This type of design was introduced by Bombois and coworkers; see Bombois et al. (2006).

## Computational Issues

To illustrate the computational issues, consider estimation of the *n* impulse response coefficients of a finite impulse response (FIR) model

$$\displaystyle \begin{aligned} y(t)=\sum_{k=1}^{n}\theta_k u(t-k)+e(t) \end{aligned} $$(3)

from an experiment of length *N* with the measured outputs collected in a vector, where the noise *e* is white with covariance matrix *σ*^{2}*I*_{N×N}. Then it holds that

$$\displaystyle \begin{aligned} I_F(\theta_o,N)=\frac{1}{\sigma^2}\,\Phi^T\Phi \end{aligned} $$(4)

where the regression matrix Φ is built from the input samples. With *I*_{F}(*θ*_{o}, *N*) being a quadratic function of the input sequence, all typical quality measures become non-convex in the input.

While various methods for non-convex numerical optimization can be used to solve such problems, they often encounter problems with, e.g., local minima. To address this, a number of techniques have been developed either where the problem is reparametrized so that it becomes convex or where a convex approximation is used. The latter technique is called convex relaxation and is often based on a reparametrization as well. We use the example above to provide a flavor of the different techniques.

### Reparametrization

If the input is constrained to be periodic so that *u*(*t*) = *u*(*t* + *N*), *t* = −*n*, …, −1, it follows that the Fisher information is linear in the sample correlations of the input. Using these as design variables instead of *u* makes all the quality measures referred to above convex.

This reparametrization thus results in the two-step procedure discussed in section “External Excitation Signals”: First, the sample correlations are obtained from an optimal experiment design problem, and then an input sequence is generated that has this sample correlation. In the second step, there is a considerable freedom. Notice, however, that since correlations do not directly relate to the actual amplitudes of the resulting signals, it is difficult to incorporate waveform constraints in this approach. On the contrary, variance constraints are easy to incorporate.
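A minimal numerical sketch of this reparametrization, assuming an FIR model *y*(*t*) = Σ_{k} *θ*_{k}*u*(*t* − *k*) + *e*(*t*) with white measurement noise: the per sample Fisher information is then (up to the noise variance) the symmetric Toeplitz matrix of the input correlations, so the map from correlations to information matrix is linear:

```python
import numpy as np

def per_sample_fisher(r, sigma2=1.0):
    """Per sample Fisher information of an FIR model with white noise,
    as a function of the input correlation sequence r = (r_0, ..., r_{n-1}).
    It is the symmetric Toeplitz matrix built from r, scaled by 1/sigma^2."""
    n = len(r)
    idx = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return r[idx] / sigma2

r1 = np.array([1.0, 0.5, 0.2])
r2 = np.array([2.0, -0.3, 0.1])

# Linearity in the design variables (the correlations):
lhs = per_sample_fisher(r1 + r2)
rhs = per_sample_fisher(r1) + per_sample_fisher(r2)
print(np.allclose(lhs, rhs))
```

Because the information matrix is linear in the correlations, criteria such as −log det (D-optimality) or the trace of the inverse (A-optimality) become convex functions of the design variables.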

### Convex Relaxations

There are several approaches to obtain convex relaxations.

#### Using the per Sample Fisher Information

If the input is a realization of a stationary random process and the sample size *N* is large enough, *I*_{F}(*θ*_{o}, *N*)∕*N* is approximately equal to the per sample Fisher matrix, which only depends on the correlation sequence of the input. Using this approximation, one can now follow the same procedure as in the reparametrization approach and first optimize the input correlation sequence. The generation of a stationary signal with a certain correlation is a stochastic realization problem which can be solved using spectral factorization followed by filtering a white noise sequence, i.e., a sequence of independent identically distributed random variables, through the (stable) spectral factor (Jansson and Hjalmarsson, 2005).
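A sketch of this stochastic realization step, with an assumed target correlation sequence \(r_\tau = a^{|\tau|}\): its stable spectral factor is first order, so the signal can be generated by a one-pole filter driven by unit-variance white noise:

```python
import numpy as np

rng = np.random.default_rng(1)

# Target correlation sequence r_tau = a^|tau| (an AR(1)-type spectrum).
# A stable spectral factor is H(q) = sqrt(1 - a^2) / (1 - a q^{-1}),
# so u(t) = a*u(t-1) + sqrt(1 - a^2)*e(t) with e white, unit variance.
a = 0.7
N = 300_000
e = rng.standard_normal(N)
c = np.sqrt(1 - a**2)

u = np.zeros(N)
for t in range(1, N):
    u[t] = a * u[t - 1] + c * e[t]
u = u[1000:]  # discard the transient from the zero initial state

r0 = np.mean(u * u)           # sample variance, should be close to 1
r1 = np.mean(u[1:] * u[:-1])  # lag-1 correlation, should be close to a
print(r0, r1)
```

For higher-order spectra the same recipe applies: compute a stable, minimum-phase spectral factor and filter white noise through it.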

More generally, it turns out that the per sample Fisher information for linear models/systems only depends on the joint input/noise spectrum (or the corresponding correlation sequence). A linear parametrization of this quantity thus typically leads to a convex problem (Jansson and Hjalmarsson, 2005).

The set of all spectra is infinite dimensional, and this precludes a search over all possible spectra. However, since there is a finite-dimensional parametrization of the per sample Fisher information (it is a symmetric *n* × *n* matrix), it is also possible to find finite-dimensional sets of spectra that parametrize all possible per sample Fisher information matrices. A multisine with appropriately chosen frequencies is one possibility. However, even though all per sample Fisher information matrices can be generated, the solution may be suboptimal depending on which constraints the problem contains.

The situation for nonlinear problems is conceptually the same, but here the entire probability density function of the stationary process generating the input plays the same role as the spectrum in the linear case. This is a much more complicated object to parametrize.

#### Lifting

Collect the input samples in a vector *u* and introduce the matrix *U* = *uu*^{T}, representing all possible products of the elements of *u*. This constraint is equivalent to

$$\displaystyle \begin{aligned} \begin{bmatrix} U & u\\ u^T & 1 \end{bmatrix}\succeq 0,\qquad \mathrm{rank}(U)=1 \end{aligned} $$(5)

The idea of lifting is now to observe that the Fisher information matrix is linear in the elements of *U*; by dropping the rank constraint in (5), a convex relaxation is obtained where both *U* and *u* (subject to the matrix inequality in (5)) are decision variables.

#### Frequency-by-Frequency Design

For linear systems, the variance of an *n*th-order estimate, \(G(e^{i\omega },\hat {\theta }_N)\), of the frequency function can approximately be expressed as

$$\displaystyle \begin{aligned} \mathrm{Var}\,G(e^{i\omega},\hat{\theta}_N)\approx \frac{n}{N}\,\frac{\Phi_v(\omega)}{\Phi_u(\omega)} \end{aligned} $$(6)

where Φ_{u} and Φ_{v} are the input and noise spectra, respectively. Performance measures of the type (2) can then be written as a frequency-domain integral of (6) weighted by a function *W*(*e*^{iω}) ≥ 0 that depends on the application. When only variance constraints are present, such problems can be solved frequency by frequency, providing both simple calculations and insight into the design.
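As a sketch of such a frequency-by-frequency design under a single power constraint: minimizing the weighted integral of Φ_{v}∕Φ_{u} subject to fixed input power gives, by the Cauchy–Schwarz inequality, an input spectrum proportional to \(\sqrt{W\Phi_v}\). The spectra and weighting below are invented purely for illustration:

```python
import numpy as np

# Frequency grid on (0, pi)
w = np.linspace(0.01, np.pi, 500)
dw = w[1] - w[0]

Phi_v = 1.0 / (1.1 - np.cos(w))  # hypothetical noise spectrum
W = np.exp(-2 * w)               # hypothetical application weighting
P = 1.0                          # input power budget

def cost(Phi_u):
    """Weighted integral of the variance expression W * Phi_v / Phi_u."""
    return np.sum(W * Phi_v / Phi_u) * dw

# Frequency-by-frequency optimum: Phi_u proportional to sqrt(W * Phi_v),
# scaled to use the full power budget.
Phi_u_opt = np.sqrt(W * Phi_v)
Phi_u_opt = Phi_u_opt * P / (np.sum(Phi_u_opt) * dw)

# Reference design: a flat input spectrum of the same power.
Phi_u_flat = np.full_like(w, P / (len(w) * dw))

print(cost(Phi_u_opt), cost(Phi_u_flat))
```

The optimal spectrum concentrates input power where the weighted noise level is high, and it can never do worse than the flat design of equal power.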

## Implementation

We have used the notation *I*_{F}(*θ*_{o}, *N*) to indicate that the Fisher information typically (but not always) depends on the parameter corresponding to the true system. That the optimal design depends on the to-be identified system is a fundamental problem in optimal experiment design. There are two basic approaches to address this problem which are covered below. Another important aspect is the choice of waveform for the external excitation signal. This is covered last in this section.

### Robust Experiment Design

In robust experiment design, it is assumed that it is known beforehand that the true parameter belongs to some set, i.e., *θ*_{o} ∈ Θ. A minimax approach is then typically taken, finding the experiment that minimizes the worst performance over the set Θ. Such optimization problems are computationally very difficult.

### Adaptive Experiment Design

The alternative to robust experiment design is to perform the design adaptively or sequentially, meaning that first a design is performed based on some initial “guess” of the true parameter, and then as samples are collected, the design is revised taking advantage of the data information. Interestingly, the convergence rate of the parameter estimate is typically sufficiently fast that for this approach the asymptotic distribution is the same as for the design based on the true model parameter (Hjalmarsson, 2009).

### Waveform Generation

We have argued above that it is the spectrum of the excitation (together with the feedback) that determines the achieved model accuracy in the linear time-invariant case. In section “Using the per Sample Fisher Information,” we argued that a signal with a particular spectrum can be obtained by filtering a white noise sequence through a stable spectral factor of the desired spectrum. However, we have also in section “Experimental Constraints” argued that particular applications may require particular waveforms. We will here elaborate further on how to generate a waveform with desired characteristics.

*Persistence of excitation.* A signal with a spectrum having *n* nonzero frequencies (on the interval (−*π*, *π*]) can be used to estimate at most *n* parameters. Thus, if, as is typically the case, there is uncertainty before the experiment regarding which model structure to use, one has to ensure that a sufficient number of frequencies is excited.

*The crest factor.* For all systems, the maximum input amplitude, say *A*, is constrained. To deal with this from an experiment design point of view, it is convenient to introduce what is called the crest factor of a signal:

$$\displaystyle \begin{aligned} C_r^2=\frac{\max_t u^2(t)}{\lim_{N\rightarrow\infty}\frac{1}{N}\sum_{t=1}^{N}u^2(t)} \end{aligned} $$

The crest factor is thus the ratio between the squared maximum amplitude and the power of the signal. For a class of signal waveforms with a given crest factor, the input power that can be used is upper-bounded by

$$\displaystyle \begin{aligned} \lim_{N\rightarrow\infty}\frac{1}{N}\sum_{t=1}^{N}u^2(t)\leq \frac{A^2}{C_r^2} \end{aligned} $$(7)

However, the power is the integral of the signal spectrum, and since increasing the amplitude of the input signal spectrum increases a model’s accuracy, cf. (6), it is desirable to use as much signal power as possible. By (7) we see that this means that waveforms with low crest factor should be used.

An ideal waveform from a crest factor perspective is thus a binary signal, for which \(C_r = 1\). One way to shape the spectrum of a binary signal is based on the arcsine law: taking the sign of a zero-mean Gaussian process with correlation sequence *r*_{τ} gives a binary signal having correlation sequence \(\tilde {r}_\tau =2/\pi \arcsin (r_\tau )\). With \(\tilde {r}_\tau \) given, one can try to solve this relation for the corresponding *r*_{τ}.
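A sketch of the arcsine-law relation, using a hypothetical AR(1) Gaussian process: the sign of the process has lag-1 correlation \(2/\pi \arcsin(a)\), which a simulation reproduces:

```python
import numpy as np

rng = np.random.default_rng(2)

# Zero-mean Gaussian AR(1) process with lag-1 correlation a; by the
# arcsine law, sign(x) has lag-1 correlation (2/pi) * arcsin(a).
a = 0.7
N = 300_000
e = rng.standard_normal(N)
c = np.sqrt(1 - a**2)

x = np.zeros(N)
for t in range(1, N):
    x[t] = a * x[t - 1] + c * e[t]

u = np.sign(x[1000:])  # binary +/-1 signal, crest factor 1

r1_binary = np.mean(u[1:] * u[:-1])
predicted = 2 / np.pi * np.arcsin(a)
print(r1_binary, predicted)
```

Note the compression: a Gaussian lag-1 correlation of 0.7 becomes roughly 0.49 after clipping, which is why one must solve the arcsine relation backwards when a particular binary correlation is desired.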

A crude, but often sufficient, method to generate binary sequences with desired spectral content is based on the use of *pseudorandom binary signals (PRBS)*. Such signals (which are generated by a shift register) are periodic signals whose correlation sequences are similar to those of white noise, i.e., they have a flat spectrum. By resampling such sequences, the spectrum can be modified. It should be noted that binary sequences are less attractive when it comes to identifying nonlinearities. This is easy to understand by considering a static system: if only one input amplitude is used, it is impossible to determine whether the system is nonlinear or not.
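A minimal shift-register generator of this kind can be sketched as follows; the tap choice below is one of several that realize a degree-5 primitive recurrence, giving a maximum-length sequence of period 2⁵ − 1 = 31:

```python
import numpy as np

def prbs(taps=(5, 3), periods=1):
    """Maximum-length PRBS from a Fibonacci linear feedback shift register.
    The feedback bit is the XOR of the tapped stages (1-indexed); taps=(5, 3)
    realizes a primitive degree-5 recurrence, so the period is 2**5 - 1 = 31."""
    n = max(taps)
    reg = [1] * n  # any nonzero seed works
    out = []
    for _ in range((2**n - 1) * periods):
        out.append(reg[-1])       # output the last stage
        fb = 0
        for t in taps:
            fb ^= reg[t - 1]      # feedback = XOR of tapped stages
        reg = [fb] + reg[:-1]     # shift the register
    return np.array(out)

s = prbs()
b = 2.0 * s - 1.0  # map {0, 1} -> {-1, +1}

# One period contains 2**(n-1) ones, and the circular autocorrelation of the
# bipolar sequence equals -1/period at every nonzero lag (nearly flat spectrum).
print(len(s), s.sum(), b @ np.roll(b, 1))
```

In practice one repeats the period several times and, as noted above, resamples (holds each value for several samples) to shift the excitation toward lower frequencies.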

A periodic signal with period *M* can be decomposed into a sum of at most *M* sinusoidal terms; each such term corresponds to one frequency on the grid 2*πk*∕*M*, *k* = 0, …, *M* − 1. Such a signal can thus be used to estimate at most *M* parameters. Another way to generate a signal with period *M* is to add sinusoids corresponding to the above frequencies, with desired amplitudes. A periodic signal generated in this way is commonly referred to as a *multisine*. The crest factor of a multisine depends heavily on the relation between the phases of the sinusoids; with all phases equal, for example, the squared crest factor equals two times the number of sinusoids. It is possible to optimize the crest factor with respect to the choice of phases (Rivera et al., 2009). There also exist simple deterministic methods for choosing phases that give a good crest factor, e.g., Schroeder phasing. Alternatively, phases can be drawn randomly and independently, giving what is known as random-phase multisines (Pintelon and Schoukens, 2012), a family of random signals with properties similar to Gaussian signals. Periodic signals have some useful features:

- *Estimation of nonlinearities.* A linear time-invariant system responds to a periodic input signal with a signal consisting of the same frequencies but with different amplitudes and phases. Thus, it can be concluded that the system is nonlinear if the output contains other frequencies than the input. This can be exploited in a systematic way to estimate also the nonlinear part of a system.
- *Estimation of noise variance.* For a linear time-invariant system in steady state, the difference in the output between different periods is due entirely to the noise. This can be used to devise simple methods to estimate the noise level.
- *Data compression.* By averaging measurements over different periods, the noise level can be reduced at the same time as the number of measurements is reduced.
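The influence of the phases on the crest factor can be illustrated by comparing equal (zero) phases with Schroeder phasing, one common deterministic choice for flat amplitudes; the period and number of sinusoids below are illustrative:

```python
import numpy as np

M = 512  # period
K = 16   # number of sinusoids
k = np.arange(1, K + 1)
t = np.arange(M)

def multisine(phases):
    """Sum of K unit-amplitude cosines on the frequency grid 2*pi*k/M."""
    return np.sum(np.cos(2 * np.pi * np.outer(k, t) / M + phases[:, None]),
                  axis=0)

def crest_factor(u):
    """C_r = peak amplitude over rms value (square root of the definition above)."""
    return np.max(np.abs(u)) / np.sqrt(np.mean(u**2))

# Equal phases: all cosines peak together at t = 0 -> C_r^2 = 2K.
u_zero = multisine(np.zeros(K))

# Schroeder phasing for flat amplitudes: phi_k = -pi * k * (k - 1) / K.
u_schr = multisine(-np.pi * k * (k - 1) / K)

print(crest_factor(u_zero), crest_factor(u_schr))
```

The Schroeder-phased signal spreads its energy over the period and has a far lower crest factor than the zero-phase signal, so for a given amplitude bound it can carry considerably more power.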

## Implications for the Identification Problem Per Se

Optimal experiment design also has implications for the identification problem itself. Suppose that the identification criterion *V*_{N}(*θ*) is the standard least-squares criterion for the linear regression setting above, so that \(\mathrm{E}[V_{N}(\theta)]=\|\theta-\theta_o\|_{\Phi^T\Phi}^2+\sigma^2\). When *V*_{app} also is quadratic in *θ*, it follows after a little bit of algebra (see Hjalmarsson 2009) that an optimal design must satisfy

$$\displaystyle \begin{aligned} \mathrm{E}[V_{N}(\theta)]\geq \gamma V_{\mathrm{app}}(\theta)+c \end{aligned} $$(8)

for a constant *c* that is not important for our discussion. The value of \(\mathrm{E}[V_{N}(\theta)]\) is determined by how large the weighting Φ^{T}Φ is, which in turn depends on how large the input *u* is. In a least-costly setting with the energy ∥*u*∥^{2} as criterion, the best solution is to have equality in (8). Thus we see that optimal experiment design tries to shape the identification criterion after the application cost. This result has the following implications:

- (i) *Perform identification under appropriate scaling of the desired operating conditions.* Suppose that *V*_{app}(*θ*) is a function of how the system outputs deviate from a desired trajectory (determined by *θ*_{o}). Performing an experiment that follows the desired trajectory then gives that the sum of squared prediction errors is an approximation of *V*_{app}(*θ*), at least for parameters close to *θ*_{o}. Obtaining equality in (8) typically requires an additional scaling of the input excitation or of the length of the experiment. The result is intuitively appealing: the desired operating conditions should reveal the system properties that are important in the application.
- (ii) *Identification cost for application performance.* We see that the required energy grows (almost) linearly with *γ*, which is a measure of how close to the ideal performance (using the true parameter *θ*_{o}) we want to come. Furthermore, it is typical that as the performance requirements in the application increase, the sensitivity to model errors increases. This means that *V*_{app}(*θ*) increases, which in turn means that the identification cost increases. In summary, the identification cost will be higher, the higher the performance required in the application. The inequality (8) can be used to quantify this relationship.
- (iii) *Model structure sensitivity.* As *V*_{app} will be sensitive to system properties important for the application, while insensitive to system properties of little significance, with the identification criterion *V*_{N} matched to *V*_{app} it is only necessary that the model structure is able to model the important properties of the system. In any case, whatever model structure is used, the identified model will be the best possible in that structure for the intended application. This is very different from an arbitrary experiment, where it is impossible to control the model fit when a model of restricted complexity is used.

We conclude that optimal experiment design simplifies the overall system identification problem.

## Identification for Control

Model-based control is one of the most important applications of system identification. Robust control ensures performance and stability in the presence of model uncertainty. However, the majority of such design methods do not employ the parametric ellipsoidal uncertainty sets resulting from standard system identification. In fact, only in the last decade have analysis and design tools for this type of model uncertainty started to emerge, e.g., Raynaud et al. (2000) and Gevers et al. (2003).

The advantages of matching the identification criterion to the application have long been recognized in this line of research. For control applications this typically implies that the identification experiment should be performed under the same closed-loop operating conditions as the controller to be designed. This was perhaps first recognized in the context of minimum variance control (see Gevers and Ljung 1986), where variance errors were the concern. Later on, this was recognized to be the case also for the bias error, although here pre-filtering can be used to achieve the same objective.

To account for the fact that the controller to be designed is not yet available, techniques where control and identification are iterated have been developed, cf. adaptive experiment design in section “Adaptive Experiment Design.” Convergence of such schemes has been established when the true system is in the model set but has proved out of reach for models of restricted complexity.

In recent years, techniques integrating experiment design and model predictive control have started to appear. A general-purpose design criterion is used in Rathouský and Havlena (2013), while Larsson et al. (2013) uses an application-oriented criterion.

## Summary and Future Directions

When there is the “luxury” to design the experiment, this opportunity should be seized by the user. Without informative data there is little that can be done. In this exposé we have outlined the techniques that exist but also emphasized that a well-conceived experiment, reflecting the intended application, can significantly simplify the overall system identification problem.

Further developments of computational techniques are high on the agenda, e.g., how to handle time-domain constraints and nonlinear models. To this end, developments in optimization methods are rapidly being incorporated. While, as reported in Hjalmarsson (2009), there are some results on how the identification cost depends on the performance requirements in the application, further understanding of this issue is highly desirable. Theory and further development of the emerging model predictive control schemes equipped with experiment design may very well be the direction that will have most impact in practice. Nonparametric kernel methods have proven to be highly competitive estimation methods and developments of experiment design methods for such estimators only started recently in the system identification community; see, e.g., Fujimoto et al. (2018).

## Notes

### Acknowledgements

This work was supported by the European Research Council under the advanced grant LEARN, contract 267381, and by the Swedish Research Council, contracts 2016-06079.

## Bibliography

- Agüero JC, Goodwin GC (2007) Choosing between open and closed loop experiments in linear system identification. IEEE Trans Autom Control 52(8):1475–1480
- Bombois X, Scorletti G, Gevers M, Van den Hof PMJ, Hildebrand R (2006) Least costly identification experiment for control. Automatica 42(10):1651–1662
- Fedorov VV (1972) Theory of optimal experiments. Probability and mathematical statistics, vol 12. Academic, New York
- Forssell U, Ljung L (1999) Closed-loop identification revisited. Automatica 35:1215–1241
- Fujimoto Y, Maruta I, Sugie T (2018) Input design for kernel-based system identification from the viewpoint of frequency response. IEEE Trans Autom Control 63(9):3075–3082
- Gevers M (2005) Identification for control: from the early achievements to the revival of experiment design. Eur J Control 11(4–5):335–352. Semi-plenary lecture at the IEEE conference on decision and control – European control conference
- Gevers M, Ljung L (1986) Optimal experiment designs with respect to the intended model application. Automatica 22(5):543–554
- Gevers M, Bombois X, Codrons B, Scorletti G, Anderson BDO (2003) Model validation for control and controller validation in a prediction error identification framework – part I: theory. Automatica 39(3):403–445
- Goodwin GC, Payne RL (1977) Dynamic system identification: experiment design and data analysis. Academic, New York
- Hjalmarsson H (2005) From experiment design to closed loop control. Automatica 41(3):393–438
- Hjalmarsson H (2009) System identification of complex and structured systems. Eur J Control 15(4):275–310. Plenary address, European control conference
- Jansson H, Hjalmarsson H (2005) Input design via LMIs admitting frequency-wise model specifications in confidence regions. IEEE Trans Autom Control 50(10):1534–1549
- Larsson CA, Hjalmarsson H, Rojas CR, Bombois X, Mesbah A, Modén P-E (2013) Model predictive control with integrated experiment design for output error systems. In: European control conference, Zurich
- Ljung L (1999) System identification: theory for the user, 2nd edn. Prentice-Hall, Englewood Cliffs
- Manchester IR (2010) Input design for system identification via convex relaxation. In: 49th IEEE conference on decision and control, Atlanta, pp 2041–2046
- Pintelon R, Schoukens J (2012) System identification: a frequency domain approach, 2nd edn. Wiley/IEEE, Hoboken/Piscataway
- Pronzato L (2008) Optimal experimental design and some related control problems. Automatica 44(2):303–325
- Rathouský J, Havlena V (2013) MPC-based approximate dual controller by information matrix maximization. Int J Adapt Control Signal Process 27(11):974–999
- Raynaud HF, Pronzato L, Walter E (2000) Robust identification and control based on ellipsoidal parametric uncertainty descriptions. Eur J Control 6(3):245–255
- Rivera DE, Lee H, Mittelmann HD, Braun MW (2009) Constrained multisine input signals for plant-friendly identification of chemical process systems. J Process Control 19(4):623–635
- Rojas CR, Agüero JC, Welsh JS, Goodwin GC (2008) On the equivalence of least costly and traditional experiment design for control. Automatica 44(11):2706–2715
- Rojas CR, Welsh JS, Agüero JC (2009) Fundamental limitations on the variance of parametric models. IEEE Trans Autom Control 54(5):1077–1081
- Zarrop M (1979) Optimal experiment design for dynamic system identification. Lecture notes in control and information sciences, vol 21. Springer, Berlin