Exercise 2.1: Problem 1 For each subset of the X and Y matrices, the coefficients of A were calculated using the following LSM procedure:

$$ A={\left({X}^T\times X\right)}^{-1}\times {X}^T\times Y $$
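As an illustration of the formula above, here is a minimal sketch in plain Python. The data are hypothetical (six noise-free rows generated from the "true" coefficients 2, 3, −2, 5 quoted in Problem 2), and the helper names `transpose`, `matmul`, `solve`, and `lsm` are introduced here only for the example. It solves the normal equations by elimination rather than forming the inverse explicitly, which gives the same A.

```python
# Least-squares fit A = (X^T X)^{-1} X^T Y using only the standard library.

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(m, rhs):
    """Solve m * a = rhs by Gauss-Jordan elimination with partial pivoting.

    Assumes m is square and non-singular (no zero-pivot check in this sketch).
    """
    n = len(m)
    aug = [list(m[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def lsm(x_rows, y):
    """Return the coefficients A that solve (X^T X) A = X^T Y."""
    xt = transpose(x_rows)
    xtx = matmul(xt, x_rows)
    xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in xt]
    return solve(xtx, xty)

# Hypothetical noise-free data, not the data set used in the exercise.
a_true = [2.0, 3.0, -2.0, 5.0]
x_rows = [[1.0, 0.0, 2.0, 1.0],
          [0.0, 1.0, 1.0, 3.0],
          [2.0, 1.0, 0.0, 1.0],
          [1.0, 3.0, 1.0, 0.0],
          [3.0, 2.0, 1.0, 2.0],
          [1.0, 1.0, 4.0, 2.0]]
y = [sum(a * x for a, x in zip(a_true, row)) for row in x_rows]
a_est = lsm(x_rows, y)
```

With noise-free data the estimate reproduces `a_true` up to rounding error; with noisy data the estimate would be expected to approach the "true" values as more rows are added.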

The following is the set of coefficients obtained for each number of data points, together with their “true” values:

It can be seen that the greater the number of data points, the more accurate the estimates of the coefficients A.

Exercise 2.1: Problem 2 For each coefficient of the suggested model, a 95 % confidence interval was built based on the error of the model and the respective diagonal element q_{ii} of the matrix Q, the inverse of the covariance matrix:

$$ Q={K}_{xx}^{-1}={\left[\frac{1}{N}\times {X}^T\times X\right]}^{-1} $$

The half-width for each confidence interval was calculated as

$$ {\varDelta}_{\mathrm{i}}=t\left(\alpha =.025,N=300\right)\times {\sigma}_E^2\times \sqrt{\frac{q_{\mathrm{i}\mathrm{i}}}{N}} $$

The 95 % confidence interval for the model coefficient a_{1} is 1.9561 to 2.0435 and the “true” a_{1} is 2, so the true parameter lies within the interval.

The 95 % confidence interval for the model coefficient a_{2} is 2.4562 to 3.5084 and the “true” a_{2} is 3, so the true parameter lies within the interval.

The 95 % confidence interval for the model coefficient a_{3} is −2.0745 to −1.9296 and the “true” a_{3} is −2, so the true parameter lies within the interval.

The 95 % confidence interval for the model coefficient a_{4} is 4.832 to 5.1797 and the “true” a_{4} is 5, so the true parameter lies within the interval.
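The four checks above can be repeated mechanically. The sketch below uses only the interval bounds and “true” values reported in the text; the names `intervals`, `true_values`, `half_width`, and `contains` are introduced here for illustration.

```python
# Interval bounds and "true" values as reported in the text.
intervals = {
    "a1": (1.9561, 2.0435),
    "a2": (2.4562, 3.5084),
    "a3": (-2.0745, -1.9296),
    "a4": (4.832, 5.1797),
}
true_values = {"a1": 2.0, "a2": 3.0, "a3": -2.0, "a4": 5.0}

def half_width(lo, hi):
    """Half-width of the interval, i.e. the Delta_i recovered from its bounds."""
    return (hi - lo) / 2.0

def contains(lo, hi, value):
    return lo <= value <= hi

# For every coefficient: (half-width, does the interval cover the true value?)
summary = {
    name: (half_width(lo, hi), contains(lo, hi, true_values[name]))
    for name, (lo, hi) in intervals.items()
}
```

The recovered half-widths also show that the interval for a_{2} is by far the widest (about 0.53), so that coefficient is the least precisely estimated of the four.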

Exercise 2.1: Problem 3 Given the following set of input values:

$$ \widetilde{X}=\left[2.5\quad 3\quad -6.3\quad 10\right] $$

and matrix

$$ Q={K}_{xx}^{-1}={\left[\frac{1}{N}\times {X}^T\times X\right]}^{-1} $$

the 95 % confidence interval half-width was calculated as

$$ \Delta = t\left( {\alpha = .025,N = 300} \right) \times \sigma _E^2 \times \sqrt {\frac{{{{\widetilde X}^T} \times Q \times \widetilde X}}{N}} $$

The 95 % confidence interval for output Y is 75.9734 to 77.2616 and the “true” Y is 76.31, so the true Y lies within the interval.

Exercise 2.1: Problem 4 The required covariance matrices are:

$$ {K}_{xy}=\left[\begin{array}{c}12.9700\\ {}-19.7400\\ {}7.2130\\ {}-8.6490\\ {}0.6100\end{array}\right] $$

The model parameters were estimated by the following procedure:

$$ A={\left({K}_{xx}\right)}^{-1}\times {K}_{x\mathrm{y}} $$

The calculated model parameters A are:

$$ A=\left[\begin{array}{c}3.0896\\ {}-2.4294\\ {}1.5077\\ {}-2.0313\\ {}1.6105\end{array}\right] $$

The estimation errors were calculated with the following procedure:

$$ {Error}_{noise}={\left[{\left({K}_{xx}-{K}_{noise}\right)}^T\times \left({K}_{xx}-{K}_{noise}\right)\right]}^{-1}\times {K}_{x\mathrm{y}}-{\left({K}_{xx}^T\times {K}_{xx}\right)}^{-1}\times {K}_{x\mathrm{y}} $$

The parameter estimation errors caused by this known noise are:

$$ {Error}_{noise}=\left[\begin{array}{c}0.8977\\ {}-0.4328\\ {}0.0208\\ {}0.0433\\ {}0.4748\end{array}\right] $$

Exercise 2.1: Problem 5 First, matrix Z was calculated from matrix X and matrix W. “Artificial” coefficients B were calculated from Z.

$$ B={\left({Z}^T\times Z\right)}^{-1}\times {Z}^T\times Y $$

Then, the variance of each z_{i} was calculated, and from these the variance of Y.

$$ {\sigma_{z\left(\mathrm{i}\right)}}^2={\lambda}_{\mathrm{i}}-{\left(\overline{z_{\mathrm{i}}}\right)}^2 $$

$$ {\sigma_Y}^2=\sum b{\left(\mathrm{i}\right)}^2\times {\sigma_{z\left(\mathrm{i}\right)}}^2 $$

The percent contribution of each z_{i} was calculated as follows:

$$ \%{z}_{\mathrm{i}}=\frac{b{\left(\mathrm{i}\right)}^2\times {\sigma_{z\left(\mathrm{i}\right)}}^2}{{\sigma_Y}^2}\times 100\% $$

The contribution of z_{1} is 68.241 %

The contribution of z_{2} is 14.5954 %

The contribution of z_{3} is 17.1277 %

The contribution of z_{4} is 0.035818 %

Because the contribution of z_{4} is negligible (about 0.04 %), we keep z_{1}, z_{2}, and z_{3} and discard z_{4}.
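The selection step can be sketched as follows. The function `percent_contributions` implements the contribution formula given above. The 1 % cutoff is an assumption made for this example, since the text does not state a threshold, and the inputs `[1.0, 2.0]` and `[4.0, 1.0]` are hypothetical values used only to exercise the formula.

```python
def percent_contributions(b, var_z):
    """Percent contribution of each z_i: b_i^2 * var(z_i) / var(Y) * 100."""
    terms = [bi ** 2 * vi for bi, vi in zip(b, var_z)]
    var_y = sum(terms)  # variance of Y as the sum of the individual terms
    return [100.0 * t / var_y for t in terms]

# Hypothetical coefficients and variances, only to exercise the formula.
demo = percent_contributions([1.0, 2.0], [4.0, 1.0])

# Contributions reported in the text, in percent.
contributions = {"z1": 68.241, "z2": 14.5954, "z3": 17.1277, "z4": 0.035818}
THRESHOLD_PERCENT = 1.0  # assumed cutoff, not stated in the text

total = sum(contributions.values())
kept = [name for name, pct in contributions.items() if pct >= THRESHOLD_PERCENT]
dropped = [name for name in contributions if name not in kept]
```

The reported contributions sum to 100 % within rounding, and any cutoff between roughly 0.04 % and 14 % leads to the same choice of z_{1}, z_{2}, and z_{3}.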

The new vector B is:

$$ {B}_{new}=\left[\begin{array}{c}-3.2903\\ {}1.6415\\ {}-1.9924\\ {}0\end{array}\right] $$

Next, we calculated the “real” coefficients A based on the “artificial” coefficients B_{new}:

$$ {A}_{important}={W}_{new}\times {B}_{new} $$

Our calculated coefficients A are:

$$ {A}_{important}=\left[\begin{array}{c}1.9992\\ {}2.9998\\ {}1.4990\\ {}1.4990\end{array}\right] $$

The coefficient of determination for the new model is 0.99863.

Exercise 2.2: Problem 1 Although the following analysis does not include “all possible combinations” of first- and second-order regressors, it demonstrates the principle of establishing the model configuration using the coefficient of determination.

Equation 1 with x_{1}, x_{2}, x_{3}, x_{1} x_{3}, x_{2}^{2} has coefficients

$$ {A}_1=\left[1.9989\kern0.5em 2.9983\kern0.5em -0.4002\kern0.5em 0.5003\kern0.5em 1.0009\right] $$

Equation 2 with x_{1}, x_{2}, x_{3}, x_{1} x_{3} has coefficients

$$ {A}_2=\left[1.9989\kern0.5em 2.9983\kern0.5em -0.4002\kern0.5em 0.5003\right] $$

Equation 3 with x_{1}, x_{2}, x_{3}, x_{2}^{2} has coefficients

$$ {A}_3=\left[1.9989\kern0.5em 2.9983\kern0.5em -0.4002\kern0.5em 1.0009\right] $$

Equation 4 with x_{1}, x_{2}, x_{1} x_{3}, x_{2}^{2} has coefficients

$$ {A}_4=\left[1.9989\kern0.5em 2.9983\kern0.5em 0.5003\kern0.5em 1.0009\right] $$

Equation 5 with x_{1}, x_{3}, x_{1} x_{3}, x_{2}^{2} has coefficients

$$ {A}_5=\left[1.9989\kern0.5em -0.4002\kern0.5em 0.5003\kern0.5em 1.0009\right] $$

Equation 6 with x_{2}, x_{3}, x_{1} x_{3}, x_{2}^{2} has coefficients

$$ {A}_6=\left[2.9983\kern0.5em -0.4002\kern0.5em 0.5003\kern0.5em 1.0009\right] $$

Equation 7 with x_{1}, x_{2}, x_{1} x_{3} has coefficients

$$ {A}_7=\left[1.9989\kern0.5em 2.9983\kern0.5em 0.5003\right] $$

Equation 8 with x_{1}, x_{2}, x_{2}^{2} has coefficients

$$ {A}_8=\left[1.9989\kern0.5em 2.9983\kern0.5em 1.0009\right] $$

For Equations 1–8, the respective natural variance (S_{y}), error variance (S_{e}), and coefficient of determination (C_{D}) values are:

Equation 7, y = 2x_{1} + 3x_{2} + 0.5x_{1} x_{3}, appears to be a reasonable compromise between model complexity and accuracy.
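For reference, the comparison criterion can be sketched as below. The definition C_{D} = (S_{y} − S_{e}) / S_{y} is assumed here as the usual form of the coefficient of determination, and S_{e} is taken as the mean squared model error; the text does not spell out either definition, and the sample data are hypothetical.

```python
def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def coefficient_of_determination(y, y_model):
    """Assumed definition: C_D = (S_y - S_e) / S_y."""
    s_y = variance(y)  # natural variance of the measured output
    errors = [yi - mi for yi, mi in zip(y, y_model)]
    s_e = sum(e * e for e in errors) / len(errors)  # error variance (mean squared error)
    return (s_y - s_e) / s_y

# Hypothetical output samples and two candidate "models".
y = [1.0, 2.0, 3.0, 4.0, 5.0]
cd_perfect = coefficient_of_determination(y, y)            # model reproduces y exactly
cd_mean_only = coefficient_of_determination(y, [3.0] * 5)  # model predicts only the mean
```

A model that reproduces the output exactly gives C_{D} = 1, and a model that predicts only the mean gives C_{D} = 0, so dropping a regressor is acceptable when C_{D} barely decreases.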

Exercise 2.2: Problem 2 The obtained RLSM and “true” parameters are:

The coefficient of determination for this model is 0.99994. The convergence of the RLSM procedure is shown in the plot below. It can be seen that RLSM estimation of constant parameters yields the same parameter values as would be obtained by the LSM.

Exercise 2.2: Problem 3 It can be seen that with a forgetting factor of 1, the RLSM procedure cannot track drifting “true” parameters. The “final” parameter values are (see the plot below):

The coefficient of determination for the resultant model is 0.40497, compared with 0.99994 for Problem 2. These results are unusable, but they justify the use of a forgetting factor of less than 1.

Exercise 2.2: Problem 4 RLSM results with the forgetting factor Beta = 0.1 are shown below:

When Beta = 0.2, the results for A are:

When Beta = 0.3, the results are:

When Beta = 0.4, the converged coefficients of A are:

When Beta = 0.6, the converged coefficients of A are:

When Beta = 0.9 and Beta = 1, the tracking results are:

It can be seen that as Beta approaches 1.0, the tracking ability of the RLSM procedure diminishes.
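The effect of the forgetting factor can be reproduced with a small sketch. It assumes the standard exponentially weighted form of recursive least squares, which may differ in detail from the program used for the exercise. The data are hypothetical and noise-free, with two parameters that jump from (2, 3) to (4, 1) halfway through, and the initial value `p0` and the helper names are choices made for this example.

```python
def rlsm(samples, beta, n_params, p0=1e6):
    """Recursive least squares with forgetting factor beta (0 < beta <= 1)."""
    a = [0.0] * n_params
    p = [[p0 if i == j else 0.0 for j in range(n_params)] for i in range(n_params)]
    for x, y in samples:
        px = [sum(p[i][j] * x[j] for j in range(n_params)) for i in range(n_params)]
        denom = beta + sum(x[i] * px[i] for i in range(n_params))
        gain = [v / denom for v in px]
        err = y - sum(a[i] * x[i] for i in range(n_params))
        a = [a[i] + gain[i] * err for i in range(n_params)]
        # P <- (P - gain * x^T P) / beta; x^T P equals px^T because P is symmetric.
        p = [[(p[i][j] - gain[i] * px[j]) / beta for j in range(n_params)]
             for i in range(n_params)]
    return a

def make_samples():
    """Hypothetical data: the 'true' parameters jump at t = 100."""
    samples = []
    for t in range(200):
        a1, a2 = (2.0, 3.0) if t < 100 else (4.0, 1.0)
        x = [1.0, float(t % 5)]
        samples.append((x, a1 * x[0] + a2 * x[1]))
    return samples

est_no_forgetting = rlsm(make_samples(), beta=1.0, n_params=2)
est_forgetting = rlsm(make_samples(), beta=0.9, n_params=2)
```

With `beta=1.0` all samples are weighted equally, so the final estimate settles near the average of the old and new parameter values, roughly (3, 2). With `beta=0.9` the old samples are discounted and the estimate ends close to the current values (4, 1).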

Exercise 2.3: Problem 1 Running program GENERATOR.EXE results in an array of 1000 measurements of a complex process with 15 input variables and a discrete event-type output that is rated as outcome A or outcome B. This data is recorded in file TSMD.DAT. Running program CLUSTER.EXE displays the three most informative subspaces, showing the distributions of events A and B in the respective subspaces.

Subspace: X_{1} & X_{4} - Separating Line: X_{1} + 1.8139X_{4} − 1.4506 = 0

Subspace: X_{1} & X_{5} - Separating Line: X_{1} + 2.858X_{5} − 2.0003 = 0

Subspace: X_{1} & X_{10} - Separating Line: X_{1} + 6.6622X_{10} – 3.6642 = 0

Events E1–E8 represent location of a measurement point within the domain A or domain B in the above subspaces. Below are the probabilities of these events for outcomes A and B:

Prediction of the process outcome based on the particular measurement, X(1).

Although vector X(t) has 15 components, our analysis indicates that the values of the following four components are to be considered:

$$ {X}_1(1)=0.633,\ {X}_4(1)=0.814,\ {X}_5(1)=0.371,\ {X}_{10}(1)=0.363 $$

Compute values of functions Φ1, Φ2, and Φ3 for the selected components of vector X(1) and based on these results define the location of the point X(1) as the appropriate event:

$$ \Phi 1={X}_1(1)+K*{X}_4(1)+Q=0.633+1.8139*0.814-1.4506=0.6589146 $$

$$ \Phi 2={X}_1(1)+K*{X}_5(1)+Q=0.633+2.858*0.371-2.0003=-0.306982 $$

$$ \Phi 3={X}_1(1)+K*{X}_{10}(1)+Q=0.633+6.6622*0.363-3.6642=-0.6128214 $$
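The three evaluations can be checked with a short sketch. The slopes K and offsets Q are taken from the separating lines reported above; the dictionary layout and the names `LINES` and `phi_values` are choices made for this example. The table that maps sign patterns to events E1–E8 is not reproduced here, so the code stops at the signs.

```python
# Separating lines X1 + K * Xj + Q = 0 for the three subspaces reported in the text.
LINES = {
    "phi1": ("X4", 1.8139, -1.4506),
    "phi2": ("X5", 2.858, -2.0003),
    "phi3": ("X10", 6.6622, -3.6642),
}

def phi_values(x):
    """Evaluate each separating function at the measurement x."""
    return {name: x["X1"] + k * x[var] + q for name, (var, k, q) in LINES.items()}

x_1 = {"X1": 0.633, "X4": 0.814, "X5": 0.371, "X10": 0.363}  # selected components of X(1)
phis = phi_values(x_1)
signs = tuple(value > 0 for value in phis.values())  # side of each separating line
```

For X(1) the sign pattern is (positive, negative, negative), which the text identifies as event E5.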

The resultant event is E5; therefore

$$ P(A)=0.563\kern1em P(B)=0.437\kern1em P\left(E5\Big|A\right)=0.140\kern1em P\left(E5\Big|\mathrm{B}\right)=0.057 $$

$$ P\left(A|E5\right)=\frac{P\left(E5|A\right)*P(A)}{P\left(E5|A\right)*P(A)+P\left(E5|B\right)*P(B)}=\frac{0.140*0.563}{0.140*0.563+0.057*0.437}=.\mathbf{7599} $$

$$ P\left(B|E5\right)=\frac{P\left(E5|B\right)*P(B)}{P\left(E5|B\right)*P(B)+P\left(E5|A\right)*P(A)}=\frac{0.057*0.437}{0.057*0.437+0.140*0.563}=.\mathbf{2401} $$

Consider the next measurement vector X(2) and repeat the above procedure:

$$ {X}_1(2) = 0.255,\ {X}_4(2) = 0.967,\ {X}_5(2) = 0.884,\ {X}_{10}(2) = 0.067 $$

$$ \Phi 1={X}_1(2)+K*{X}_4(2)+Q=0.255+1.8139*0.967-1.4506=0.5584413 $$

$$ \Phi 2={X}_1(2)+K*{X}_5(2)+Q=0.255+2.858*0.884-2.0003=0.781172 $$

$$ \Phi 3={X}_1(2)+K*{X}_{10}(2)+Q=0.255+6.6622*0.067-3.6642=-2.9628326 $$

That results in Event E7, therefore

$$ P\left(E7\Big|A\right)=0.176\kern1em P\left(E7\Big|\mathrm{B}\right)=0.016\kern1em P(A)=\mathbf{0.7599}\kern1em P(B)=\mathbf{0.2401} $$

$$ P\left(A|E7\right)=\frac{P\left(E7|A\right)*P(A)}{P\left(E7|A\right)*P(A)+P\left(E7|\mathrm{B}\right)*P\left(\mathrm{B}\right)}=\frac{0.176*0.7599}{0.176*0.7599+0.016*0.2401}=.\mathbf{9721} $$

$$ P\left(B|E7\right)=\frac{P\left(E7|\mathrm{B}\right)*P\left(\mathrm{B}\right)}{P\left(E7|B\right)*P(B)+P\left(E7|A\right)*P(A)}=\frac{0.016*0.2401}{0.016*0.2401+0.176*0.7599}=.\mathbf{0279} $$

The probability that these two sets of X(t) values yield outcome B is less than 0.03, while the probability that the outcome is A is above 0.97. It can be said with considerable certainty that the outcome associated with these two X(t) sets is A.
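The two-step update can be written as a short sketch in which the posterior from the first measurement becomes the prior for the second. The probabilities are those quoted in the text; the function name `bayes_update` is introduced here for illustration.

```python
def bayes_update(prior_a, like_a, like_b):
    """Posterior P(A | event) for two outcomes, with P(B) = 1 - P(A)."""
    numerator = like_a * prior_a
    return numerator / (numerator + like_b * (1.0 - prior_a))

p_a = 0.563                                  # prior P(A) from the text
p_a_e5 = bayes_update(p_a, 0.140, 0.057)     # after event E5, from X(1)
p_a_e7 = bayes_update(p_a_e5, 0.176, 0.016)  # after event E7, from X(2)
p_b_e7 = 1.0 - p_a_e7
```

The sketch carries the unrounded intermediate posterior forward, whereas the text rounds it to 0.7599 before the second step; both routes agree to four decimal places.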

Exercise 2.3: Problem 2 For this problem, a 400 × 600 random matrix A_{0}, matrix B_{0}, and matrix C_{0} were generated.

SVD was performed in MATLAB on matrix A_{0} to retrieve the first two left and right vectors.

Then, a set of matrices A(k) = A_{0} + noise (20 % of the original magnitude used to generate matrix A_{0}), k = 1, 2, …, 10, was generated, and two-coordinate points {W_{1}(k), W_{2}(k)} were defined by multiplication:

$$ W{(k)}_1={L_{A1}}^T\times A(k)\times {R}_{A1} $$

$$ W{(k)}_2={L_{A2}}^T\times A(k)\times {R}_{A2} $$

k = 1, 2, …, 10
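A short derivation may help explain why these points stay close together. It assumes the SVD of A_{0} is written with singular values σ_{i} and orthonormal left and right vectors L_{Ai} and R_{Ai}, and it denotes the added noise by N(k), a symbol introduced here for illustration:

$$ {A}_0=\sum_i{\sigma}_i\times {L}_{Ai}\times {R_{Ai}}^T\kern1em \Rightarrow \kern1em {L_{Aj}}^T\times {A}_0\times {R}_{Aj}={\sigma}_j,\kern1em j=1,2 $$

$$ W{(k)}_j={L_{Aj}}^T\times \left({A}_0+N(k)\right)\times {R}_{Aj}={\sigma}_j+{L_{Aj}}^T\times N(k)\times {R}_{Aj},\kern1em j=1,2 $$

Under these assumptions each point is the pair (σ_{1}, σ_{2}) of A_{0} shifted by a projection of the noise onto the singular vectors, so the points from all noisy copies of A_{0} scatter around one common location.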

This process was repeated, still using the left and right vectors of the original matrix A_{0}, but with the matrices B(k) and C(k), generated by adding noise to B_{0} and C_{0}, in place of A(k). This established a sequence of points {W_{1}(k), W_{2}(k)}, k = 11, 12, …, 20 for B(k) and k = 21, 22, …, 30 for C(k).

All of the points were plotted on a W_{1}–W_{2} plane. As can be seen, this SVD-based procedure yields a clustering pattern that reveals, in spite of the noise, the three classes of matrices originating from A_{0}, B_{0}, and C_{0}.

Indeed, the SVD can be used as a tool for classification of large groups of data sets.