Decentralized Fuzzy-Neural Identification and I-Term Adaptive Control of Distributed Parameter Bioprocess Plant

Part of the book series: Studies in Computational Intelligence (SCI, volume 561)

Abstract

The chapter proposes the use of a Recurrent Neural Network Model (RNNM), incorporated in a fuzzy-neural multi-model, for decentralized identification of an anaerobic digestion process carried out in a fixed-bed and recirculation-tank anaerobic wastewater treatment system. The analytical model of the digestion bioprocess represents a distributed parameter system, which is reduced to a lumped system using the orthogonal collocation method applied in four collocation points. The proposed decentralized RNNM consists of five independently working Recurrent Neural Networks (RNNs), which approximate the process dynamics in the four measurement points plus the recirculation tank. The RNN learning algorithm is the second-order Levenberg-Marquardt algorithm. The comparative graphical simulation results of the approximation of the digestion wastewater treatment system, obtained via decentralized RNN learning, exhibit good convergence and precise tracking of the plant variables. The identification results are then used for I-term direct and indirect (sliding mode) control, with good results.

References

  1. Haykin, S.: Neural Networks: A Comprehensive Foundation, 2nd edn, Section 2.13, pp. 84–89; Section 4.13, pp. 208–213. Prentice-Hall, Upper Saddle River (1999)

  2. Narendra, K.S., Parthasarathy, K.: Identification and control of dynamical systems using neural networks. IEEE Trans. Neural Networks 1(1), 4–27 (1990)

  3. Chen, S., Billings, S.A.: Neural networks for nonlinear dynamics system modelling and identification. Int. J. Control 56(2), 319–346 (1992)

  4. Hunt, K.J., Sbarbaro, D., Zbikowski, R., Gawthrop, P.J.: Neural network for control systems (A survey). Automatica 28, 1083–1112 (1992)

  5. Miller III, W.T., Sutton, R.S., Werbos, P.J.: Neural Networks for Control. MIT Press, London (1992)

  6. Pao, S.A., Phillips, S.M., Sobajic, D.J.: Neural net computing and intelligent control systems. Int. J. Control 56(3), 263–289 (1992). (Special issue on Intelligent Control)

  7. Su, H.-T., McAvoy, T.J., Werbos, P.: Long-term predictions of chemical processes using recurrent neural networks: a parallel training approach. Ind. Eng. Chem. Res. 31(5), 1338–1352 (1992)

  8. Boskovic, J.D., Narendra, K.S.: Comparison of linear, nonlinear and neural-network-based adaptive controllers for a class of fed-batch fermentation processes. Automatica 31, 817–840 (1995)

  9. Omatu, S., Khalil, M., Yusof, R.: Neuro-Control and Its Applications. Springer, London (1995)

  10. Baruch, I.S., Garrido, R.: A direct adaptive neural control scheme with integral terms. Int. J. Intell. Syst. 20(2), 213–224 (2005). ISSN 0884-8173. (Special issue on Soft Computing for Modelling, Simulation and Control of Nonlinear Dynamical Systems, Castillo, O., Melin, P. guest ed. Wiley Inter-Science)

  11. Bulsari, A., Palosaari, S.: Application of neural networks for system identification of an adsorption column. Neural Comput. Appl. 1, 160–165 (1993)

  12. Deng, H., Li, H.X.: Hybrid intelligence based modelling for nonlinear distributed parameter process with applications to the curing process. IEEE Trans. Syst. Man Cybern. 4, 3506–3511 (2003)

  13. Deng, H., Li, H.X.: Spectral-approximation-based intelligent modelling for distributed thermal processes. IEEE Trans. Control Syst. Technol. 13, 686–700 (2005)

  14. Gonzalez-Garcia, R., Rico-Martinez, R., Kevrekidis, I.: Identification of distributed parameter systems: a neural net based approach. Comput. Chem. Eng. 22(4-supl. 1), 965–968 (1998)

  15. Padhi, R., Balakrishnan, S., Randolph, T.: Adaptive critic based optimal neuro-control synthesis for distributed parameter systems. Automatica 37, 1223–1234 (2001)

  16. Padhi, R., Balakrishnan, S.: Proper orthogonal decomposition based optimal neuro-control synthesis of a chemical reactor process using approximate dynamic programming. Neural Networks 16, 719–728 (2003)

  17. Pietil, S., Koivo, H.N.: Centralized and decentralized neural network models for distributed parameter systems. In: Proceedings of the Symposium on Control, Optimization and Supervision, CESA’96, IMACS Multiconference on Computational Engineering in Systems Applications, Lille, France, pp. 1043–1048 (1996)

  18. Lin, C.T., Lee, C.S.G.: Neural Fuzzy Systems: A Neuro-Fuzzy Synergism to Intelligent Systems. Prentice Hall, Englewood Cliffs (1996)

  19. Babuska, R.: Fuzzy Modeling for Control. Kluwer, Norwell (1998)

  20. Baruch, I., Beltran-Lopez, R., Olivares-Guzman, J.L., Flores, J.M.: A fuzzy-neural multi-model for nonlinear systems identification and control. Fuzzy Sets Syst. 159, 2650–2667 (2008)

  21. Takagi, T., Sugeno, M.: Fuzzy identification of systems and its applications to modelling and control. IEEE Trans. Syst. Man Cybern. 15, 116–132 (1985)

  22. Teixeira, M., Zak, S.: Stabilizing controller design for uncertain nonlinear systems, using fuzzy models. IEEE Trans. Syst. Man Cybern. 7, 133–142 (1999)

  23. Mastorocostas, P.A., Theocharis, J.B.: A recurrent fuzzy-neural model for dynamic system identification. IEEE Trans. Syst. Man Cybern. B Cybern. 32, 176–190 (2002)

  24. Mastorocostas, P.A., Theocharis, J.B.: An orthogonal least-squares method for recurrent fuzzy-neural modeling. Fuzzy Sets Syst. 140(2), 285–300 (2003)

  25. Galvan-Guerra, R., Baruch, I.S.: Anaerobic digestion process identification using recurrent neural multi-model. In: Gelbukh, A., Kuri-Morales, A.F. (eds.) Sixth Mexican International Conference on Artificial Intelligence, 4–10 Nov 2007, Aguascalientes, Mexico, Special Session, Revised Papers, CPS, pp. 319–329. IEEE Computer Society, Los Alamitos. ISBN 978-0-7695-3124-3 (2008)

  26. Baruch, I., Olivares-Guzman, J.L., Mariaca-Gaspar, C.R., Galvan-Guerra, R.: A sliding mode control using fuzzy-neural hierarchical multi-model identifier. In: Castillo, O., Melin, P., Ross, O.M., Cruz, R.S., Pedrycz, W., Kacprzyk, J. (eds.) Theoretical Advances and Applications of Fuzzy Logic and Soft Computing, ASC, vol. 42, pp. 762–771. Springer, Berlin (2007)

  27. Baruch, I., Olivares-Guzman, J.L., Mariaca-Gaspar, C.R., Galvan-Guerra, R.: A fuzzy-neural hierarchical multi-model for systems identification and direct adaptive control. In: Melin, P., Castillo, O., Ramirez, E.G., Kacprzyk, J., Pedrycz, W. (eds.) Analysis and Design of Intelligent Systems Using Soft Computing Techniques, ASC, vol. 41, pp. 163–172. Springer, Berlin (2007)

  28. Aguilar-Garnica, F., Alcaraz-Gonzalez, V., Gonzalez-Alvarez, V.: Interval observer design for an anaerobic digestion process described by a distributed parameter model. In: Proceedings of the 2nd International Meeting on Environmental Biotechnology and Engineering (2IMEBE), CINVESTAV-IPN, Mexico City, paper 117 (2006), pp. 1–16

  29. Bialecki, B., Fairweather, G.: Orthogonal spline collocation methods for partial differential equations. J. Comput. Appl. Math. 128, 55–82 (2001)

  30. Baruch, I.S., Mariaca-Gaspar, C.R.: A Levenberg-Marquardt learning applied for recurrent neural identification and control of a wastewater treatment bioprocess. Int. J. Intell. Syst. 24, 1094–1114 (2009). ISSN 0884-8173

  31. Wan, E., Beaufays, F.: Diagrammatic method for deriving and relating temporal neural network algorithms. Neural Comput. 8, 182–201 (1996)

  32. Nava, F., Baruch, I.S., Poznyak, A., Nenkova, B.: Stability proofs of advanced recurrent neural networks topology and learning. Comptes Rendus, 57(1), 27–32 (2004). ISSN 0861-1459. (Proceedings of the Bulgarian Academy of Sciences)

  33. Baruch, I.S., Mariaca-Gaspar, C.R., Barrera-Cortes, J.: Recurrent neural network identification and adaptive neural control of hydrocarbon biodegradation processes. In: Hu, X., Balasubramaniam, P. (eds.) Recurrent Neural Networks, Chapter 4, pp. 61–88. I-Tech Education and Publishing KG, Vienna (2008). ISBN 978-953-7619-08-4

  34. Ngia, L.S., Sjöberg, J.: Efficient training of neural nets for nonlinear adaptive filtering using a recursive Levenberg-Marquardt algorithm. IEEE Trans. Signal Process. 48, 1915–1927 (2000)

Acknowledgments

The Ph.D. student Eloy Echeverria Saldierna is grateful to CONACYT, Mexico, for the scholarship received during his studies at the Department of Automatic Control, CINVESTAV-IPN, Mexico City, Mexico.

Author information

Correspondence to Ieroham Baruch.

Appendices

Appendix 1: Detailed Derivation of the Recursive Levenberg-Marquardt Optimal Learning Algorithm for the RTNN

We first describe the optimal off-line Newton learning method, then modify it to obtain the Gauss-Newton method, and finally simplify it to the off-line Levenberg-Marquardt learning algorithm, which is then transformed into recursive form (see [34] for more details).

The quadratic cost performance index under consideration is denoted by J_k(W), where W is the RTNN weight vector of dimension N_w, which is adjusted iteratively during the cost minimization. We assume that the performance index is an analytic function, so that all its derivatives exist.

Let us expand J_k(W) around the point W(k), which yields:

$$\begin{aligned} J_{k} \left( {\text{W}} \right) \approx & \,J_{k} \left( {{\text{W}}\left( k \right)} \right) + \nabla J_{k}^{T} \left( {{\text{W}}\left( k \right)} \right)\left[ {{\text{W}} - {\text{W}}\left( k \right)} \right] \\ & \, + \frac{1}{2}\left[ {{\text{W}} - {\text{W}}\left( k \right)} \right]^{T} \nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)\left[ {{\text{W}} - {\text{W}}\left( k \right)} \right] \\ \end{aligned}$$
(A.1)

where ∇J(W) is the gradient of J(W) with respect to the weight vector W:

$$\nabla J\left( {\text{W}} \right) = \left[ {\begin{array}{*{20}c} {\frac{\partial }{{\partial {\text{w}}_{1} }}J\left( {\text{W}} \right)} \\ {\frac{\partial }{{\partial {\text{w}}_{2} }}J\left( {\text{W}} \right)} \\ \vdots \\ {\frac{\partial }{{\partial {\text{w}}_{{N_{W} }} }}J\left( {\text{W}} \right)} \\ \end{array} } \right]$$
(A.2)

and ∇²J(W) is the Hessian matrix, defined as:

$$\nabla^{2} J\left( {\text{W}} \right) = \left[ {\begin{array}{*{20}c} {\frac{{\partial^{2} }}{{\partial {\text{w}}_{1}^{2} }}J\left( {\text{W}} \right)} & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{1} \partial {\text{w}}_{2} }}J\left( {\text{W}} \right)} & \cdots & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{1} \partial {\text{w}}_{{N_{w} }} }}J\left( {\text{W}} \right)} \\ {\frac{{\partial^{2} }}{{\partial {\text{w}}_{2} \partial {\text{w}}_{1} }}J\left( {\text{W}} \right)} & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{2}^{2} }}J\left( {\text{W}} \right)} & \cdots & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{2} \partial {\text{w}}_{{N_{w} }} }}J\left( {\text{W}} \right)} \\ \vdots & \vdots & \ddots & \vdots \\ {\frac{{\partial^{2} }}{{\partial {\text{w}}_{{N_{w} }} \partial {\text{w}}_{1} }}J\left( {\text{W}} \right)} & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{{N_{w} }} \partial {\text{w}}_{2} }}J\left( {\text{W}} \right)} & \cdots & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{{N_{w} }}^{2} }}J\left( {\text{W}} \right)} \\ \end{array} } \right]$$
(A.3)

Taking the gradient of Eq. (A.1) with respect to W and equating it to zero, we obtain:

$$\nabla J_{k} \left( {{\text{W}}\left( k \right)} \right) + \nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)\left[ {{\text{W}} - {\text{W}}\left( k \right)} \right] = 0$$
(A.4)

Solving (A.4) for W, we have:

$${\text{W}} = {\text{W}}\left( k \right) - \left( {\nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)} \right)^{ - 1} \nabla J_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.5)

Finally, we obtain Newton's learning algorithm as:

$${\text{W}}\left( {k + 1} \right) = {\text{W}}\left( k \right) - \left( {\nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)} \right)^{ - 1} \nabla J_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.6)

where W(k + 1) is the weight vector minimizing J_k(W) at instant k, i.e. J_k(W)|_{W=W(k+1)} is minimal.
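
For illustration only, the following minimal Python sketch performs one Newton step (A.6) on a hypothetical quadratic cost; the names `newton_step`, `grad`, and `hess` are illustrative and do not appear in the chapter, and the analytic gradient (A.2) and Hessian (A.3) are assumed to be available.

```python
import numpy as np

def newton_step(W, grad, hess):
    # One Newton update W(k+1) = W(k) - [grad^2 J_k]^{-1} grad J_k, cf. (A.6);
    # a linear system is solved instead of forming the explicit inverse.
    return W - np.linalg.solve(hess(W), grad(W))

# Hypothetical quadratic cost J(W) = 0.5 W^T Q W - b^T W, for which a single
# Newton step from any starting point reaches the minimizer W* = Q^{-1} b.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda W: Q @ W - b
hess = lambda W: Q
W_opt = newton_step(np.zeros(2), grad, hess)
```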

Let us suppose that W(k) is the weight vector that minimizes J_{k−1}(W) at instant k − 1; then:

$$\left. {\nabla J_{k - 1} \left( {\text{W}} \right)} \right|_{{{\text{W}} = {\text{W}}\left( k \right)}} = \nabla J_{k - 1} \left( {{\text{W}}\left( k \right)} \right) = 0$$
(A.7)

The performance index is defined as:

$$J_{k} \left( {\text{W}} \right) = \frac{1}{2}\sum\limits_{q = 1}^{k} {\alpha^{k - q} } {\text{E}}_{q}^{T} \left( {\text{W}} \right){\text{E}}_{q} \left( {\text{W}} \right)$$
(A.8)
$$J_{k} \left( {\text{W}} \right) = \frac{1}{2}\sum\limits_{q = 1}^{k} {\left( {\alpha^{k - q} \sum\limits_{j = 1}^{L} {e_{j,q}^{2} \left( {\text{W}} \right)} } \right)}$$
(A.9)

where 0 < α ≤ 1 is a forgetting factor, q is the instant of the corresponding error vector, E_q is the qth error vector, e_{j,q} is the jth element of E_q, and k is the final instant of the performance index. The ith element of the gradient is:

$$\left[ {\nabla J_{k} \left( {\text{W}} \right)} \right]_{i} \,=\, \frac{{\partial J_{k} \left( {\text{W}} \right)}}{{\partial {\text{w}}_{i} }} = \sum\limits_{q = 1}^{k} {\left( {\alpha^{k - q} \sum\limits_{j = 1}^{L} {e_{j,q} \left( {\text{W}} \right)\frac{{\partial e_{j,q} \left( {\text{W}} \right)}}{{\partial {\text{w}}_{i} }}} } \right)}$$
(A.10)
$$\begin{aligned} \left[ {\nabla J_{k} \left( {\text{W}} \right)} \right]_{i} \,=\, & \sum\limits_{q = 1}^{k} {\left( {\alpha^{k - q} \sum\limits_{j = 1}^{L} {e_{j,q} \left( {\text{W}} \right)\frac{{\partial \left( {r_{j,q} - y_{j,q} \left( {\text{W}} \right)} \right)}}{{\partial {\text{w}}_{i} }}} } \right)} \\ = & \, - \sum\limits_{q = 1}^{k} {\left( {\alpha^{k - q} \sum\limits_{j = 1}^{L} {e_{j,q} \left( {\text{W}} \right)\frac{{\partial y_{j,q} \left( {\text{W}} \right)}}{{\partial {\text{w}}_{i} }}} } \right)} \\ \end{aligned}$$
(A.11)

The matrix form of the performance index gradient is:

$$\nabla J_{k} \left( {\text{W}} \right) = - \sum\limits_{q = 1}^{k} {\alpha^{k - q} {\text{J}}_{{{\text{Y}}_{q} }}^{T} \left( {\text{W}} \right){\text{E}}_{q} \left( {\text{W}} \right)}$$
(A.12)

where the Jacobian matrix of Y_q at instant q, of dimension L × N_w, is:

$${\text{J}}_{{{\text{Y}}_{q} }} \left( {\text{W}} \right) = \left[ {\begin{array}{*{20}c} {\frac{{\partial y_{1,q} }}{{\partial w_{1} }}} & {\frac{{\partial y_{1,q} }}{{\partial w_{2} }}} & \cdots & {\frac{{\partial y_{1,q} }}{{\partial w_{{N_{w} }} }}} \\ {\frac{{\partial y_{2,q} }}{{\partial w_{1} }}} & {\frac{{\partial y_{2,q} }}{{\partial w_{2} }}} & \cdots & {\frac{{\partial y_{2,q} }}{{\partial w_{{N_{w} }} }}} \\ \vdots & \vdots & \ddots & \vdots \\ {\frac{{\partial y_{L,q} }}{{\partial w_{1} }}} & {\frac{{\partial y_{L,q} }}{{\partial w_{2} }}} & \cdots & {\frac{{\partial y_{L,q} }}{{\partial w_{Nw} }}} \\ \end{array} } \right]$$
(A.13)

The gradient can be written in the following form:

$$\nabla J_{k} \left( {\text{W}} \right) = \alpha \nabla J_{k - 1} \left( {\text{W}} \right) - {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {\text{W}} \right){\text{E}}_{k} \left( {\text{W}} \right)$$
(A.14)

Then the (h, i)th element of the Hessian matrix can be written as:

$$\begin{aligned} \left[ {\nabla^{2} J_{k} \left( {\text{W}} \right)} \right]_{h,i}\, =\, & \frac{{\partial^{2} J_{k} \left( {\text{W}} \right)}}{{\partial w_{h} \partial w_{i} }} \\ = & \sum\limits_{q = 1}^{k} {\alpha^{k - q} \sum\limits_{j = 1}^{L} {\left( {\frac{{\partial e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{h} }}\frac{{\partial e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{i} }} + e_{j,q} \left( {\text{W}} \right)\frac{{\partial^{2} e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{h} \partial w_{i} }}} \right)} } \\ \end{aligned}$$
(A.15)
$$\left[ {\nabla^{2} J_{k} \left( {\text{W}} \right)} \right]_{h,i} \,=\, \sum\limits_{q = 1}^{k} {\alpha^{k - q} \sum\limits_{j = 1}^{L} {\left( {\frac{{\partial e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{h} }}\frac{{\partial e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{i} }} + e_{j,q} \left( {\text{W}} \right)\frac{{\partial^{2} e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{h} \partial w_{i} }}} \right)} }$$
(A.16)
$$\nabla^{2} J_{k} \left( {\text{W}} \right) = \sum\limits_{q = 1}^{k} {\alpha^{k - q} \left( {{\text{J}}_{{{\text{Y}}_{q} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{q} }} \left( {\text{W}} \right) + \sum\limits_{j = 1}^{L} {e_{j,q} \left( {\text{W}} \right)\nabla^{2} e_{j,q} \left( {\text{W}} \right)} } \right)}$$
(A.17)

Neglecting the second-order term, i.e. setting:

$$\sum\limits_{j = 1}^{L} {e_{j,q} \left( {\text{W}} \right)\nabla^{2} e_{j,q} \left( {\text{W}} \right)} \approx 0$$
(A.18)

we obtain directly the Gauss-Newton method of optimal learning.

Equation (A.17) is then reduced to:

$$\nabla^{2} J_{k} \left( {\text{W}} \right) = \sum\limits_{q = 1}^{k} {\alpha^{k - q} {\text{J}}_{{{\text{Y}}_{q} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{q} }} \left( {\text{W}} \right)}$$
(A.19)

and it can also be written as:

$$\nabla^{2} J_{k} \left( {\text{W}} \right) = \alpha \nabla^{2} J_{k - 1} \left( {\text{W}} \right) + {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{k} }} \left( {\text{W}} \right)$$
(A.20)

Then (A.14) and (A.20), evaluated at W = W(k), become:

$$\nabla J_{k} \left( {{\text{W}}\left( k \right)} \right) = \alpha \nabla J_{k - 1} \left( {{\text{W}}\left( k \right)} \right) - {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{E}}_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.21)
$$\nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right) = \alpha \nabla^{2} J_{k - 1} \left( {{\text{W}}\left( k \right)} \right) + {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{W}}\left( k \right)} \right)$$
(A.22)

According to (A.7), Eq. (A.21) reduces to:

$$\nabla J_{k} \left( {{\text{W}}\left( k \right)} \right) = - {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{E}}_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.23)

Let us define:

$${\text{H}}\left( k \right) = \nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.24)

Then we can write Eq. (A.22) in the following form:

$${\text{H}}\left( k \right) = \alpha {\text{H}}\left( {k - 1} \right) + {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{W}}\left( k \right)} \right)$$
(A.25)

Finally, for the learning algorithm (A.6), we obtain:

$${\text{W}}\left( {k + 1} \right) = {\text{W}}\left( k \right) + \left( {\nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)} \right)^{ - 1} {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{E}}_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.26)
$${\text{W}}\left( {k + 1} \right) = {\text{W}}\left( k \right) + {\text{H}}^{ - 1} \left( k \right){\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{E}}_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.27)

Equation (A.27) corresponds to the Gauss-Newton learning method, in which the Hessian matrix used is an approximation of the true one.
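
A minimal sketch of one recursive Gauss-Newton step, combining (A.25) and (A.27), is given below; it assumes that the output Jacobian J_{Y_k}(W(k)) and the error vector E_k(W(k)) have already been computed (e.g. by the diagrammatic method [31]), and the function name and arguments are illustrative, not part of the chapter.

```python
import numpy as np

def gauss_newton_step(W, H_prev, J_Y, E, alpha=0.99):
    # J_Y: L x Nw output Jacobian, E: L-dimensional error vector,
    # alpha: forgetting factor, H_prev: previous approximate Hessian.
    H = alpha * H_prev + J_Y.T @ J_Y              # approximate Hessian update (A.25)
    W_next = W + np.linalg.solve(H, J_Y.T @ E)    # weight update (A.27)
    return W_next, H
```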

Let us now return to Eq. (A.19):

$${\text{H}}\left( k \right) = \nabla^{2} J_{k} \left( {\text{W}} \right) = \sum\limits_{q = 1}^{k} {\alpha^{k - q} {\text{J}}_{{{\text{Y}}_{q} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{q} }} \left( {\text{W}} \right)}$$
(A.28)

Here we observe that the product J^T J could be singular or ill-conditioned, which requires the following modification of the Hessian matrix:

$${\text{H}}\left( k \right) = {\mathbf{\nabla }}^{2} J_{k} \left( {\text{W}} \right) = \sum\limits_{q = 1}^{k} {\alpha^{k - q} \left( {{\text{J}}_{{{\text{Y}}_{q} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{q} }} \left( {\text{W}} \right) + \rho {\text{I}}} \right)}$$
(A.29)

The Hessian matrix can also be written in the form:

$${\text{H}}\left( k \right) = \alpha {\text{H}}\left( {k - 1} \right) + {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{k} }} \left( {\text{W}} \right) + \rho {\text{I}}$$
(A.30)

where ρ is a small constant (generally ρ is chosen between 10⁻² and 10⁻⁴).

This modification of the Hessian matrix is the essence of the Levenberg-Marquardt optimal learning method. The inverse of the Hessian matrix can be computed using the matrix inversion lemma, which requires the following modification:

$${\text{H}}\left( k \right) = \alpha {\text{H}}\left( {k - 1} \right) + {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{k} }} \left( {\text{W}} \right) + \rho {\text{I}}_{{N_{w} }}$$
(A.31)

where \({\text{I}}_{{N_{w} }}\) is an N_w × N_w zero matrix with a single element equal to 1, located at diagonal position i = (k mod N_w) + 1. It can be seen that, summed over N_w consecutive iterations, Eq. (A.31) becomes equal to Eq. (A.30), i.e.:

$$\sum\limits_{n = k + 1}^{{k + N_{w} }} {\rho {\text{I}}_{{N_{w} }} } \left( n \right) = \rho {\text{I}}$$
(A.32)
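
The construction of the single-entry regularization matrix in (A.31) can be sketched as follows (a hypothetical Python illustration; a 0-based index is used instead of the 1-based position i = (k mod N_w) + 1 of the text):

```python
import numpy as np

def cyclic_rho_matrix(k, Nw, rho):
    # rho * I_{Nw}(k) of (A.31): an Nw x Nw zero matrix with a single diagonal
    # entry equal to rho, whose position cycles along the diagonal as k advances.
    I_k = np.zeros((Nw, Nw))
    I_k[k % Nw, k % Nw] = rho
    return I_k

# Over Nw consecutive instants the accumulated term equals rho * I, cf. (A.32).
Nw, rho = 5, 1e-3
acc = sum(cyclic_rho_matrix(k, Nw, rho) for k in range(Nw))
assert np.allclose(acc, rho * np.eye(Nw))
```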

Then Eq. (A.31) is transformed to:

$${\text{H}}\left( k \right) = \alpha {\text{H}}\left( {k - 1} \right) +\Omega ^{T} \left( k \right)\Lambda ^{ - 1} \left( k \right)\Omega \left( k \right)$$
(A.33)

where:

$$\Omega \left( k \right) = \left[ {\begin{array}{*{20}c} {{\text{J}}_{{{\text{Y}}_{k} }} \left( {\text{W}} \right)} \\ {\begin{array}{*{20}c} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\ \end{array} } \\ \end{array} } \right]$$
(A.34)
$$\Lambda \left( k \right)^{ - 1} = \left[ {\begin{array}{*{20}c} {\text{I}} & 0 \\ 0 & \rho \\ \end{array} } \right]$$
(A.35)

It is now easy to apply the matrix inversion lemma, given by the following equation (where the matrices A, B, C, and D have compatible dimensions and the product BCD and the sum A + BCD exist):

$$\left[ {{\text{A}} + {\text{BCD}}} \right]^{ - 1} = {\text{A}}^{ - 1} - {\text{A}}^{ - 1} {\text{B}}\left[ {{\text{DA}}^{ - 1} {\text{B}} + {\text{C}}^{ - 1} } \right]^{ - 1} {\text{DA}}^{ - 1}$$
(A.36)

Let us apply the following substitutions:

$${\text{A}} = \alpha {\text{H}}\left( {k - 1} \right);{\text{B}} =\Omega ^{T} \left( k \right);{\text{C}} =\Lambda ^{ - 1} \left( k \right);{\text{D}} =\Omega \left( k \right)$$

The inverse of the Hessian matrix H(k) can then be computed using the expression:

$$\begin{aligned} {\text{H}}^{ - 1} \left( k \right) = & \,\left[ {\alpha {\text{H}}\left( {k - 1} \right) +\Omega ^{T} \left( k \right)\Lambda ^{ - 1} \left( k \right)\Omega \left( k \right)} \right]^{ - 1} = \alpha^{ - 1} {\text{H}}^{ - 1} \left( {k - 1} \right) \\ & \, - \alpha^{ - 1} {\text{H}}^{ - 1} \left( {k - 1} \right)\Omega ^{T} \left( k \right)\left[ {\Omega \left( k \right)\alpha^{ - 1} {\text{H}}^{ - 1} \left( {k - 1} \right)\Omega ^{T} \left( k \right) +\Lambda \left( k \right)} \right]^{ - 1} \\ & \,\Omega \left( k \right)\alpha^{ - 1} {\text{H}}^{ - 1} \left( {k - 1} \right) \\ \end{aligned}$$
(A.37)
$$\begin{aligned} {\text{H}}^{ - 1} \left( k \right) = & \,\alpha^{ - 1} \left\{ {{\text{H}}^{ - 1} \left( {k - 1} \right)} \right. \\ & \, - {\text{H}}^{ - 1} \left( {k - 1} \right)\Omega ^{T} \left( k \right)\left[ {\Omega \left( k \right){\text{H}}^{ - 1} \left( {k - 1} \right)\Omega ^{T} \left( k \right) + \alpha\Lambda \left( k \right)} \right]^{ - 1} \\ & \,\Omega \left( k \right)\left. {{\text{H}}^{ - 1} \left( {k - 1} \right)} \right\} \\ \end{aligned}$$
(A.38)

Let us denote

$${\text{P}}\left( k \right) = {\text{H}}^{ - 1} \left( k \right)$$

Substituting it into Eq. (A.38), we obtain:

$${\text{P}}\left( k \right) = \alpha^{ - 1} \left\{ {{\text{P}}\left( {k - 1} \right)} \right. - {\text{P}}\left( {k - 1} \right)\Omega ^{T} \left( k \right){\text{S}}^{ - 1} \left( k \right)\Omega \left( k \right)\left. {{\text{P}}\left( {k - 1} \right)} \right\}$$
(A.39)

where:

$${\text{S}}\left( k \right) = \alpha\Lambda \left( k \right) +\Omega \left( k \right){\text{P}}\left( {k - 1} \right)\Omega ^{T} \left( k \right)$$
(A.40)

Finally, the learning algorithm for W is obtained as:

$${\text{W}}\left( {k + 1} \right) = {\text{W}}\left( k \right) + {\text{P}}\left( k \right){\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{E}}_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.41)

where W is an N_w × 1 vector formed of all RTNN weights (N_w = L × N + N + N × M).
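
The complete recursive update given by (A.34), (A.35), and (A.39)–(A.41) can be sketched in Python as follows; this is a minimal illustration assuming that the output Jacobian J_{Y_k}(W(k)) and the error vector E_k(W(k)) are supplied by the RTNN, and the function and variable names are hypothetical.

```python
import numpy as np

def rlm_step(W, P_prev, J_Y, E, k, alpha=0.99, rho=1e-3):
    # W: Nw weight vector, P_prev: Nw x Nw inverse Hessian P(k-1),
    # J_Y: L x Nw output Jacobian, E: L error vector, k: current instant,
    # alpha: forgetting factor, rho: regularization constant.
    L, Nw = J_Y.shape
    # Omega(k): Jacobian augmented with a unit row at position k mod Nw  (A.34)
    e_row = np.zeros((1, Nw))
    e_row[0, k % Nw] = 1.0
    Omega = np.vstack([J_Y, e_row])
    # Lambda(k) is block-diagonal with I (L x L) and 1/rho, so that
    # Lambda^{-1}(k) = diag(I, rho), cf. (A.35)
    Lam = np.diag(np.concatenate([np.ones(L), [1.0 / rho]]))
    # S(k) and P(k) = H^{-1}(k) via the matrix inversion lemma  (A.39)-(A.40)
    S = alpha * Lam + Omega @ P_prev @ Omega.T
    P = (P_prev - P_prev @ Omega.T @ np.linalg.solve(S, Omega @ P_prev)) / alpha
    # Weight update  (A.41)
    W_next = W + P @ J_Y.T @ E
    return W_next, P
```

A common choice in recursive least-squares type algorithms (not stated in the chapter) is to initialize P(0) as a large multiple of the identity matrix.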

Using the RTNN topology the weight vector has the following form:

$${\text{W}}\left( k \right) = \left[ {\begin{array}{*{20}c} {c_{1,1} } & \cdots & {c_{L,N} } & {a_{1,1} } & {a_{2,2} } & \cdots & {a_{N,N} } & {b_{1,1} } & \cdots & {b_{N,M} } \\ \end{array} } \right]^{T}$$
(A.42)

and the Jacobian matrix, of dimension L × N_w, is formed as:

$${\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{W}}\left( k \right)} \right) = \left[ {\begin{array}{*{20}c} {{\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{C}}\left( k \right)} \right)} & {{\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{A}}\left( k \right)} \right)} & {{\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{B}}\left( k \right)} \right)} \\ \end{array} } \right]$$
(A.43)

The components of the Jacobian matrix can be obtained by applying the diagrammatic method [31]. Using the notation of Section 2.2 for (A.43), we can write:

$${\text{DY}}\left[ {{\text{W}}\left( {\text{k}} \right)} \right] = \left[ {{\text{DY}}\left( {{\text{C}}_{\text{ij}} \left( {\text{k}} \right)} \right),{\text{DY}}\left( {{\text{A}}_{\text{ij}} \left( {\text{k}} \right)} \right),{\text{DY}}\left( {{\text{B}}_{\text{ij}} \left( {\text{k}} \right)} \right)} \right].$$
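
The diagrammatic method itself is not reproduced here; as a purely illustrative aid, the analytically obtained sensitivities can be cross-checked numerically with a finite-difference approximation of the Jacobian (A.13), where `rtnn_output` is a hypothetical forward pass of the RTNN returning its L-dimensional output.

```python
import numpy as np

def fd_jacobian(rtnn_output, W, eps=1e-6):
    # Finite-difference approximation of the L x Nw Jacobian J_{Y_k}(W), used
    # only as a numerical sanity check of the analytic sensitivities obtained
    # with the diagrammatic method [31].
    y0 = np.asarray(rtnn_output(W))
    J = np.zeros((y0.size, W.size))
    for i in range(W.size):
        Wp = W.copy()
        Wp[i] += eps
        J[:, i] = (np.asarray(rtnn_output(Wp)) - y0) / eps
    return J
```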

Appendix 2

Table A1 Abbreviations used in the chapter

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Baruch, I., Saldierna, E.E. (2014). Decentralized Fuzzy-Neural Identification and I-Term Adaptive Control of Distributed Parameter Bioprocess Plant. In: Balas, V., Koprinkova-Hristova, P., Jain, L. (eds) Innovations in Intelligent Machines-5. Studies in Computational Intelligence, vol 561. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43370-6_1

  • DOI: https://doi.org/10.1007/978-3-662-43370-6_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-43369-0

  • Online ISBN: 978-3-662-43370-6
