Decentralized Fuzzy-Neural Identification and I-Term Adaptive Control of Distributed Parameter Bioprocess Plant

Part of the book series: Studies in Computational Intelligence (SCI, volume 561)

Abstract

The chapter proposes the use of a Recurrent Neural Network Model (RNNM), incorporated in a fuzzy-neural multi-model, for decentralized identification of an anaerobic digestion process carried out in a fixed-bed and recirculation-tank anaerobic wastewater treatment system. The analytical model of the digestion bioprocess represents a distributed parameter system, which is reduced to a lumped system using the orthogonal collocation method applied in four collocation points. The proposed decentralized RNNM consists of five independently working Recurrent Neural Networks (RNNs), which approximate the process dynamics in the four measurement points plus the recirculation tank. The RNN learning algorithm is the second-order Levenberg-Marquardt algorithm. The comparative graphical simulation results of the approximation of the digestion wastewater treatment system, obtained via decentralized RNN learning, exhibit good convergence and precise tracking of the plant variables. The identification results are then used for I-term direct and indirect (sliding mode) control, with good results.

References

  1. Haykin, S.: Neural Networks: A Comprehensive Foundation, 2nd edn, Section 2.13, pp. 84–89; Section 4.13, pp. 208–213. Prentice-Hall, Upper Saddle River (1999)

  2. Narendra, K.S., Parthasarathy, K.: Identification and control of dynamical systems using neural networks. IEEE Trans. Neural Networks 1(1), 4–27 (1990)

  3. Chen, S., Billings, S.A.: Neural networks for nonlinear dynamics system modelling and identification. Int. J. Control 56(2), 319–346 (1992)

  4. Hunt, K.J., Sbarbaro, D., Zbikowski, R., Gawthrop, P.J.: Neural network for control systems (A survey). Automatica 28, 1083–1112 (1992)

  5. Miller III, W.T., Sutton, R.S., Werbos, P.J.: Neural Networks for Control. MIT Press, London (1992)

  6. Pao, S.A., Phillips, S.M., Sobajic, D.J.: Neural net computing and intelligent control systems. Int. J. Control 56(3), 263–289 (1992). (Special issue on Intelligent Control)

  7. Su, H.-T., McAvoy, T.J., Werbos, P.: Long-term predictions of chemical processes using recurrent neural networks: a parallel training approach. Ind. Eng. Chem. Res. 31(5), 1338–1352 (1992)

  8. Boskovic, J.D., Narendra, K.S.: Comparison of linear, nonlinear and neural-network-based adaptive controllers for a class of fed-batch fermentation processes. Automatica 31, 817–840 (1995)

  9. Omatu, S., Khalil, M., Yusof, R.: Neuro-Control and Its Applications. Springer, London (1995)

  10. Baruch, I.S., Garrido, R.: A direct adaptive neural control scheme with integral terms. Int. J. Intell. Syst. 20(2), 213–224 (2005). ISSN 0884-8173. (Special issue on Soft Computing for Modelling, Simulation and Control of Nonlinear Dynamical Systems, Castillo, O., Melin, P. guest ed. Wiley Inter-Science)

  11. Bulsari, A., Palosaari, S.: Application of neural networks for system identification of an adsorption column. Neural Comput. Appl. 1, 160–165 (1993)

  12. Deng, H., Li, H.X.: Hybrid intelligence based modelling for nonlinear distributed parameter process with applications to the curing process. IEEE Trans. Syst. Man Cybern. 4, 3506–3511 (2003)

  13. Deng, H., Li, H.X.: Spectral-approximation-based intelligent modelling for distributed thermal processes. IEEE Trans. Control Syst. Technol. 13, 686–700 (2005)

  14. Gonzalez-Garcia, R., Rico-Martinez, R., Kevrekidis, I.: Identification of distributed parameter systems: a neural net based approach. Comput. Chem. Eng. 22(4-supl. 1), 965–968 (1998)

  15. Padhi, R., Balakrishnan, S., Randolph, T.: Adaptive critic based optimal neuro-control synthesis for distributed parameter systems. Automatica 37, 1223–1234 (2001)

  16. Padhi, R., Balakrishnan, S.: Proper orthogonal decomposition based optimal neuro-control synthesis of a chemical reactor process using approximate dynamic programming. Neural Networks 16, 719–728 (2003)

  17. Pietil, S., Koivo, H.N.: Centralized and decentralized neural network models for distributed parameter systems. In: Proceedings of the Symposium on Control, Optimization and Supervision, CESA’96, IMACS Multiconference on Computational Engineering in Systems Applications, Lille, France, pp. 1043–1048 (1996)

  18. Lin, C.T., Lee, C.S.G.: Neural Fuzzy Systems: A Neuro-Fuzzy Synergism to Intelligent Systems. Prentice Hall, Englewood Cliffs (1996)

  19. Babuska, R.: Fuzzy Modeling for Control. Kluwer, Norwell (1998)

  20. Baruch, I., Beltran-Lopez, R., Olivares-Guzman, J.L., Flores, J.M.: A fuzzy-neural multi-model for nonlinear systems identification and control. Fuzzy Sets Syst. 159, 2650–2667 (2008)

  21. Takagi, T., Sugeno, M.: Fuzzy identification of systems and its applications to modelling and control. IEEE Trans. Syst. Man Cybern. 15, 116–132 (1985)

  22. Teixeira, M., Zak, S.: Stabilizing controller design for uncertain nonlinear systems, using fuzzy models. IEEE Trans. Syst. Man Cybern. 7, 133–142 (1999)

  23. Mastorocostas, P.A., Theocharis, J.B.: A recurrent fuzzy-neural model for dynamic system identification. IEEE Trans. Syst. Man Cybern. B Cybern. 32, 176–190 (2002)

  24. Mastorocostas, P.A., Theocharis, J.B.: An orthogonal least-squares method for recurrent fuzzy-neural modeling. Fuzzy Sets Syst. 140(2), 285–300 (2003)

  25. Galvan-Guerra, R., Baruch, I.S.: Anaerobic digestion process identification using recurrent neural multi-model. In: Gelbukh, A., Kuri-Morales, A.F. (eds.) Sixth Mexican International Conference on Artificial Intelligence, 4–10 Nov 2007, Aguascalientes, Mexico, Special Session, Revised Papers, CPS, pp. 319–329. IEEE Computer Society, Los Alamitos. ISBN 978-0-7695-3124-3 (2008)

  26. Baruch, I., Olivares-Guzman, J.L., Mariaca-Gaspar, C.R., Galvan-Guerra, R.: A sliding mode control using fuzzy-neural hierarchical multi-model identifier. In: Castillo, O., Melin, P., Ross, O.M., Cruz, R.S., Pedrycz, W., Kacprzyk, J. (eds.) Theoretical Advances and Applications of Fuzzy Logic and Soft Computing, ASC, vol. 42, pp. 762–771. Springer, Berlin (2007)

  27. Baruch, I., Olivares-Guzman, J.L., Mariaca-Gaspar, C.R., Galvan-Guerra, R.: A fuzzy-neural hierarchical multi-model for systems identification and direct adaptive control. In: Melin, P., Castillo, O., Ramirez, E.G., Kacprzyk, J., Pedrycz, W. (eds.) Analysis and Design of Intelligent Systems Using Soft Computing Techniques, ASC, vol. 41, pp. 163–172. Springer, Berlin (2007)

  28. Aguilar-Garnica, F., Alcaraz-Gonzalez, V., Gonzalez-Alvarez, V.: Interval observer design for an anaerobic digestion process described by a distributed parameter model. In: Proceedings of the 2nd International Meeting on Environmental Biotechnology and Engineering (2IMEBE), CINVESTAV-IPN, Mexico City, paper 117 (2006), pp. 1–16

  29. Bialecki, B., Fairweather, G.: Orthogonal spline collocation methods for partial differential equations. J. Comput. Appl. Math. 128, 55–82 (2001)

  30. Baruch, I.S., Mariaca-Gaspar, C.R.: A Levenberg-Marquardt learning applied for recurrent neural identification and control of a wastewater treatment bioprocess. Int. J. Intell. Syst. 24, 1094–1114 (2009). ISSN 0884-8173

  31. Wan, E., Beaufays, F.: Diagrammatic method for deriving and relating temporal neural network algorithms. Neural Comput. 8, 182–201 (1996)

  32. Nava, F., Baruch, I.S., Poznyak, A., Nenkova, B.: Stability proofs of advanced recurrent neural networks topology and learning. Comptes Rendus, 57(1), 27–32 (2004). ISSN 0861-1459. (Proceedings of the Bulgarian Academy of Sciences)

  33. Baruch, I.S., Mariaca-Gaspar, C.R., Barrera-Cortes, J.: Recurrent neural network identification and adaptive neural control of hydrocarbon biodegradation processes. In: Hu, X., Balasubramaniam, P. (eds.) Recurrent Neural Networks, Chapter 4, pp. 61–88. I-Tech Education and Publishing KG, Vienna (2008). ISBN 978-953-7619-08-4

  34. Ngia, L.S., Sjöberg, J.: Efficient training of neural nets for nonlinear adaptive filtering using a recursive Levenberg-Marquardt algorithm. IEEE Trans. Signal Process. 48, 1915–1927 (2000)

Acknowledgments

The Ph.D. student Eloy Echeverria Saldierna is grateful to CONACYT, Mexico, for the scholarship received during his studies at the Department of Automatic Control, CINVESTAV-IPN, Mexico City, Mexico.

Author information

Correspondence to Ieroham Baruch.

Appendices

Appendix 1: Detailed Derivation of the Recursive Levenberg-Marquardt Optimal Learning Algorithm for the RTNN

We first describe the optimal off-line Newton learning method, then modify it to obtain the Gauss-Newton method, and finally simplify it to the off-line Levenberg-Marquardt learning algorithm, which is then transformed into recursive form (see [34] for more details).

The quadratic cost performance index under consideration is denoted by J_k(W), where W is the RTNN weight vector of dimension N_w, which is adjusted iteratively during the cost minimization. We assume that the performance index is an analytic function, so that all its derivatives exist.

Let us expand J_k(W) around the point W(k), which yields:

$$\begin{aligned} J_{k} \left( {\text{W}} \right) \approx & \,J_{k} \left( {{\text{W}}\left( k \right)} \right) + \nabla J_{k}^{T} \left( {{\text{W}}\left( k \right)} \right)\left[ {{\text{W}} - {\text{W}}\left( k \right)} \right] \\ & \, + \frac{1}{2}\left[ {{\text{W}} - {\text{W}}\left( k \right)} \right]^{T} \nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)\left[ {{\text{W}} - {\text{W}}\left( k \right)} \right] \\ \end{aligned}$$
(A.1)

where ∇J(W) is the gradient of J(W) with respect to the weight vector W:

$$\nabla J\left( {\text{W}} \right) = \left[ {\begin{array}{*{20}c} {\frac{\partial }{{\partial {\text{w}}_{1} }}J\left( {\text{W}} \right)} \\ {\frac{\partial }{{\partial {\text{w}}_{2} }}J\left( {\text{W}} \right)} \\ \vdots \\ {\frac{\partial }{{\partial {\text{w}}_{{N_{W} }} }}J\left( {\text{W}} \right)} \\ \end{array} } \right]$$
(A.2)

and ∇²J(W) is the Hessian matrix, defined as:

$$\nabla^{2} J\left( {\text{W}} \right) = \left[ {\begin{array}{*{20}c} {\frac{{\partial^{2} }}{{\partial {\text{w}}_{1}^{2} }}J\left( {\text{W}} \right)} & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{1} \partial {\text{w}}_{2} }}J\left( {\text{W}} \right)} & \cdots & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{1} \partial {\text{w}}_{{N_{w} }} }}J\left( {\text{W}} \right)} \\ {\frac{{\partial^{2} }}{{\partial {\text{w}}_{2} \partial {\text{w}}_{1} }}J\left( {\text{W}} \right)} & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{2}^{2} }}J\left( {\text{W}} \right)} & \cdots & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{2} \partial {\text{w}}_{{N_{w} }} }}J\left( {\text{W}} \right)} \\ \vdots & \vdots & \ddots & \vdots \\ {\frac{{\partial^{2} }}{{\partial {\text{w}}_{{N_{w} }} \partial {\text{w}}_{1} }}J\left( {\text{W}} \right)} & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{{N_{w} }} \partial {\text{w}}_{2} }}J\left( {\text{W}} \right)} & \cdots & {\frac{{\partial^{2} }}{{\partial {\text{w}}_{{N_{w} }}^{2} }}J\left( {\text{W}} \right)} \\ \end{array} } \right]$$
(A.3)

Taking the gradient of Eq. (A.1) with respect to W and equating it to zero, we obtain:

$$\nabla J_{k} \left( {{\text{W}}\left( k \right)} \right) + \nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)\left[ {{\text{W}} - {\text{W}}\left( k \right)} \right] = 0$$
(A.4)

Solving (A.4) for W, we have:

$${\text{W}} = {\text{W}}\left( k \right) - \left( {\nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)} \right)^{ - 1} \nabla J_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.5)

Finally, we obtain Newton's learning algorithm as:

$${\text{W}}\left( {k + 1} \right) = {\text{W}}\left( k \right) - \left( {\nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)} \right)^{ - 1} \nabla J_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.6)

where W(k + 1) is the weight vector minimizing J_k(W) at instant k, i.e. J_k(W)|_{W=W(k+1)} is minimal.
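
For illustration only, the following minimal Python sketch performs one Newton step (A.6) on a hypothetical quadratic cost; the names `newton_step`, `grad`, and `hess` are illustrative and do not appear in the chapter, and the analytic gradient (A.2) and Hessian (A.3) are assumed to be available.

```python
import numpy as np

def newton_step(W, grad, hess):
    # One Newton update W(k+1) = W(k) - [grad^2 J_k]^{-1} grad J_k, cf. (A.6);
    # a linear system is solved instead of forming the explicit inverse.
    return W - np.linalg.solve(hess(W), grad(W))

# Hypothetical quadratic cost J(W) = 0.5 W^T Q W - b^T W, for which a single
# Newton step from any starting point reaches the minimizer W* = Q^{-1} b.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda W: Q @ W - b
hess = lambda W: Q
W_opt = newton_step(np.zeros(2), grad, hess)
```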

Let us suppose that W(k) is the weight vector that minimizes J_{k−1}(W) at instant k − 1; then:

$$\left. {\nabla J_{k - 1} \left( {\text{W}} \right)} \right|_{{{\text{W}} = {\text{W}}\left( k \right)}} = \nabla J_{k - 1} \left( {{\text{W}}\left( k \right)} \right) = 0$$
(A.7)

The performance index is defined as:

$$J_{k} \left( {\text{W}} \right) = \frac{1}{2}\sum\limits_{q = 1}^{k} {\alpha^{k - q} } {\text{E}}_{q}^{T} \left( {\text{W}} \right){\text{E}}_{q} \left( {\text{W}} \right)$$
(A.8)
$$J_{k} \left( {\text{W}} \right) = \frac{1}{2}\sum\limits_{q = 1}^{k} {\left( {\alpha^{k - q} \sum\limits_{j = 1}^{L} {e_{j,q}^{2} \left( {\text{W}} \right)} } \right)}$$
(A.9)

where 0 < α ≤ 1 is a forgetting factor, q is the instant of the corresponding error vector, E_q is the qth error vector, e_{j,q} is the jth element of E_q, and k is the final instant of the performance index. The ith element of the gradient is:

$$\left[ {\nabla J_{k} \left( {\text{W}} \right)} \right]_{i} \,=\, \frac{{\partial J_{k} \left( {\text{W}} \right)}}{{\partial {\text{w}}_{i} }} = \sum\limits_{q = 1}^{k} {\left( {\alpha^{k - q} \sum\limits_{j = 1}^{L} {e_{j,q} \left( {\text{W}} \right)\frac{{\partial e_{j,q} \left( {\text{W}} \right)}}{{\partial {\text{w}}_{i} }}} } \right)}$$
(A.10)
$$\begin{aligned} \left[ {\nabla J_{k} \left( {\text{W}} \right)} \right]_{i} \,=\, & \sum\limits_{q = 1}^{k} {\left( {\alpha^{k - q} \sum\limits_{j = 1}^{L} {e_{j,q} \left( {\text{W}} \right)\frac{{\partial \left( {r_{j,q} - y_{j,q} \left( {\text{W}} \right)} \right)}}{{\partial {\text{w}}_{i} }}} } \right)} \\ = & \, - \sum\limits_{q = 1}^{k} {\left( {\alpha^{k - q} \sum\limits_{j = 1}^{L} {e_{j,q} \left( {\text{W}} \right)\frac{{\partial y_{j,q} \left( {\text{W}} \right)}}{{\partial {\text{w}}_{i} }}} } \right)} \\ \end{aligned}$$
(A.11)

The matrix form of the performance index gradient is:

$$\nabla J_{k} \left( {\text{W}} \right) = - \sum\limits_{q = 1}^{k} {\alpha^{k - q} {\text{J}}_{{{\text{Y}}_{q} }}^{T} \left( {\text{W}} \right){\text{E}}_{q} \left( {\text{W}} \right)}$$
(A.12)

where the Jacobian matrix of Y_q at instant q, of dimension L × N_w, is:

$${\text{J}}_{{{\text{Y}}_{q} }} \left( {\text{W}} \right) = \left[ {\begin{array}{*{20}c} {\frac{{\partial y_{1,q} }}{{\partial w_{1} }}} & {\frac{{\partial y_{1,q} }}{{\partial w_{2} }}} & \cdots & {\frac{{\partial y_{1,q} }}{{\partial w_{{N_{w} }} }}} \\ {\frac{{\partial y_{2,q} }}{{\partial w_{1} }}} & {\frac{{\partial y_{2,q} }}{{\partial w_{2} }}} & \cdots & {\frac{{\partial y_{2,q} }}{{\partial w_{{N_{w} }} }}} \\ \vdots & \vdots & \ddots & \vdots \\ {\frac{{\partial y_{L,q} }}{{\partial w_{1} }}} & {\frac{{\partial y_{L,q} }}{{\partial w_{2} }}} & \cdots & {\frac{{\partial y_{L,q} }}{{\partial w_{Nw} }}} \\ \end{array} } \right]$$
(A.13)

The gradient can be written in the following form:

$$\nabla J_{k} \left( {\text{W}} \right) = \alpha \nabla J_{k - 1} \left( {\text{W}} \right) - {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {\text{W}} \right){\text{E}}_{k} \left( {\text{W}} \right)$$
(A.14)

Then the (h, i)th element of the Hessian matrix can be written as:

$$\begin{aligned} \left[ {\nabla^{2} J_{k} \left( {\text{W}} \right)} \right]_{h,i}\, =\, & \frac{{\partial^{2} J_{k} \left( {\text{W}} \right)}}{{\partial w_{h} \partial w_{i} }} \\ = & \sum\limits_{q = 1}^{k} {\alpha^{k - q} \sum\limits_{j = 1}^{L} {\left( {\frac{{\partial e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{h} }}\frac{{\partial e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{i} }} + e_{j,q} \left( {\text{W}} \right)\frac{{\partial^{2} e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{h} \partial w_{i} }}} \right)} } \\ \end{aligned}$$
(A.15)
$$\left[ {\nabla^{2} J_{k} \left( {\text{W}} \right)} \right]_{h,i} \,=\, \sum\limits_{q = 1}^{k} {\alpha^{k - q} \sum\limits_{j = 1}^{L} {\left( {\frac{{\partial e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{h} }}\frac{{\partial e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{i} }} + e_{j,q} \left( {\text{W}} \right)\frac{{\partial^{2} e_{j,q} \left( {\text{W}} \right)}}{{\partial w_{h} \partial w_{i} }}} \right)} }$$
(A.16)
$$\nabla^{2} J_{k} \left( {\text{W}} \right) = \sum\limits_{q = 1}^{k} {\alpha^{k - q} \left( {{\text{J}}_{{{\text{Y}}_{q} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{q} }} \left( {\text{W}} \right) + \sum\limits_{j = 1}^{L} {e_{j,q} \left( {\text{W}} \right)\nabla^{2} e_{j,q} \left( {\text{W}} \right)} } \right)}$$
(A.17)

Neglecting the second-order term, i.e. setting:

$$\sum\limits_{j = 1}^{L} {e_{j,q} \left( {\text{W}} \right)\nabla^{2} e_{j,q} \left( {\text{W}} \right)} \approx 0$$
(A.18)

we obtain directly the Gauss-Newton method of optimal learning.

Equation (A.17) is then reduced to:

$$\nabla^{2} J_{k} \left( {\text{W}} \right) = \sum\limits_{q = 1}^{k} {\alpha^{k - q} {\text{J}}_{{{\text{Y}}_{q} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{q} }} \left( {\text{W}} \right)}$$
(A.19)

and it can also be written as:

$$\nabla^{2} J_{k} \left( {\text{W}} \right) = \alpha \nabla^{2} J_{k - 1} \left( {\text{W}} \right) + {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{k} }} \left( {\text{W}} \right)$$
(A.20)

Then (A.14) and (A.20), evaluated at W = W(k), become:

$$\nabla J_{k} \left( {{\text{W}}\left( k \right)} \right) = \alpha \nabla J_{k - 1} \left( {{\text{W}}\left( k \right)} \right) - {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{E}}_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.21)
$$\nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right) = \alpha \nabla^{2} J_{k - 1} \left( {{\text{W}}\left( k \right)} \right) + {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{W}}\left( k \right)} \right)$$
(A.22)

According to (A.7), Eq. (A.21) reduces to:

$$\nabla J_{k} \left( {{\text{W}}\left( k \right)} \right) = - {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{E}}_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.23)

Let us define:

$${\text{H}}\left( k \right) = \nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.24)

Then we can write Eq. (A.22) in the following form:

$${\text{H}}\left( k \right) = \alpha {\text{H}}\left( {k - 1} \right) + {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{W}}\left( k \right)} \right)$$
(A.25)

Finally, for the learning algorithm (A.6), we obtain:

$${\text{W}}\left( {k + 1} \right) = {\text{W}}\left( k \right) + \left( {\nabla^{2} J_{k} \left( {{\text{W}}\left( k \right)} \right)} \right)^{ - 1} {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{E}}_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.26)
$${\text{W}}\left( {k + 1} \right) = {\text{W}}\left( k \right) + {\text{H}}^{ - 1} \left( k \right){\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{E}}_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.27)

Equation (A.27) corresponds to the Gauss-Newton learning method, in which the Hessian matrix used is an approximation of the true one.
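
A minimal sketch of one recursive Gauss-Newton step, combining (A.25) and (A.27), is given below; it assumes that the output Jacobian J_{Y_k}(W(k)) and the error vector E_k(W(k)) have already been computed (e.g. by the diagrammatic method [31]), and the function name and arguments are illustrative, not part of the chapter.

```python
import numpy as np

def gauss_newton_step(W, H_prev, J_Y, E, alpha=0.99):
    # J_Y: L x Nw output Jacobian, E: L-dimensional error vector,
    # alpha: forgetting factor, H_prev: previous approximate Hessian.
    H = alpha * H_prev + J_Y.T @ J_Y              # approximate Hessian update (A.25)
    W_next = W + np.linalg.solve(H, J_Y.T @ E)    # weight update (A.27)
    return W_next, H
```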

Let us now return to Eq. (A.19):

$${\text{H}}\left( k \right) = \nabla^{2} J_{k} \left( {\text{W}} \right) = \sum\limits_{q = 1}^{k} {\alpha^{k - q} {\text{J}}_{{{\text{Y}}_{q} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{q} }} \left( {\text{W}} \right)}$$
(A.28)

Here we observe that the product J^T J could be singular or ill-conditioned, which requires the following modification of the Hessian matrix:

$${\text{H}}\left( k \right) = {\mathbf{\nabla }}^{2} J_{k} \left( {\text{W}} \right) = \sum\limits_{q = 1}^{k} {\alpha^{k - q} \left( {{\text{J}}_{{{\text{Y}}_{q} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{q} }} \left( {\text{W}} \right) + \rho {\text{I}}} \right)}$$
(A.29)

The Hessian matrix can also be written in the form:

$${\text{H}}\left( k \right) = \alpha {\text{H}}\left( {k - 1} \right) + {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{k} }} \left( {\text{W}} \right) + \rho {\text{I}}$$
(A.30)

where ρ is a small constant (generally ρ is chosen between 10⁻² and 10⁻⁴).

This modification of the Hessian matrix is the essence of the Levenberg-Marquardt optimal learning method. The inverse of the Hessian matrix can be computed using the matrix inversion lemma, which requires the following modification:

$${\text{H}}\left( k \right) = \alpha {\text{H}}\left( {k - 1} \right) + {\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {\text{W}} \right){\text{J}}_{{{\text{Y}}_{k} }} \left( {\text{W}} \right) + \rho {\text{I}}_{{N_{w} }}$$
(A.31)

where \({\text{I}}_{{N_{w} }}\) is an N_w × N_w zero matrix with a single element equal to 1, located at diagonal position i = (k mod N_w) + 1. It can be seen that, summed over N_w consecutive iterations, Eq. (A.31) becomes equal to Eq. (A.30), i.e.:

$$\sum\limits_{n = k + 1}^{{k + N_{w} }} {\rho {\text{I}}_{{N_{w} }} } \left( n \right) = \rho {\text{I}}$$
(A.32)
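
The construction of the single-entry regularization matrix in (A.31) can be sketched as follows (a hypothetical Python illustration; a 0-based index is used instead of the 1-based position i = (k mod N_w) + 1 of the text):

```python
import numpy as np

def cyclic_rho_matrix(k, Nw, rho):
    # rho * I_{Nw}(k) of (A.31): an Nw x Nw zero matrix with a single diagonal
    # entry equal to rho, whose position cycles along the diagonal as k advances.
    I_k = np.zeros((Nw, Nw))
    I_k[k % Nw, k % Nw] = rho
    return I_k

# Over Nw consecutive instants the accumulated term equals rho * I, cf. (A.32).
Nw, rho = 5, 1e-3
acc = sum(cyclic_rho_matrix(k, Nw, rho) for k in range(Nw))
assert np.allclose(acc, rho * np.eye(Nw))
```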

Then Eq. (A.31) is transformed to:

$${\text{H}}\left( k \right) = \alpha {\text{H}}\left( {k - 1} \right) +\Omega ^{T} \left( k \right)\Lambda ^{ - 1} \left( k \right)\Omega \left( k \right)$$
(A.33)

where:

$$\Omega \left( k \right) = \left[ {\begin{array}{*{20}c} {{\text{J}}_{{{\text{Y}}_{k} }} \left( {\text{W}} \right)} \\ {\begin{array}{*{20}c} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\ \end{array} } \\ \end{array} } \right]$$
(A.34)
$$\Lambda \left( k \right)^{ - 1} = \left[ {\begin{array}{*{20}c} {\text{I}} & 0 \\ 0 & \rho \\ \end{array} } \right]$$
(A.35)

It is now easy to apply the matrix inversion lemma, given by the following equation (where the matrices A, B, C, and D have compatible dimensions and the product BCD and the sum A + BCD exist):

$$\left[ {{\text{A}} + {\text{BCD}}} \right]^{ - 1} = {\text{A}}^{ - 1} - {\text{A}}^{ - 1} {\text{B}}\left[ {{\text{DA}}^{ - 1} {\text{B}} + {\text{C}}^{ - 1} } \right]^{ - 1} {\text{DA}}^{ - 1}$$
(A.36)

Let us apply the following substitutions:

$${\text{A}} = \alpha {\text{H}}\left( {k - 1} \right);{\text{B}} =\Omega ^{T} \left( k \right);{\text{C}} =\Lambda ^{ - 1} \left( k \right);{\text{D}} =\Omega \left( k \right)$$

The inverse of the Hessian matrix H(k) can then be computed using the expression:

$$\begin{aligned} {\text{H}}^{ - 1} \left( k \right) = & \,\left[ {\alpha {\text{H}}\left( {k - 1} \right) +\Omega ^{T} \left( k \right)\Lambda ^{ - 1} \left( k \right)\Omega \left( k \right)} \right]^{ - 1} = \alpha^{ - 1} {\text{H}}^{ - 1} \left( {k - 1} \right) \\ & \, - \alpha^{ - 1} {\text{H}}^{ - 1} \left( {k - 1} \right)\Omega ^{T} \left( k \right)\left[ {\Omega \left( k \right)\alpha^{ - 1} {\text{H}}^{ - 1} \left( {k - 1} \right)\Omega ^{T} \left( k \right) +\Lambda \left( k \right)} \right]^{ - 1} \\ & \,\Omega \left( k \right)\alpha^{ - 1} {\text{H}}^{ - 1} \left( {k - 1} \right) \\ \end{aligned}$$
(A.37)
$$\begin{aligned} {\text{H}}^{ - 1} \left( k \right) = & \,\alpha^{ - 1} \left\{ {{\text{H}}^{ - 1} \left( {k - 1} \right)} \right. \\ & \, - {\text{H}}^{ - 1} \left( {k - 1} \right)\Omega ^{T} \left( k \right)\left[ {\Omega \left( k \right){\text{H}}^{ - 1} \left( {k - 1} \right)\Omega ^{T} \left( k \right) + \alpha\Lambda \left( k \right)} \right]^{ - 1} \\ & \,\Omega \left( k \right)\left. {{\text{H}}^{ - 1} \left( {k - 1} \right)} \right\} \\ \end{aligned}$$
(A.38)

Let us denote

$${\text{P}}\left( k \right) = {\text{H}}^{ - 1} \left( k \right)$$

Substituting it into Eq. (A.38), we obtain:

$${\text{P}}\left( k \right) = \alpha^{ - 1} \left\{ {{\text{P}}\left( {k - 1} \right)} \right. - {\text{P}}\left( {k - 1} \right)\Omega ^{T} \left( k \right){\text{S}}^{ - 1} \left( k \right)\Omega \left( k \right)\left. {{\text{P}}\left( {k - 1} \right)} \right\}$$
(A.39)

where:

$${\text{S}}\left( k \right) = \alpha\Lambda \left( k \right) +\Omega \left( k \right){\text{P}}\left( {k - 1} \right)\Omega ^{T} \left( k \right)$$
(A.40)

Finally, the learning algorithm for W is obtained as:

$${\text{W}}\left( {k + 1} \right) = {\text{W}}\left( k \right) + {\text{P}}\left( k \right){\text{J}}_{{{\text{Y}}_{k} }}^{T} \left( {{\text{W}}\left( k \right)} \right){\text{E}}_{k} \left( {{\text{W}}\left( k \right)} \right)$$
(A.41)

where W is an N_w × 1 vector formed of all RTNN weights (N_w = L × N + N + N × M).
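
The complete recursive update given by (A.34), (A.35), and (A.39)–(A.41) can be sketched in Python as follows; this is a minimal illustration assuming that the output Jacobian J_{Y_k}(W(k)) and the error vector E_k(W(k)) are supplied by the RTNN, and the function and variable names are hypothetical.

```python
import numpy as np

def rlm_step(W, P_prev, J_Y, E, k, alpha=0.99, rho=1e-3):
    # W: Nw weight vector, P_prev: Nw x Nw inverse Hessian P(k-1),
    # J_Y: L x Nw output Jacobian, E: L error vector, k: current instant,
    # alpha: forgetting factor, rho: regularization constant.
    L, Nw = J_Y.shape
    # Omega(k): Jacobian augmented with a unit row at position k mod Nw  (A.34)
    e_row = np.zeros((1, Nw))
    e_row[0, k % Nw] = 1.0
    Omega = np.vstack([J_Y, e_row])
    # Lambda(k) is block-diagonal with I (L x L) and 1/rho, so that
    # Lambda^{-1}(k) = diag(I, rho), cf. (A.35)
    Lam = np.diag(np.concatenate([np.ones(L), [1.0 / rho]]))
    # S(k) and P(k) = H^{-1}(k) via the matrix inversion lemma  (A.39)-(A.40)
    S = alpha * Lam + Omega @ P_prev @ Omega.T
    P = (P_prev - P_prev @ Omega.T @ np.linalg.solve(S, Omega @ P_prev)) / alpha
    # Weight update  (A.41)
    W_next = W + P @ J_Y.T @ E
    return W_next, P
```

A common choice in recursive least-squares type algorithms (not stated in the chapter) is to initialize P(0) as a large multiple of the identity matrix.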

Using the RTNN topology the weight vector has the following form:

$${\text{W}}\left( k \right) = \left[ {\begin{array}{*{20}c} {c_{1,1} } & \cdots & {c_{L,N} } & {a_{1,1} } & {a_{2,2} } & \cdots & {a_{N,N} } & {b_{1,1} } & \cdots & {b_{N,M} } \\ \end{array} } \right]^{T}$$
(A.42)

and the Jacobian matrix, of dimension L × N_w, is formed as:

$${\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{W}}\left( k \right)} \right) = \left[ {\begin{array}{*{20}c} {{\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{C}}\left( k \right)} \right)} & {{\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{A}}\left( k \right)} \right)} & {{\text{J}}_{{{\text{Y}}_{k} }} \left( {{\text{B}}\left( k \right)} \right)} \\ \end{array} } \right]$$
(A.43)

The components of the Jacobian matrix can be obtained by applying the diagrammatic method [31]. Using the notation of Section 2.2 for (A.43), we can write:

$${\text{DY}}\left[ {{\text{W}}\left( {\text{k}} \right)} \right] = \left[ {{\text{DY}}\left( {{\text{C}}_{\text{ij}} \left( {\text{k}} \right)} \right),{\text{DY}}\left( {{\text{A}}_{\text{ij}} \left( {\text{k}} \right)} \right),{\text{DY}}\left( {{\text{B}}_{\text{ij}} \left( {\text{k}} \right)} \right)} \right].$$
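
The diagrammatic method itself is not reproduced here; as a purely illustrative aid, the analytically obtained sensitivities can be cross-checked numerically with a finite-difference approximation of the Jacobian (A.13), where `rtnn_output` is a hypothetical forward pass of the RTNN returning its L-dimensional output.

```python
import numpy as np

def fd_jacobian(rtnn_output, W, eps=1e-6):
    # Finite-difference approximation of the L x Nw Jacobian J_{Y_k}(W), used
    # only as a numerical sanity check of the analytic sensitivities obtained
    # with the diagrammatic method [31].
    y0 = np.asarray(rtnn_output(W))
    J = np.zeros((y0.size, W.size))
    for i in range(W.size):
        Wp = W.copy()
        Wp[i] += eps
        J[:, i] = (np.asarray(rtnn_output(Wp)) - y0) / eps
    return J
```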

Appendix 2

Table A1 Abbreviations used in the chapter

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Baruch, I., Saldierna, E.E. (2014). Decentralized Fuzzy-Neural Identification and I-Term Adaptive Control of Distributed Parameter Bioprocess Plant. In: Balas, V., Koprinkova-Hristova, P., Jain, L. (eds) Innovations in Intelligent Machines-5. Studies in Computational Intelligence, vol 561. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43370-6_1

  • DOI: https://doi.org/10.1007/978-3-662-43370-6_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-43369-0

  • Online ISBN: 978-3-662-43370-6
