Abstract
Discrete-time stochastic optimal control problems over a finite number of decision stages are considered. The state vector is assumed to be observed through a noisy measurement channel. Because of the very general assumptions under which the problems are stated, obtaining analytically optimal solutions is practically impossible. Moreover, the controller has to retain in memory the vector of all the measurements and all the controls up to the most recent decision stage. These measurements and controls constitute the “information vector” on which the control function depends. The growing dimension of the information vector makes dynamic programming practically unusable. We therefore resort to the “extended Ritz method” (ERIM), which consists in substituting fixed-structure parametrized functions containing vectors of “free” parameters for the admissible functions. Of course, if the number of decision stages is large, even the application of the ERIM becomes impossible. Therefore, an approximate approach is followed: the information vector is truncated, and only a suitable “limited-memory information vector” is retained in memory.
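The truncation idea in the abstract can be illustrated with a minimal sketch. All names below (`control`, the memory length `M`, the dimensions, the one-hidden-layer structure) are illustrative assumptions, not the chapter's actual construction: a fixed-structure parametrized control function is evaluated on a limited-memory information vector that stacks only the last `M` measurements and controls instead of the full history.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 3          # memory length: number of past stages retained
dim_y = 2      # measurement dimension
dim_u = 1      # control dimension
n_hidden = 8   # hidden units of the fixed-structure approximator

# Dimension of the limited-memory information vector: it is fixed,
# whereas the full information vector would grow with the stage index.
dim_I = M * (dim_y + dim_u)

# "Free" parameters w of the fixed structure (here: a one-hidden-layer
# network, one common choice of fixed-structure parametrized function).
W1 = rng.normal(scale=0.1, size=(n_hidden, dim_I))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(dim_u, n_hidden))
b2 = np.zeros(dim_u)

def control(limited_info):
    """Parametrized control function gamma(.; w) acting on the
    limited-memory information vector."""
    h = np.tanh(W1 @ limited_info + b1)
    return W2 @ h + b2

# Build the limited-memory information vector from the last M stages.
past_measurements = [rng.normal(size=dim_y) for _ in range(M)]
past_controls = [rng.normal(size=dim_u) for _ in range(M)]
I_M = np.concatenate(past_measurements + past_controls)

u = control(I_M)
print(u.shape)  # (1,): one control vector per stage
```

The point of the sketch is only that `dim_I` stays constant over the stages, which is what makes the parameter vectors optimizable offline; in the ERIM proper, the weights would be tuned to minimize the expected cost.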
Notes
- 1. We recall three basic operations on conditional probability density functions that are used in this section.
  - The so-called chain rule:
    $$\begin{aligned} p(\varvec{a},\varvec{b}\, \vert \, \varvec{c}) = p(\varvec{a}\, \vert \, \varvec{b}, \varvec{c}) \, p(\varvec{b}\, \vert \, \varvec{c}) \, . \end{aligned}$$ (8.14)
  - The integrated version of (8.14):
    $$\begin{aligned} p(\varvec{a}\, \vert \, \varvec{c}) = \int \, p(\varvec{a}\, \vert \, \varvec{b}, \varvec{c}) \, p(\varvec{b}\, \vert \, \varvec{c}) \, d \varvec{b}\, . \end{aligned}$$ (8.15)
  - The Bayes formula:
    $$\begin{aligned} p(\varvec{a}\, \vert \, \varvec{b}, \varvec{c}) = \frac{p(\varvec{b}\, \vert \, \varvec{a}, \varvec{c}) \, p(\varvec{a}\, \vert \, \varvec{c})}{\displaystyle \int \, p(\varvec{b}\, \vert \, \varvec{a}, \varvec{c}) \, p(\varvec{a}\, \vert \, \varvec{c}) \, d \varvec{a}} \, . \end{aligned}$$ (8.16)
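The three identities (8.14)–(8.16) can be sanity-checked numerically on a discrete example, where the integrals become sums. The following sketch builds a random joint distribution p(a, b, c) over small finite sets and verifies each identity; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random joint distribution p(a, b, c); axes are (a, b, c).
joint = rng.random((3, 4, 2))
joint /= joint.sum()

# Marginals obtained by summing out variables.
p_c = joint.sum(axis=(0, 1))     # p(c)
p_bc = joint.sum(axis=0)         # p(b, c)
p_ac = joint.sum(axis=1)         # p(a, c)

# Conditionals (broadcasting divides each slice by the right marginal).
p_ab_given_c = joint / p_c       # p(a, b | c)
p_a_given_bc = joint / p_bc      # p(a | b, c)
p_b_given_c = p_bc / p_c         # p(b | c)
p_a_given_c = p_ac / p_c         # p(a | c)

# Chain rule (8.14): p(a, b | c) = p(a | b, c) p(b | c).
assert np.allclose(p_ab_given_c, p_a_given_bc * p_b_given_c)

# Summed version of (8.15): p(a | c) = sum_b p(a | b, c) p(b | c).
assert np.allclose(p_a_given_c,
                   (p_a_given_bc * p_b_given_c).sum(axis=1))

# Bayes formula (8.16): divide the numerator by its sum over a.
p_b_given_ac = joint / p_ac[:, None, :]              # p(b | a, c)
numerator = p_b_given_ac * p_a_given_c[:, None, :]
bayes = numerator / numerator.sum(axis=0)
assert np.allclose(bayes, p_a_given_bc)

print("all three identities verified")
```

In this discrete setting the denominator of (8.16) is literally the numerator summed over a, which is why `numerator.sum(axis=0)` normalizes the result into the conditional p(a | b, c).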
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this chapter
Zoppoli, R., Sanguineti, M., Gnecco, G., Parisini, T. (2020). Stochastic Optimal Control with Imperfect State Information over a Finite Horizon. In: Neural Approximations for Optimal Control and Decision. Communications and Control Engineering. Springer, Cham. https://doi.org/10.1007/978-3-030-29693-3_8
Print ISBN: 978-3-030-29691-9
Online ISBN: 978-3-030-29693-3
eBook Packages: Intelligent Technologies and Robotics (R0)