 
 
 
 
 
 
 
  
Model predictive control (see Morari and Lee, 1999, for a survey) aims
at controlling a dynamical system by using a predictive model. 
Control inputs are added to the nonlinear state-space model. In
Publication IV this is done by modifying the system dynamics in
Equation (4.10) to include the control inputs. The control inputs do
not come from outside the model; they are modelled as well.
Whereas feedback control in Equation (3.24) models
control inputs as a fixed function of the observations,
Equation (4.13) only gives a distribution for the control
inputs and leaves the exact selection open.
Publication IV studies three different control schemes
in this setting. Direct control is based on using the internal forward
model directly by selecting the mean of the probability distribution
given by Equation (4.13). Direct control is fast to use,
but the mapping it relies on is hard to learn well.
The second control scheme is nonlinear model-predictive control (see e.g. Eduardo Fernández Camacho, 2004), which
optimises the control signals to maximise a utility function.
First, an initial guess for the control signals is made. The posterior
distribution of the future states is then inferred.
The gradient of the total expected utility with respect to the states and
control signals is propagated backwards in time, and the control signals
are updated in the direction of increasing utility. This process is
iterated as long as there is time before the next control signal needs
to be selected. Nonlinear model-predictive control can be seen as
applying decision theory (see Sections
2.4 and 3.2.4).
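The iteration described above can be sketched in a few lines of Python. This is only a minimal illustration under strong assumptions, not the implementation of Publication IV: the state is a scalar, the dynamics `step` and the quadratic `utility` are hypothetical toy choices, and the posterior over future states is replaced by a deterministic point prediction. All function names and parameter values are made up for the example.

```python
import math

DT = 0.1  # hypothetical discretisation step of the toy dynamics


def step(s, u):
    """Toy nonlinear dynamics (an assumption): s(t+1) = s(t) + DT * (u(t) - sin s(t))."""
    return s + DT * (u - math.sin(s))


def rollout(s0, controls):
    """Predict the future states for a given sequence of control signals."""
    states = [s0]
    for u in controls:
        states.append(step(states[-1], u))
    return states


def utility(states, controls, target, penalty):
    """Negative squared distance to the target minus a control effort penalty."""
    tracking = sum((s - target) ** 2 for s in states[1:])
    effort = penalty * sum(u * u for u in controls)
    return -(tracking + effort)


def utility_gradient(states, controls, target, penalty):
    """Propagate the gradient of the utility backwards in time."""
    grad = [0.0] * len(controls)
    adjoint = 0.0  # effect of s(t+1) on the utility through all later states
    for t in reversed(range(len(controls))):
        adjoint += -2.0 * (states[t + 1] - target)  # direct effect of s(t+1)
        grad[t] = adjoint * DT - 2.0 * penalty * controls[t]  # ds(t+1)/du(t) = DT
        adjoint *= 1.0 - DT * math.cos(states[t])  # ds(t+1)/ds(t)
    return grad


def plan(s0, target, horizon=20, iterations=200, rate=0.05, penalty=0.01):
    """Start from an initial guess and move the controls towards higher utility."""
    controls = [0.0] * horizon  # initial guess
    for _ in range(iterations):  # stands in for "as long as there is time"
        states = rollout(s0, controls)
        grad = utility_gradient(states, controls, target, penalty)
        controls = [u + rate * g for u, g in zip(controls, grad)]
    return controls
```

Calling `plan(0.0, 1.0)` returns a control sequence for the whole horizon. In a receding-horizon setting only the first control signal would be applied before planning again from the newly estimated state. The fixed iteration count is a simplification of the time budget mentioned above.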
Optimistic inference control, introduced in Publication IV, is the third control scheme studied. It is based on Bayesian inference answering the question: "Assuming success in the end, what will happen in the near future?" The control signal is inferred given the history of observations and assuming the desired observations after a gap of missing values. Inference combines the internal forward model with the evidence propagating backwards from the desired future. Optimistic inference control lacks the flexibility and theoretical foundation of nonlinear model-predictive control, but it provides a link between two problems: inference and control. It inspired the inference algorithm introduced in Publication V, and Tornio and Raiko (2006) apply that algorithm back to control. Attias (2003) independently discovered the idea behind optimistic inference control, calling it planning by probabilistic inference. His example, finding a goal in a grid world, is quite different from control, but the underlying idea is the same.
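The idea of inferring the control signal from a desired future can be illustrated with a toy linear-Gaussian model in which the answer is available in closed form. The sketch below is an assumption made for illustration and not the model of Publication IV: the state follows s(t) = s(t-1) + u(t) + m(t) with independent Gaussian priors on the controls u(t) and the process noise m(t), the intermediate observations are missing, and only the final observation is fixed to the desired value. The function name and its parameters are hypothetical.

```python
def optimistic_control_means(s0, target, horizon, var_u, var_process, var_obs):
    """Posterior mean of each control u(t), given that the final observation equals target.

    Assumed toy model:
        s(t) = s(t-1) + u(t) + m(t),  u(t) ~ N(0, var_u),  m(t) ~ N(0, var_process)
        x(T) = s(T) + n,              n ~ N(0, var_obs)
    """
    # Prior variance of the final observation x(T); its prior mean is s0.
    var_final = horizon * (var_u + var_process) + var_obs
    # cov(u(t), x(T)) = var_u for every t, so Gaussian conditioning gives the same gain.
    gain = var_u / var_final
    mean = gain * (target - s0)
    return [mean] * horizon
```

With no process or observation noise the inferred controls add up exactly to the required change in state, for example `optimistic_control_means(0.0, 2.0, 4, 1.0, 0.0, 0.0)` spreads a change of 2 evenly over 4 steps. When noise is present, part of the desired outcome is attributed to the noise terms and the inferred controls shrink towards their prior mean, which is one way to see the "optimism" of the scheme.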
The proposed control methods were tested with the cart-pole swing-up task shown in Figure 4.2. Figure 4.3 illustrates model predictive control in action. The experimental results in Publication IV confirm that selecting actions based on a state-space model, instead of directly on the observations, has several benefits. First, it is more resistant to noise because it implicitly involves filtering. Second, the observations (without history) do not always carry enough information about the system state. Third, when nonlinear dynamics are modelled by a function approximator such as an MLP network, a state-space model can find a representation of the state that is better suited to the approximation and thus more predictable.
![Figure 4.2: the cart-pole system, labelled $F$, $x$ and $\theta$ (cartpole.eps)](img252.png)
![Figure 4.3: swing-up (swingup.eps)](img253.png)
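The first benefit, resistance to noise through implicit filtering, can be illustrated with a scalar Kalman filter. This is a minimal sketch under assumptions that are not taken from Publication IV: the state is a Gaussian random walk observed with additive Gaussian noise, and the function names, variances and seed are chosen only for illustration.

```python
import random


def simulate(steps, process_var, obs_var, seed=0):
    """Generate a random-walk state sequence and its noisy observations."""
    rng = random.Random(seed)
    state, states, observations = 0.0, [], []
    for _ in range(steps):
        state += rng.gauss(0.0, process_var ** 0.5)
        states.append(state)
        observations.append(state + rng.gauss(0.0, obs_var ** 0.5))
    return states, observations


def kalman_filter(observations, process_var, obs_var, mean=0.0, var=1.0):
    """Return the filtered state estimate after each observation."""
    estimates = []
    for y in observations:
        var_pred = var + process_var            # predict one step ahead
        gain = var_pred / (var_pred + obs_var)  # weight given to the new observation
        mean = mean + gain * (y - mean)         # correct the prediction
        var = (1.0 - gain) * var_pred
        estimates.append(mean)
    return estimates


def mean_squared_error(a, b):
    """Average squared difference between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
```

Comparing `mean_squared_error(kalman_filter(obs, q, r), states)` with `mean_squared_error(obs, states)` for data from `simulate` shows how far the filtered estimate and the raw observations are from the true state. A controller acting on the filtered estimate therefore sees less noise than one acting on the latest observation alone.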
Model development is by far the most critical and time-consuming step in implementing a model predictive controller (Morari and Lee, 1999). The analysis in Publication IV is shallow compared to the vast control literature, but there seems to be a need for sophisticated model learners (or system identifiers). For instance, Rosenqvist and Karlström (2005) also learn a nonlinear state-space model for control, modelling the nonlinearities with piecewise linear mappings. Their parameters are estimated using the prediction error method, which is equivalent to maximum likelihood estimation in the Bayesian framework.
 
 
 
 
 
 
