Next: Optimistic Inference Control Up: Control Schemes Previous: Control Schemes

Nonlinear Model Predictive Control (NMPC)

Nonlinear model predictive control (NMPC) Mayne00 is based on minimising a cost function $ J$ defined over a future window of fixed length $ T_c$. For example, the quadratic difference between the predicted future observations $ \mathbf{y}$ and a reference signal $ \mathbf{r}$ can be used:

$\displaystyle J(\mathbf{y}(t_0),\mathbf{u}(t_0),\dots,\mathbf{u}(t_0+T_c-1)) = \sum_{\tau=1}^{T_c}\left\vert\mathbf{y}(t_0+\tau)-\mathbf{r}\right\vert^2.$ (7)

$ J$ is then minimised with respect to the control signals $ \mathbf{u}(t_0),\dots,\mathbf{u}(t_0+T_c-1)$, and only the first one, $ \mathbf{u}(t_0)$, is executed; the optimisation is repeated at the next time step. A direct analogy to decision theory emerges when the control cost $ J$ is interpreted as negative utility.
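The minimise-then-execute-the-first-control loop can be sketched as follows. This is only an illustration under loud assumptions: the scalar linear model `y(t+1) = a*y(t) + b*u(t)`, the parameters `a` and `b`, and all function names are hypothetical, the model is deterministic, and plain gradient descent with finite-difference gradients stands in for a proper optimiser.

```python
def predict(y0, controls, a=0.9, b=0.5):
    """Roll a hypothetical scalar model y(t+1) = a*y(t) + b*u(t) forward."""
    ys, y = [], y0
    for u in controls:
        y = a * y + b * u
        ys.append(y)
    return ys


def cost(y0, controls, r):
    """Quadratic cost of Eq. 7: squared deviations from the reference r."""
    return sum((y - r) ** 2 for y in predict(y0, controls))


def minimise(y0, controls, r, steps=300, lr=0.1, eps=1e-6):
    """Gradient descent with finite-difference gradients (an assumed stand-in)."""
    u = list(controls)
    for _ in range(steps):
        base = cost(y0, u, r)
        grad = []
        for i in range(len(u)):
            bumped = u[:i] + [u[i] + eps] + u[i + 1:]
            grad.append((cost(y0, bumped, r) - base) / eps)
        u = [ui - lr * gi for ui, gi in zip(u, grad)]
    return u


def nmpc_run(y0, r, horizon=5, n_steps=10):
    """Receding-horizon loop: optimise the whole window, execute only u(t0)."""
    y, plan, trajectory = y0, [0.0] * horizon, []
    for _ in range(n_steps):
        plan = minimise(y, plan, r)
        y = predict(y, [plan[0]])[0]      # apply the first control signal only
        trajectory.append(y)
        plan = plan[1:] + [plan[-1]]      # shift the plan as a warm start
    return trajectory
```

Calling `nmpc_run(0.0, 1.0)` drives the hypothetical system from 0 towards the reference 1. The shifted plan is reused as the initial guess for the next window, which is a common warm-starting choice rather than something the text prescribes.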

Here, the states and observations (but not the control signals) are modelled probabilistically, so we minimise the expected cost $ E_q\{J\}$ BarShalom81. The current guess $ \mathbf{u}(t_0),\dots,\mathbf{u}(t_0+T_c-1)$ defines a probability distribution over future states and observations. This distribution can be inferred with a single forward pass if the internal forward model, that is, the dependency of the state on future control signals, is ignored. Ignoring the forward model makes sense here anyway, since the future control signals do not have to follow the learned policy.

Minimisation of $ E_q\{J\}$ is done with a quasi-Newton algorithm Nocedal99. For that, the partial derivatives $ Y^{t_2} = \partial \mathbf{y}(t_2) / \partial \mathbf{u}(t_1)$ must be computed for all $ t_0\leq t_1 < t_2 \leq t_0+T_c$. For a single-input system we can simply apply the chain rule to arrive at the Jacobian

$\displaystyle Y^{t_2} = F^{t_2} G^{t_2-1} \cdots G^{t_1+1} G^{t_1},$ (8)

where $ F^{t}$ and $ G^{t}$ are the Jacobians of the mappings $ f$ and $ g$ at the time instant $ t$. Dynamic programming can be used to compute these partial derivatives efficiently for multiple values of $ t_1$ and $ t_2$ in linear time. The extension of this result to multi-input systems is also relatively straightforward.
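The dynamic-programming idea can be sketched by reusing a running product of the $ G^{t}$ matrices for a fixed $ t_1$, so that each new $ t_2$ costs only two matrix products. The sketch below assumes the Jacobians are already available as nested lists indexed by time; the names `matmul` and `jacobians_from` are hypothetical, and the tiny integer matrices are made up for illustration.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def jacobians_from(t1, F, G, t_end):
    """Return {t2: Y} for t1 < t2 <= t_end, with Y = F[t2] G[t2-1] ... G[t1]."""
    out = {}
    P = G[t1]                       # running product G[t2-1] ... G[t1]
    for t2 in range(t1 + 1, t_end + 1):
        out[t2] = matmul(F[t2], P)  # Eq. 8 for this pair (t1, t2)
        if t2 < t_end:
            P = matmul(G[t2], P)    # extend the product by one time step
    return out


# Made-up Jacobians for time instants 0..3 (illustration only).
G = [[[1, t], [0, 1]] for t in range(4)]
F = [[[1, t]] for t in range(4)]
Y = jacobians_from(0, F, G, 3)
```

Each pass of the loop reuses `P` from the previous time step instead of recomputing the whole chain, so the work for one value of $ t_1$ grows linearly with the window length.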

The use of a cost function makes NMPC very versatile. Costs can be assigned to control signals and observations, for instance to keep their values within bounds. Expectations over a quadratic cost (Eq. 7) are easy to evaluate because $ E_q\left\{\left\vert\mathbf{y}(t)-\mathbf{r}\right\vert^2 \right\} = \left\vert E_q\left\{\mathbf{y}(t)\right\}-\mathbf{r} \right\vert^2 + \sum_{i=1}^{n} \mathrm{Var}_q\left\{y_i(t)\right\}$, where $ n$ is the dimensionality of the observation vector $ \mathbf{y}$ and $ \mathrm{Var}_q\left\{\cdot\right\}$ is the variance over the distribution $ q$. The two terms are the nominal and the stochastic part of the cost function. There is a direct analogy with dual control Astrom95, which balances good control against small estimation errors. The usefulness of the decomposition is discussed in BarShalom81.
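The split of the expected quadratic cost into a nominal and a stochastic part can be checked numerically on a small example. The sketch below assumes a hypothetical discrete distribution with four equally likely observation vectors, so that all expectations are exact averages; the function names are illustrative only.

```python
def expected_cost_direct(samples, r):
    """Average of |y - r|^2 over equally likely sample vectors."""
    total = sum(sum((yi - ri) ** 2 for yi, ri in zip(y, r)) for y in samples)
    return total / len(samples)


def expected_cost_split(samples, r):
    """Return (nominal, stochastic) parts of the expected quadratic cost."""
    n = len(samples)
    dims = list(zip(*samples))                    # one tuple per component y_i
    mean = [sum(d) / n for d in dims]
    variances = [sum((v - m) ** 2 for v in d) / n for d, m in zip(dims, mean)]
    nominal = sum((m - ri) ** 2 for m, ri in zip(mean, r))
    stochastic = sum(variances)
    return nominal, stochastic


samples = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]  # made-up data
r = (1.0, 3.0)
nominal, stochastic = expected_cost_split(samples, r)
direct = expected_cost_direct(samples, r)
```

For this example the mean is (1, 1) and each component has variance 1, so the nominal part is 4, the stochastic part is 2, and their sum matches the directly averaged cost of 6.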



Tapani Raiko 2006-08-24