Nonlinear model predictive control (NMPC) [Mayne00] is based on minimising a cost function defined over a future window of fixed length $T_c$. For example, the quadratic difference between the predicted future observations $y(\tau)$ and a reference signal $r(\tau)$ can be used:

$$C = \sum_{\tau=t+1}^{t+T_c} \left\| y(\tau) - r(\tau) \right\|^2 . \qquad (7)$$
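As a minimal sketch, the window cost can be written as a small function; the name `tracking_cost` and the toy numbers below are illustrative, not from the paper:

```python
import numpy as np

def tracking_cost(predicted, reference):
    """Quadratic NMPC cost over a fixed window: the summed squared
    difference between predicted observations and the reference."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sum((predicted - reference) ** 2))

# Two-step window with 2-D observations; only the second step deviates.
cost = tracking_cost([[1.0, 0.0], [2.0, 0.0]],
                     [[1.0, 0.0], [1.0, 0.0]])  # -> 1.0
```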
Here, the states and observations (but not the control signals) are modelled probabilistically, so we minimise the expected cost $\mathbb{E}[C]$ [BarShalom81]. The current guess of the future control signals defines a probability distribution over future states and observations. This inference can be done with a single forward pass when the internal forward model, that is, the dependency of the future control signals on the states, is ignored.
In this case, it makes sense to ignore the forward model anyway,
since the future control signals do not have to follow the learned policy.
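A minimal sketch of such a single forward pass, assuming a hypothetical scalar model: the functions `f`, `f_jac`, `g` and the noise variances `q`, `r` are stand-ins, and the variance is propagated by an EKF-style linearisation rather than the paper's inference method:

```python
def rollout(x0, controls, f, f_jac, g, q, r):
    """With the future control signals fixed to their current guess,
    the distribution over future states and observations follows from
    one sweep through the (assumed scalar) model."""
    mean, var = float(x0), 0.0
    predictions = []
    for u in controls:
        a = f_jac(mean, u)                      # local slope of the dynamics
        mean = f(mean, u)                       # predicted state mean
        var = a * a * var + q                   # linearised state variance
        predictions.append((g(mean), var + r))  # observation mean, variance
    return predictions

# Toy linear model x' = 0.5 x + u, observed directly.
preds = rollout(0.0, [1.0, 1.0],
                f=lambda x, u: 0.5 * x + u,
                f_jac=lambda x, u: 0.5,
                g=lambda x: x,
                q=0.1, r=0.0)
```

Because the controls are treated as given constants rather than outputs of a policy, no backward or iterative inference is needed.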
Minimisation of $\mathbb{E}[C]$ is done with a quasi-Newton
algorithm [Nocedal99]. For that, the partial derivatives
$\partial \mathbb{E}[C] / \partial u(\tau)$ for all $\tau = t+1, \dots, t+T_c$
must be computed. For a single-input system we can
simply apply the chain rule to arrive at the Jacobian

$$\frac{\partial \mathbb{E}[C]}{\partial u(\tau)} = \sum_{\tau'=\tau}^{t+T_c} \frac{\partial \mathbb{E}[C]}{\partial y(\tau')} \, \frac{\partial y(\tau')}{\partial u(\tau)} . \qquad (8)$$
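The minimisation can be sketched with an off-the-shelf quasi-Newton routine. Here SciPy's BFGS optimises the controls of a hypothetical scalar plant, with finite-difference gradients standing in for the analytic chain-rule derivatives:

```python
import numpy as np
from scipy.optimize import minimize

def horizon_cost(u, x0=0.0, ref=1.0):
    """Quadratic tracking cost toward a constant reference for the
    toy plant x' = 0.5 x + u, observed directly; the decision
    variables are the future control signals u."""
    x, cost = x0, 0.0
    for uk in u:
        x = 0.5 * x + uk
        cost += (x - ref) ** 2
    return cost

# BFGS is one quasi-Newton method; SciPy estimates the gradient
# by finite differences when none is supplied.
res = minimize(horizon_cost, x0=np.zeros(5), method="BFGS")
```

For this plant the optimum is easy to verify by hand: the first control drives the state to the reference ($u = 1$) and each later control only compensates the decay ($u = 0.5$).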
The use of a cost function makes NMPC very versatile. Costs for control signals and
observations can be set, for instance, to restrict values within bounds.
Expectations over a quadratic cost (Eq. 7) are easy to evaluate because

$$\mathbb{E}\left[ \left\| y - r \right\|^2 \right] = \left\| \mathbb{E}[y] - r \right\|^2 + \sum_{i=1}^{d} \sigma_i^2 ,$$

where $d$ is the dimensionality of the observation space and $\sigma_i^2$ is the variance of the distribution over the $i$th component of $y$.
The two terms are the nominal and stochastic parts of the cost function.
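The decomposition of the expected quadratic cost into a nominal and a stochastic part can be checked numerically; a sketch assuming a diagonal Gaussian over the observations (all numbers below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
mean = np.array([1.0, -0.5, 2.0])    # E[y]
sigma2 = np.array([0.2, 0.1, 0.3])   # per-component variances
r = np.array([0.5, 0.0, 1.0])        # reference

# Monte Carlo estimate of E[||y - r||^2] for y ~ N(mean, diag(sigma2)).
y = rng.normal(mean, np.sqrt(sigma2), size=(200_000, 3))
mc = np.mean(np.sum((y - r) ** 2, axis=1))

# Closed form: nominal part + stochastic part.
closed = np.sum((mean - r) ** 2) + np.sum(sigma2)
```

The Monte Carlo estimate and the closed form agree up to sampling noise, which is why the expectation costs nothing extra to evaluate.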
There is a direct analogy with dual control [Astrom95], which
balances between good control and small estimation errors.
The usefulness of this decomposition is discussed in [BarShalom81].