Optimistic Inference Control


Optimistic inference control (OIC), described independently by Raiko05IJCNN and attias03planning, works as follows. Assume that after a fixed delay $T_c$ the desired goal is reached, that is, (some components of) the observations $\mathbf{y}$ are at the desired level $\mathbf{r}$. Given this optimistic assumption and the observations and control signals so far, infer what happens in between. Then choose the expectation of $q(\mathbf{u}(t_0))$ as the control signal. Note that whereas NMPC can be used with a wide variety of models, OIC requires a probabilistic internal forward model.
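As a hedged reading rather than a formula taken from the original, the optimistic assumption can be written in the notation of the algorithm summary as conditioning on the clamped future as if it had already been observed, with $q$ approximating the resulting posterior and its mean used as the control:

$$q(\mathbf{u}(t_0)) \approx p\big(\mathbf{u}(t_0) \mid \mathbf{y}(t),\mathbf{u}(t) \text{ for } t<t_0,\ \mathbf{y}(t)=\mathbf{r} \text{ for } t\ge t_0+T_c\big), \qquad \mathbf{u}(t_0) \leftarrow \mathrm{E}_q[\mathbf{u}(t_0)]$$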

OIC propagates evidence in two directions: forwards from the current state and backwards from the desired future. The inference is conceptually simple but algorithmically difficult, because the information from the future needs to flow through tens of nonlinear mappings $\mathbf{g}$ before it affects $\mathbf{u}(t_0)$. The OIC algorithm as presented in Raiko05IJCNN propagates information only one step forward and backward in time per iteration. To speed this up, the partial derivatives are replaced by the total derivatives described in Raiko06ICA, which leads to much faster propagation of information. Another alternative for fast inference is the extended Kalman smoother Anderson79, which unfortunately suffers from stability issues and is therefore used only to initialise the OIC algorithm.

OIC in a nutshell:

Given observations $\dots,\mathbf{y}(t_0-2),\mathbf{y}(t_0-1)$ and control signals $\dots,\mathbf{u}(t_0-2),\mathbf{u}(t_0-1)$

1: Fix future $\mathbf{y}(t_0+T_c)=\mathbf{y}(t_0+T_c+1)=\dots=\mathbf{r}$

2: Infer the distribution $q(\mathbf{u}(t),\mathbf{x}(t),\mathbf{y}(t))$ for all $t$

3: Select the mean of $q(\mathbf{u}(t_0))$ as the control signal

4: Observe $\mathbf{y}(t_0)$ and release $\mathbf{y}(t_0+T_c)$

5: Increase $t_0$ and loop from 1
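The loop above can be sketched on a toy problem. The following is a minimal illustration under strong assumptions that are not from the original: a scalar linear-Gaussian model with hypothetical parameters `a`, `b`, `s_x`, `s_u`, `s_y`, a current state that is observed exactly, and exact Gaussian inference by least squares swapped in for the iterative variational inference that OIC uses with nonlinear models. The names `oic_control` and `simulate` are made up for this sketch.

```python
import numpy as np

def oic_control(x0, r, a=0.9, b=0.5, T_c=5, n_clamp=3,
                s_x=0.05, s_u=1.0, s_y=0.05):
    """Return (mean of u(t0), posterior means of x(t0+1), ..., x(t0+H)).

    Assumed toy model: x(t+1) = a*x(t) + b*u(t) + N(0, s_x^2),
    y(t) = x(t) + N(0, s_y^2), with prior u(t) ~ N(0, s_u^2).
    """
    H = T_c + n_clamp - 1          # last clamped time step is t0 + H
    n = 2 * H                      # unknowns z = [u_0..u_{H-1}, x_1..x_H]
    A, c = [], []                  # stacked Gaussian factors: A z ~ c
    for k in range(H):
        # Dynamics factor: x_{k+1} - a*x_k - b*u_k ~ N(0, s_x^2).
        dyn = np.zeros(n)
        dyn[H + k] = 1.0 / s_x     # x_{k+1} is stored at index H + k
        dyn[k] = -b / s_x          # u_k is stored at index k
        if k == 0:
            rhs = a * x0 / s_x     # x_0 is known, so it moves to the right side
        else:
            dyn[H + k - 1] = -a / s_x
            rhs = 0.0
        A.append(dyn)
        c.append(rhs)
        # Prior factor: u_k ~ N(0, s_u^2).
        prior = np.zeros(n)
        prior[k] = 1.0 / s_u
        A.append(prior)
        c.append(0.0)
    # Step 1: fix the future, y(t0+T_c) = ... = y(t0+H) = r.
    for j in range(T_c, H + 1):
        obs = np.zeros(n)
        obs[H + j - 1] = 1.0 / s_y
        A.append(obs)
        c.append(r / s_y)
    # Step 2: infer. For a linear-Gaussian model the posterior mean of all
    # unknowns is the solution of this weighted least-squares problem.
    z = np.linalg.lstsq(np.array(A), np.array(c), rcond=None)[0]
    # Step 3: the mean of q(u(t0)) is the control signal.
    return z[0], z[H:]

def simulate(x0=0.0, r=1.0, steps=30, a=0.9, b=0.5):
    """Steps 4 and 5: apply the control, observe, shift the clamp, repeat."""
    x, xs = x0, [x0]
    for _ in range(steps):
        u, _ = oic_control(x, r, a=a, b=b)
        x = a * x + b * u          # noise-free plant, for reproducibility
        xs.append(x)
    return np.array(xs)

if __name__ == "__main__":
    u0, x_pred = oic_control(0.0, 1.0)
    print("first control:", u0)
    print("inferred future states:", x_pred)
    print("closed-loop trajectory:", simulate())
```

Because the clamp is re-created at $t_0+T_c$ on every call, releasing $\mathbf{y}(t_0+T_c)$ and increasing $t_0$ happen implicitly in `simulate`. In a nonlinear model the least-squares solve would be replaced by iterative inference, which is where the convergence issues discussed in this section arise.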

If there are constraints on the control signals or observations, they are enforced after every inference iteration. If the horizon is set too short or the goal is otherwise overoptimistic, the method becomes unreliable. Even with a realistic goal, the iteration is not guaranteed to converge to the optimal control signal, since it may get stuck in a local minimum. The inferred control signals can be validated by releasing the optimistic future and re-inferring: if the inferred future changes substantially, the control is unreliable.
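The two safeguards in this paragraph can be sketched as follows. This is an illustration only, assuming a scalar linear model with hypothetical parameters `a` and `b`, box constraints on the control, and a made-up tolerance `tol`. With the controls held at their inferred means and the clamp released, re-inference in such a model reduces to a plain forward prediction, which is the simplification used here. The function names are not from the original.

```python
import numpy as np

def enforce_constraints(u, u_min, u_max):
    """Project control means back into the allowed range (applied after
    every inference iteration in the scheme described above)."""
    return np.clip(u, u_min, u_max)

def validate_plan(x0, u_plan, x_optimistic, a=0.9, b=0.5, tol=0.1):
    """Release the optimistic future and re-predict it from the controls.

    Returns the largest change between the optimistic and the released
    future, and whether that change is within the (assumed) tolerance.
    """
    x, x_free = x0, []
    for u in u_plan:
        x = a * x + b * u          # forward prediction without the clamp
        x_free.append(x)
    diff = np.abs(np.array(x_free) - np.asarray(x_optimistic, dtype=float))
    change = float(np.max(diff))
    return change, change <= tol

if __name__ == "__main__":
    print(enforce_constraints(np.array([-2.0, 0.3, 5.0]), -1.0, 1.0))
    # An overoptimistic future that the planned controls cannot deliver.
    print(validate_plan(0.0, [0.4] * 5, [0.2, 0.4, 0.6, 0.8, 1.0]))
```

A large change signals that the inference reached the goal only by bending the future states away from what the controls can actually produce, for instance because the horizon is too short.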


Tapani Raiko 2006-08-24