
Implementation

The NDFA package [Valpola02NC], version 1.0.0, the scripts for running the experiments, and the training data used are publicly available[*].

During the training phase, a data set of 2500 samples was used. Most of the training data consisted of a sequence generated with semi-random control, where the only goal was to keep the cart from crashing into the boundaries. The training data also contained some hand-generated sections, included so that the model would cover the whole range of the observation mapping and the dynamic mapping. The model was trained for 10000 iterations, which translates to several hours of computation time. A six-dimensional state space $ \mathbf{x}(t)$ was used because it resulted in the model with the lowest value of the cost function (Eq. 5).

The state $ \mathbf{x}(t)$ was estimated using the iterated extended Kalman smoother [Anderson79]. A history of five observations and control signals appeared to be sufficient for a reliable estimate. The reference signal $ \mathbf{r}$ was $ \phi=0$ and $ \phi'=0$ at the end of the horizon and for five observations beyond that.
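The fixed-length history mentioned above can be kept in a small rolling buffer. The following is a minimal sketch of such a buffer only, not the NDFA package's code and not the smoother itself; the class name `History` and its methods are hypothetical, and only the window length of five comes from the text.

```python
from collections import deque

HISTORY_LENGTH = 5  # window length stated in the text


class History:
    """Rolling window of the most recent observations and control signals.

    Hypothetical helper for illustration; the actual state estimation
    (iterated extended Kalman smoothing) is not implemented here.
    """

    def __init__(self, length=HISTORY_LENGTH):
        # deque(maxlen=...) silently discards the oldest entry when full
        self.observations = deque(maxlen=length)
        self.controls = deque(maxlen=length)

    def push(self, y, u):
        """Store the newest observation y and control signal u."""
        self.observations.append(y)
        self.controls.append(u)

    def window(self):
        """Return the stored observations and controls, oldest first."""
        return list(self.observations), list(self.controls)
```

A smoother would then be run over the contents of `window()` at each time step, so that the cost of the estimate stays constant as the experiment proceeds.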

To handle the system constraints in NMPC, a slightly modified version of the cost function (7) was used. Out-of-bounds values of the cart location and of the force incurred a quadratic penalty, so that the full cost function is

$\displaystyle J_1(t_0,\mathbf{u}) = J(t_0,\mathbf{u}) + \sum_{\tau = 1}^{T_c}\left(\max(10,\vert u(t_0 + \tau)\vert)-10\right)^2 + \sum_{\tau = 1}^{T_c}\left(\max(3,\vert y_s(t_0 + \tau)\vert)-3\right)^2, \qquad (9)$

where $ y_s(t)$ refers to the location component $ s$ of the observation vector $ \mathbf{y}(t)$.
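The penalty terms of Eq. (9) can be illustrated with a short sketch. It assumes that the unmodified cost $ J(t_0,\mathbf{u})$ has already been computed and is passed in as a number, and that the predicted force and cart-location values over the control horizon are given as plain sequences; the function and parameter names are hypothetical and do not come from the NDFA package.

```python
def constraint_penalty(values, limit):
    """Sum over the horizon of (max(limit, |v|) - limit)^2.

    Each term is zero while |v| <= limit and grows quadratically
    with the amount by which |v| exceeds the limit.
    """
    return sum((max(limit, abs(v)) - limit) ** 2 for v in values)


def penalized_cost(base_cost, forces, locations,
                   force_limit=10.0, location_limit=3.0):
    """Cost J_1 of Eq. (9): base cost plus the two quadratic penalties.

    base_cost  -- value of the unmodified cost J(t0, u), assumed given
    forces     -- control signal u(t0 + tau) for tau = 1..Tc
    locations  -- predicted cart location y_s(t0 + tau) for tau = 1..Tc
    """
    return (base_cost
            + constraint_penalty(forces, force_limit)
            + constraint_penalty(locations, location_limit))
```

For example, `penalized_cost(1.0, [0.0, 12.0, -11.0], [2.5, -4.0])` adds $ 2^2 + 1^2$ for the two forces outside the limit of 10 and $ 1^2$ for the one location outside the limit of 3, while the in-bounds values contribute nothing.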



Tapani Raiko 2006-08-24