Simulation Results

For each control scheme, the cart-pole simulation was run 100 times and the number of successful swing-ups was recorded. As in Kimura99, a swing-up is considered successful if the final angle is between $-0.133\pi$ and $0.133\pi$, the final angular velocity is between … rad/s and … rad/s, and the cart has not crashed into the boundaries of the area during swing-up.
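The success criterion can be written as a small predicate. The following is a minimal sketch, not the code used in the experiments: the function and parameter names are hypothetical, the bounds are assumed to be inclusive, and the angular velocity limits are passed in as arguments rather than fixed in the code.

```python
import math

# Angle limit in radians, taken from the criterion stated in the text.
ANGLE_LIMIT = 0.133 * math.pi


def is_successful(final_angle, final_angular_velocity, crashed,
                  velocity_min, velocity_max):
    """Return True if one run meets the swing-up success criterion.

    velocity_min and velocity_max (rad/s) are supplied by the caller;
    treating all bounds as inclusive is an assumption of this sketch.
    """
    angle_ok = -ANGLE_LIMIT <= final_angle <= ANGLE_LIMIT
    velocity_ok = velocity_min <= final_angular_velocity <= velocity_max
    return angle_ok and velocity_ok and not crashed


def success_rate(runs, velocity_min, velocity_max):
    """Fraction of successful runs.

    Each run is a tuple (final_angle, final_angular_velocity, crashed).
    """
    successes = sum(
        is_successful(angle, velocity, crashed, velocity_min, velocity_max)
        for angle, velocity, crashed in runs
    )
    return successes / len(runs)
```

With 100 runs per control scheme, `success_rate` multiplied by 100 gives the number of successful swing-ups directly.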

**Figure 2:** Example of a successful swing-up with NMPC. Upper figure: visual representation of the swing-up. Lower figure: time series of the swing-up. Solid line is applied force, line with crosses is position of the cart and line with dots is the angle.

A comparison of the performance of NMPC and OIC is shown in Figure 3. With enough iterations, both methods reached a very high success rate. The few failed swing-ups were typically caused by a difficult initial state of the cart-pole system, which, combined with the limited horizon length, resulted in an infeasible control strategy. An example of a successful swing-up is shown in Figure 2.

**Figure 3:** Performance of the algorithms versus total computation time (in seconds). Dotted line is NMPC, dashed line is OIC. Numbers next to data points indicate number of iterations used. Control horizon length was 40 for all experiments.

On average, the traditional NMPC method was about 10 to 20 times slower than real-time on modern hardware (2.2 GHz AMD Opteron). The computation times for OIC were more varied, but in most cases its performance was inferior to that of NMPC. It should be noted, however, that the current implementation of OIC is quite heavily penalised by the presence of constraints, as the optimisation algorithm used cannot properly take their effects into account. In general, it is clear that further optimisations to the algorithms or improvements in hardware are required before complex systems with fast dynamics can be controlled in real time.
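The "times slower than real-time" figure is the ratio between the wall-clock computation time and the simulated time it covers. A small sketch of that arithmetic follows; the function names are hypothetical and the numbers are purely illustrative, not measurements from these experiments.

```python
def real_time_factor(computation_time_s, simulated_time_s):
    """Wall-clock seconds spent per second of simulated time."""
    return computation_time_s / simulated_time_s


def runs_in_real_time(computation_time_s, simulated_time_s):
    """A controller keeps up with the plant only if the factor is at most 1."""
    return real_time_factor(computation_time_s, simulated_time_s) <= 1.0


# Illustrative only: 30 s of computation for 2 s of simulated swing-up.
factor = real_time_factor(30.0, 2.0)
```

A factor between 10 and 20 therefore means the controller would need a 10- to 20-fold speed-up before it could run alongside the physical system.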

The effect of the horizon length on the performance of NMPC is shown in Figure 4. All horizon lengths between 30 and 45 time steps had similar performance. Horizon lengths between 25 and 30 had problems with the cart crashing into the walls. Horizons shorter than 25 time steps could not reliably perform the swing-up task because the reference signal became too unrealistic.

Very long horizons are also problematic. First of all, they increase the computational burden of the algorithm. The increase in the number of parameters often also leads to an increase in the number of local minima, which makes the optimisation problem more involved. In addition, because only an approximate model of the system is available, predictions far into the future become more unreliable. This can lead the algorithm to choose an optimisation strategy that is not feasible in practice.

Different initialisations for the NMPC control signal show that local minima are the chief problem with long horizons (Figure 4). It was observed that in most failed swing-ups the controller made a large prediction error and, at the following time instant, was unable to recover from the local minimum in which the force penalty and the reference signal tracking penalty both suddenly became large. In such situations, a more reasonable way to generate new initialisations than reusing the old predictions is either to use random initialisations or to use the internal forward model to generate a new control signal.
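Two of these initialisation strategies can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the experiments: `shift_previous`, `best_of_random`, `cost` and `u_max` are hypothetical names, reusing old predictions is assumed to mean shifting the previous control sequence by one step, the random candidates are assumed to be drawn uniformly within the force limits, and "best" is assumed to mean lowest cost. The forward-model initialisation is omitted because it depends on the learned model.

```python
import random


def shift_previous(u_prev):
    """Reuse the old predictions as the new initial guess.

    Assumption: the first control has been applied, so the sequence is
    shifted by one step and the last value is repeated to keep its length.
    """
    return list(u_prev[1:]) + [u_prev[-1]]


def best_of_random(cost, horizon, u_max, n_candidates=10, rng=None):
    """Draw random control sequences and keep the one with the lowest cost.

    cost is a caller-supplied function mapping a control sequence to a
    scalar; uniform sampling in [-u_max, u_max] is an assumption.
    """
    if rng is None:
        rng = random.Random()
    candidates = [
        [rng.uniform(-u_max, u_max) for _ in range(horizon)]
        for _ in range(n_candidates)
    ]
    return min(candidates, key=cost)
```

Either result would then be passed to the optimiser as the starting point for the NMPC iterations at the current time step.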

**Figure 4:** The percentage of successful swing-ups as a function of the horizon length in NMPC. Solid line is using old predictions as initialisation, dotted line is using initialisations based on the internal forward model and dashed line is using the best out of ten random initialisations. 50 NMPC iterations were used for all experiments.