
Overfitting

We compared PCA (Section 3), regularized PCA (Section 4), and VB-PCA (Section 4) by computing the rms reconstruction error for the validation set $V$, that is, by testing how the models generalize to new data: $E_V=\sqrt{\frac{1}{\vert V\vert}\sum_{(i,j) \in V} e_{ij}^2}$. We tested VB-PCA first by fixing some of the parameter values (this run is marked as VB1 in Fig. 1; see [12] for details) and then by adapting them (marked as VB2). We initialized regularized PCA and VB1 using normal PCA learned with $\alpha=0.625$ and an orthogonalized $\mathbf{A}$, and we initialized VB2 using VB1. The parameter $\alpha$ was then set to $2/3$.
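As a concrete illustration, the error computation above can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' implementation; the names X, A, S, and rms_error are assumptions, with the data assumed to be factorized as $\mathbf{X} \approx \mathbf{A}\mathbf{S}$ and the error evaluated only over the index pairs in the given set:

    import numpy as np

    def rms_error(X, A, S, index_set):
        # RMS error sqrt(mean of e_ij^2) over the (i, j) pairs in
        # index_set, where e_ij = x_ij - (A S)_ij.
        rows, cols = map(np.asarray, zip(*index_set))
        # Reconstruct only the needed entries: (A S)_ij = A[i, :] . S[:, j]
        x_hat = np.einsum('nk,kn->n', A[rows], S[:, cols])
        e = X[rows, cols] - x_hat
        return np.sqrt(np.mean(e ** 2))

With this helper, the validation error $E_V$ is rms_error(X, A, S, V), and the training error $E_O$ is the same quantity evaluated over the observed training indices.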

Fig. 1 (right) shows the results. The performance of basic PCA starts to degrade during learning, especially with the proposed speed-up. Natural gradient diminishes this phenomenon, known as overlearning, but regularization is even more effective. The best results were obtained using VB2: the final validation error was $E_V = 0.9180$ and the training rms error was $E_O = 0.7826$, which is, as expected, larger than the unregularized $E_O = 0.7657$.
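The overlearning effect can be made visible by tracking both errors during learning: the validation error bottoming out and then rising while the training error still decreases is the signature seen in Fig. 1 (right). A minimal sketch, assuming a hypothetical update_step function performing one iteration of whichever algorithm is being compared (not part of the paper), and reusing rms_error from above:

    def train_and_monitor(X, A, S, train_idx, valid_idx, update_step, n_iters=100):
        # Record (E_O, E_V) after every iteration so that the onset of
        # overlearning (E_V rising while E_O still falls) can be detected.
        history = []
        for _ in range(n_iters):
            A, S = update_step(X, A, S, train_idx)
            history.append((rms_error(X, A, S, train_idx),
                            rms_error(X, A, S, valid_idx)))
        return A, S, history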


