
Overfitting

We compared PCA (Section 3), regularized PCA (Section 4), and VB-PCA (Section 4) by computing the rms reconstruction error for the validation set $V$, that is, by testing how the models generalize to new data: $E_V=\sqrt{\frac{1}{\vert V\vert}\sum_{(i,j) \in V} e_{ij}^2}$. We tested VB-PCA first by fixing some of the parameter values (this run is marked as VB1 in Fig. 1; see [12] for details) and then by adapting them (marked as VB2). We initialized regularized PCA and VB1 using normal PCA learned with $\alpha=0.625$ and an orthogonalized $\mathbf{A}$, and we initialized VB2 using VB1. The parameter $\alpha$ was then set to $2/3$.
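As a concrete illustration, the error computation above can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' implementation; the names X, A, S, and rms_error are assumptions, with the data assumed to be factorized as $\mathbf{X} \approx \mathbf{A}\mathbf{S}$ and the error evaluated only over the index pairs in the given set:

    import numpy as np

    def rms_error(X, A, S, index_set):
        # RMS error sqrt(mean of e_ij^2) over the (i, j) pairs in
        # index_set, where e_ij = x_ij - (A S)_ij.
        rows, cols = map(np.asarray, zip(*index_set))
        # Reconstruct only the needed entries: (A S)_ij = A[i, :] . S[:, j]
        x_hat = np.einsum('nk,kn->n', A[rows], S[:, cols])
        e = X[rows, cols] - x_hat
        return np.sqrt(np.mean(e ** 2))

With this helper, the validation error $E_V$ is rms_error(X, A, S, V), and the training error $E_O$ is the same quantity evaluated over the observed training indices.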

Fig. 1 (right) shows the results. The performance of basic PCA starts to degrade during learning, especially with the proposed speed-up. Natural gradient diminishes this phenomenon, known as overlearning, but regularization is even more effective. The best results were obtained using VB2: the final validation error was $E_V = 0.9180$ and the training rms error was $E_O = 0.7826$, which is, as expected, larger than the unregularized $E_O = 0.7657$.
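The overlearning effect can be made visible by tracking both errors during learning: the validation error bottoming out and then rising while the training error still decreases is the signature seen in Fig. 1 (right). A minimal sketch, assuming a hypothetical update_step function performing one iteration of whichever algorithm is being compared (not part of the paper), and reusing rms_error from above:

    def train_and_monitor(X, A, S, train_idx, valid_idx, update_step, n_iters=100):
        # Record (E_O, E_V) after every iteration so that the onset of
        # overlearning (E_V rising while E_O still falls) can be detected.
        history = []
        for _ in range(n_iters):
            A, S = update_step(X, A, S, train_idx)
            history.append((rms_error(X, A, S, train_idx),
                            rms_error(X, A, S, valid_idx)))
        return A, S, history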


