In the first set of experiments we compared the computational performance of different algorithms for PCA with missing values. The root-mean-square (rms) error is measured on the training data, that is, on the observed values in the training set. All experiments were run on a dual-CPU AMD Opteron SE 2220.
The comparison methods, the imputation algorithm and the EM algorithm, were very slow; the only exception was the first iteration of the imputation algorithm, which was fast because the completed data matrix was still sparse at that point. Fig. 2 (left) shows the learning curves: the closer a curve is to the origin, the faster the algorithm minimizes the cost function.
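For concreteness, the general impute/PCA/re-impute loop that the imputation algorithm follows can be sketched as below. This is a minimal numpy illustration under assumed details (zero initial imputation, mean-centering omitted, the function name `pca_impute`), not the authors' implementation; note that the SVD operates on the completed, dense matrix, which is why every iteration after the first is expensive.

```python
import numpy as np

def pca_impute(X, mask, n_components, n_iters=50):
    """Iterative imputation for PCA with missing values: fill in the
    missing entries, run an ordinary PCA (via SVD) on the completed
    matrix, re-impute the missing entries from the low-rank
    reconstruction, and repeat."""
    Xc = np.where(mask, X, 0.0)              # initial imputation with zeros
    for _ in range(n_iters):
        # full SVD of the completed (dense) matrix: this is the step that
        # makes the method slow once the matrix is no longer sparse
        U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        recon = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components]
        Xc = np.where(mask, X, recon)        # keep observed values, re-impute the rest
    return Xc

# toy example: a rank-1 matrix with roughly 20% of the entries missing
rng = np.random.default_rng(0)
true = np.outer(rng.standard_normal(6), rng.standard_normal(8))
mask = rng.random(true.shape) > 0.2          # True = observed
filled = pca_impute(true, mask, n_components=1)
rms_missing = np.sqrt(np.mean((filled[~mask] - true[~mask]) ** 2))
print(rms_missing)
```

Because the observed entries are reinstated at every step, the training rms error here is driven entirely by how well the low-rank reconstruction fills the missing positions.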
We also tested the subspace learning algorithm described in Section 3 with and without the proposed speed-up, starting both runs from the same random initialization. The learning rate was adapted as follows: whenever an update decreased the cost function, the learning rate was multiplied by 1.1; whenever an update would have increased the cost, the update was canceled and the learning rate was divided by 2. The best value of the speed-up parameter seemed to be around 0.6 to 0.7; the curve shown in Fig. 2 corresponds to a value in that range. The speed-up gave a more than tenfold improvement over the plain gradient descent algorithm, even though one iteration took 97 seconds on average against gradient descent's 57 seconds.
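The adaptive learning-rate rule described above can be sketched as follows. This is an illustrative implementation of plain gradient-descent subspace learning without the proposed speed-up; the model form `X ≈ A @ S`, the function name `subspace_learn`, and the initialization scale are assumptions made for the example.

```python
import numpy as np

def subspace_learn(X, mask, k, n_iters=200, lr=0.01):
    """Gradient descent on the squared error over the observed entries,
    with the adaptive learning-rate rule from the text: multiply the rate
    by 1.1 after an update that decreases the cost, otherwise cancel the
    update and divide the rate by 2."""
    rng = np.random.default_rng(0)
    d, n = X.shape
    A = 0.1 * rng.standard_normal((d, k))
    S = 0.1 * rng.standard_normal((k, n))

    def cost(A, S):
        E = np.where(mask, X - A @ S, 0.0)   # residual on observed entries only
        return float(np.sum(E ** 2))

    c = cost(A, S)
    for _ in range(n_iters):
        E = np.where(mask, X - A @ S, 0.0)
        gA = -2.0 * E @ S.T                  # gradient of the cost w.r.t. A
        gS = -2.0 * A.T @ E                  # gradient of the cost w.r.t. S
        A_try, S_try = A - lr * gA, S - lr * gS
        c_try = cost(A_try, S_try)
        if c_try < c:                        # update decreased the cost:
            A, S, c = A_try, S_try, c_try    #   accept it, grow the rate
            lr *= 1.1
        else:                                # update would increase the cost:
            lr /= 2.0                        #   cancel it, halve the rate
    return A, S, c

# toy example: noiseless rank-2 data with roughly 20% of the entries missing
rng = np.random.default_rng(1)
true = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 15))
mask = rng.random(true.shape) > 0.2
c0 = float(np.sum(np.where(mask, true, 0.0) ** 2))  # cost of the all-zero fit
A, S, c = subspace_learn(true, mask, k=2)
print(c0, c)
```

Because updates that would increase the cost are canceled, the cost sequence is monotonically non-increasing, which is what makes the aggressive 1.1 growth factor safe in practice.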