In the first set of experiments we compared the computational
performance of different algorithms on PCA with missing values. The
root mean square (rms) error is measured on the training data,
$E = \sqrt{\frac{1}{|O|} \sum_{(i,j) \in O} \left( x_{ij} - \hat{x}_{ij} \right)^2},$
where $O$ is the set of indices of the observed values and $\hat{x}_{ij}$ is the model reconstruction of $x_{ij}$.
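As a concrete reference for how this measure is computed, the following is a minimal NumPy sketch; the original experiments used a Matlab implementation, and the names `rms_error`, `X_hat`, and `observed_mask` are illustrative rather than taken from it:

```python
import numpy as np

def rms_error(X, X_hat, observed_mask):
    """RMS error over the observed entries only.

    X             : data matrix (values at missing positions are ignored)
    X_hat         : model reconstruction, e.g. A @ S for PCA
    observed_mask : boolean matrix, True where x_ij was actually observed
    """
    residuals = (X - X_hat)[observed_mask]
    return np.sqrt(np.mean(residuals ** 2))
```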
All experiments were run in Matlab on a dual-CPU AMD Opteron SE 2220.
First, we tested the imputation algorithm. The first iteration, in which the missing values are replaced with zeros, was completed in 17 minutes. This iteration was still tolerably fast because the complete data matrix was sparse. After that, each iteration took about 30 hours. After three iterations, the training error $E$ was still 0.8513.
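As an illustration of the imputation scheme (not the original Matlab code), a minimal NumPy sketch might look as follows; the full SVD recomputed in every pass is precisely what makes each iteration so expensive at this scale, and all names here are ours:

```python
import numpy as np

def imputation_pca(X, observed_mask, n_components, n_iters=3):
    """Sketch of the imputation algorithm: alternate between filling in
    the missing entries and running complete-data PCA via a truncated SVD.
    Names are illustrative, not from the original implementation."""
    # First iteration: missing values are replaced with zeros.
    X_filled = np.where(observed_mask, X, 0.0)
    for _ in range(n_iters):
        # Complete-data PCA on the filled-in matrix.
        U, s, Vt = np.linalg.svd(X_filled, full_matrices=False)
        X_hat = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components]
        # Subsequent iterations: impute missing entries from the reconstruction.
        X_filled = np.where(observed_mask, X, X_hat)
    return X_hat
```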
Using the EM algorithm by [11], the E-step (updating $\mathbf{S}$) takes 7 hours and the M-step (updating $\mathbf{A}$) takes 18 hours. (There is some room for optimization, since we used a straightforward Matlab implementation.) Each iteration gives a much larger improvement than an iteration of the imputation algorithm, but starting from a random initialization, EM could not reach a good solution in reasonable time.
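The exact updates of [11] are not reproduced here, but the overall shape of such an iteration can be sketched as a simplified alternating least-squares variant of EM for the model $\mathbf{X} \approx \mathbf{A}\mathbf{S}$; the true E- and M-steps of [11] may differ in detail, and all names below are ours:

```python
import numpy as np

def em_pca(X, observed_mask, n_components, n_iters=10, seed=0):
    """Alternating least-squares sketch of EM-style PCA with missing
    values (X approx. A @ S). Assumes every row and every column of X
    has at least one observed entry."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    A = rng.standard_normal((d, n_components))
    S = np.zeros((n_components, n))
    for _ in range(n_iters):
        # E-step analogue: solve A[obs] @ s_j = x_j[obs] for each column.
        for j in range(n):
            obs = observed_mask[:, j]
            S[:, j], *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
        # M-step analogue: solve S[:, obs].T @ a_i = x_i[obs] for each row.
        for i in range(d):
            obs = observed_mask[i, :]
            A[i], *_ = np.linalg.lstsq(S[:, obs].T, X[i, obs], rcond=None)
    return A, S
```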
We also tested the subspace learning algorithm described in Section 3, with and without the proposed speed-up. Each run of the algorithm, for each value of the speed-up parameter, was initialized from the same starting point (generated randomly from a normal distribution). The learning rate $\gamma$ was adapted as follows: if an update decreased the cost function, $\gamma$ was multiplied by 1.1; each time an update would have increased the cost, the update was canceled and $\gamma$ was divided by 2 (a small sketch of this scheme is given after this paragraph). Figure 1 (left) shows the learning curves for basic gradient descent, natural gradient descent, and the proposed speed-up with the best value found for the speed-up parameter $\alpha$. The proposed speed-up made learning about ten times faster than basic gradient descent, even though each of its iterations took longer. Natural gradient descent was slower than the basic gradient.
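The learning-rate adaptation described above is generic and independent of the model; a minimal sketch, assuming scalar-valued `cost` and gradient `grad` callables (the function and parameter names are illustrative), is:

```python
def adaptive_gradient_descent(cost, grad, theta0, gamma=1e-3, n_iters=100):
    """Gradient descent with the adaptive learning rate described above:
    grow gamma by a factor 1.1 after each successful step; cancel the
    step and halve gamma whenever the cost would increase."""
    theta = theta0
    c = cost(theta)
    for _ in range(n_iters):
        candidate = theta - gamma * grad(theta)
        c_new = cost(candidate)
        if c_new < c:
            theta, c = candidate, c_new   # accept the update
            gamma *= 1.1                  # successful step: increase the rate
        else:
            gamma /= 2.0                  # cancel the update, decrease the rate
    return theta
```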
Table 1 gives a summary of the computational complexities.
[Figure 1: Learning curves. The left panel compares basic gradient descent, natural gradient descent, and the proposed speed-up.]