next up previous
Next: Adapting the EM Algorithm Up: Principal Component Analysis with Previous: Principal Component Analysis with


Adapting SVD

One can use the SVD approach (4) in order to find an approximate solution to the PCA problem. However, estimating the covariance matrix $ \mathbf{C}$ becomes very difficult when there are lots of missing values. If we estimate $ \mathbf{C}$ leaving out terms with missing values from the average, we get for the estimate of the covariance matrix

$\displaystyle \mathbf{C}= \frac{1}{n} \mathbf{X}\mathbf{X}^T =
 \begin{bmatrix}
 0.5 & 1 & 0 \\ 
 1 & 0.667 & ? \\ 
 0 & ? & 1
 \end{bmatrix}\, .$ (9)

There are at least two problems. First, the estimated covariance $ 1$ between the first and second components is larger than their estimated variances $ 0.5$ and $ 0.667$. This is clearly wrong, and leads to the situation where the covariance matrix is not positive (semi)definite and some of its eigenvalues are negative. Secondly, the covariance between the second and the third component could not be estimated at all[*]. Both problems appeared in practice with the data set considered in Section 5.

Another option is to complete the data matrix by iteratively imputing the missing values (see, e.g., [2]). Initially, the missing values can be replaced by zeroes. The covariance matrix of the complete data can be estimated without the problems mentioned above. Now, the product $ \mathbf{A}\mathbf{S}$ can be used as a better estimate for the missing values, and this process can be iterated until convergence. This approach requires the use of the complete data matrix, and therefore it is computationally very expensive if a large part of the data matrix is missing. The time complexity of computing the sample covariance matrix explicitly is $ O(nd^2)$. We will further refer to this approach as the imputation algorithm.

Note that after convergence, the missing values do not contribute to the reconstruction error (2). This means that the imputation algorithm leads to the solution which minimizes the reconstruction error of observed values only.


next up previous
Next: Adapting the EM Algorithm Up: Principal Component Analysis with Previous: Principal Component Analysis with
Tapani Raiko 2007-09-11