
Subspace Learning Algorithm

It is also possible to minimize the reconstruction error (2) with any general-purpose optimization algorithm. Applying gradient descent yields the rules for simultaneously updating $ \mathbf{A}$ and $ \mathbf{S}$

$\displaystyle \mathbf{A}\leftarrow \mathbf{A}+ \gamma ( \mathbf{X}- \mathbf{A}\mathbf{S}) \mathbf{S}^T \, , \qquad \mathbf{S}\leftarrow \mathbf{S}+ \gamma \mathbf{A}^T ( \mathbf{X}- \mathbf{A}\mathbf{S}) \, ,$ (6)

where $ \gamma>0$ is called the learning rate. The Oja-Karhunen learning algorithm [8,9,6,4] is an online learning method that uses the EM formula for computing $ \mathbf{S}$ and the gradient rule for updating $ \mathbf{A}$, processing a single data vector at a time.
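As a concrete illustration, the following NumPy sketch performs one batch step of the update rules (6). The function name, the default step size, and the matrix shapes ($ \mathbf{X}$ of size $d \times n$, $ \mathbf{A}$ of size $d \times c$, $ \mathbf{S}$ of size $c \times n$) are assumptions made for this example only.

import numpy as np

def subspace_gradient_step(X, A, S, gamma=1e-3):
    """One batch gradient-descent step of rules (6) (illustrative sketch).

    Assumed shapes: X (d, n), A (d, c), S (c, n); gamma is the learning rate.
    """
    E = X - A @ S                  # reconstruction error X - AS
    A_new = A + gamma * E @ S.T    # A <- A + gamma (X - AS) S^T
    S_new = S + gamma * A.T @ E    # S <- S + gamma A^T (X - AS)
    return A_new, S_new            # simultaneous update: both use the old A and S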

A possible way to speed up the subspace learning algorithm is to use the natural gradient [10] for the space of matrices. This yields the update rules

$\displaystyle \mathbf{A}\leftarrow \mathbf{A}+ \gamma ( \mathbf{X}- \mathbf{A}\mathbf{S}) \mathbf{S}^T \mathbf{A}^T \mathbf{A}\, , \qquad \mathbf{S}\leftarrow \mathbf{S}+ \gamma \mathbf{S}\mathbf{S}^T \mathbf{A}^T ( \mathbf{X}- \mathbf{A}\mathbf{S}) \, .$ (7)
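A corresponding sketch of one natural-gradient step (7), under the same assumed shapes as above; the function name and default step size are again illustrative choices.

def natural_gradient_step(X, A, S, gamma=1e-3):
    """One batch step of the natural-gradient rules (7) (illustrative sketch)."""
    E = X - A @ S                               # reconstruction error X - AS
    A_new = A + gamma * E @ S.T @ (A.T @ A)     # gradient of A postmultiplied by A^T A
    S_new = S + gamma * (S @ S.T) @ A.T @ E     # gradient of S premultiplied by S S^T
    return A_new, S_new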



If needed, the end result of subspace analysis can be transformed into the PCA solution, for instance, by computing the eigenvalue decomposition $ \mathbf{S}\mathbf{S}^T = \mathbf{U}_\mathbf{S}\mathbf{D}_\mathbf{S}\mathbf{U}_\mathbf{S}^T$ and the singular value decomposition $ \mathbf{A}\mathbf{U}_\mathbf{S}\mathbf{D}_\mathbf{S}^{1/2} = \mathbf{U}_\mathbf{A}\mathbf{\Sigma}_\mathbf{A}\mathbf{V}_\mathbf{A}^T$. The transformed $ \mathbf{A}$ is formed from the first $ c$ columns of $ \mathbf{U}_\mathbf{A}$ and the transformed $ \mathbf{S}$ from the first $ c$ rows of $ \mathbf{\Sigma}_\mathbf{A}\mathbf{V}_\mathbf{A}^T\mathbf{D}_\mathbf{S}^{-1/2}\mathbf{U}_\mathbf{S}^T\mathbf{S}$. Note that the required decompositions are computationally lighter than decompositions applied directly to the data matrix, since $ \mathbf{S}\mathbf{S}^T$ and $ \mathbf{A}\mathbf{U}_\mathbf{S}\mathbf{D}_\mathbf{S}^{1/2}$ are much smaller than the data matrix itself.
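The transformation above can be sketched in NumPy as follows; the function name is hypothetical, and the code assumes $ \mathbf{S}\mathbf{S}^T$ has full rank so that $ \mathbf{D}_\mathbf{S}^{-1/2}$ is well defined.

def subspace_to_pca(A, S, c=None):
    """Rotate a subspace solution (A, S) into the PCA solution (sketch).

    Follows the text: EVD of S S^T, then SVD of A U_S D_S^{1/2}.
    Assumes S S^T is full rank so that D_S^{-1/2} exists.
    """
    if c is None:
        c = A.shape[1]
    # Eigenvalue decomposition  S S^T = U_S D_S U_S^T
    d_S, U_S = np.linalg.eigh(S @ S.T)
    # Singular value decomposition  A U_S D_S^{1/2} = U_A Sigma_A V_A^T
    U_A, sigma_A, V_A_T = np.linalg.svd(A @ U_S @ np.diag(np.sqrt(d_S)),
                                        full_matrices=False)
    A_pca = U_A[:, :c]                          # first c columns of U_A
    S_pca = (np.diag(sigma_A) @ V_A_T           # first c rows of
             @ np.diag(d_S ** -0.5)             # Sigma_A V_A^T D_S^{-1/2} U_S^T S
             @ U_S.T @ S)[:c, :]
    return A_pca, S_pca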

