
Subspace Learning Algorithm

It is also possible to minimize the reconstruction error (2) with any general-purpose optimization algorithm. Applying gradient descent yields the rules for simultaneously updating $ \mathbf{A}$ and $ \mathbf{S}$

$\displaystyle \mathbf{A}\leftarrow \mathbf{A}+ \gamma ( \mathbf{X}- \mathbf{A}\mathbf{S}) \mathbf{S}^T \, , \qquad \mathbf{S}\leftarrow \mathbf{S}+ \gamma \mathbf{A}^T ( \mathbf{X}- \mathbf{A}\mathbf{S}) \, ,$ (6)

where $ \gamma>0$ is called the learning rate. The Oja-Karhunen learning algorithm [8,9,6,4] is an online learning method that uses the EM formula for computing $ \mathbf{S}$ and the gradient rule for updating $ \mathbf{A}$, processing a single data vector at a time.
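As a concrete illustration, the following NumPy sketch performs one batch step of the update rules (6). The function name, the default step size, and the matrix shapes ($ \mathbf{X}$ of size $d \times n$, $ \mathbf{A}$ of size $d \times c$, $ \mathbf{S}$ of size $c \times n$) are assumptions made for this example only.

import numpy as np

def subspace_gradient_step(X, A, S, gamma=1e-3):
    """One batch gradient-descent step of rules (6) (illustrative sketch).

    Assumed shapes: X (d, n), A (d, c), S (c, n); gamma is the learning rate.
    """
    E = X - A @ S                  # reconstruction error X - AS
    A_new = A + gamma * E @ S.T    # A <- A + gamma (X - AS) S^T
    S_new = S + gamma * A.T @ E    # S <- S + gamma A^T (X - AS)
    return A_new, S_new            # simultaneous update: both use the old A and S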

A possible way to speed up the subspace learning algorithm is to use the natural gradient [10] for the space of matrices. This yields the update rules

$\displaystyle \mathbf{A}\leftarrow \mathbf{A}+ \gamma ( \mathbf{X}- \mathbf{A}\mathbf{S}) \mathbf{S}^T \mathbf{A}^T \mathbf{A}\, , \qquad \mathbf{S}\leftarrow \mathbf{S}+ \gamma \mathbf{S}\mathbf{S}^T \mathbf{A}^T ( \mathbf{X}- \mathbf{A}\mathbf{S}) \, .$ (7)
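A corresponding sketch of one natural-gradient step (7), under the same assumed shapes as above; the function name and default step size are again illustrative choices.

def natural_gradient_step(X, A, S, gamma=1e-3):
    """One batch step of the natural-gradient rules (7) (illustrative sketch)."""
    E = X - A @ S                               # reconstruction error X - AS
    A_new = A + gamma * E @ S.T @ (A.T @ A)     # gradient of A postmultiplied by A^T A
    S_new = S + gamma * (S @ S.T) @ A.T @ E     # gradient of S premultiplied by S S^T
    return A_new, S_new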



If needed, the end result of subspace analysis can be transformed into the PCA solution, for instance, by computing the eigenvalue decomposition $ \mathbf{S}\mathbf{S}^T = \mathbf{U}_\mathbf{S}\mathbf{D}_\mathbf{S}\mathbf{U}_\mathbf{S}^T$ and the singular value decomposition $ \mathbf{A}\mathbf{U}_\mathbf{S}\mathbf{D}_\mathbf{S}^{1/2} = \mathbf{U}_\mathbf{A}\mathbf{\Sigma}_\mathbf{A}\mathbf{V}_\mathbf{A}^T$. The transformed $ \mathbf{A}$ is formed from the first $ c$ columns of $ \mathbf{U}_\mathbf{A}$ and the transformed $ \mathbf{S}$ from the first $ c$ rows of $ \mathbf{\Sigma}_\mathbf{A}\mathbf{V}_\mathbf{A}^T\mathbf{D}_\mathbf{S}^{-1/2}\mathbf{U}_\mathbf{S}^T\mathbf{S}$. Note that the required decompositions are computationally lighter than decompositions applied directly to the data matrix, since $ \mathbf{S}\mathbf{S}^T$ and $ \mathbf{A}\mathbf{U}_\mathbf{S}\mathbf{D}_\mathbf{S}^{1/2}$ are much smaller than the data matrix itself.
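The transformation above can be sketched in NumPy as follows; the function name is hypothetical, and the code assumes $ \mathbf{S}\mathbf{S}^T$ has full rank so that $ \mathbf{D}_\mathbf{S}^{-1/2}$ is well defined.

def subspace_to_pca(A, S, c=None):
    """Rotate a subspace solution (A, S) into the PCA solution (sketch).

    Follows the text: EVD of S S^T, then SVD of A U_S D_S^{1/2}.
    Assumes S S^T is full rank so that D_S^{-1/2} exists.
    """
    if c is None:
        c = A.shape[1]
    # Eigenvalue decomposition  S S^T = U_S D_S U_S^T
    d_S, U_S = np.linalg.eigh(S @ S.T)
    # Singular value decomposition  A U_S D_S^{1/2} = U_A Sigma_A V_A^T
    U_A, sigma_A, V_A_T = np.linalg.svd(A @ U_S @ np.diag(np.sqrt(d_S)),
                                        full_matrices=False)
    A_pca = U_A[:, :c]                          # first c columns of U_A
    S_pca = (np.diag(sigma_A) @ V_A_T           # first c rows of
             @ np.diag(d_S ** -0.5)             # Sigma_A V_A^T D_S^{-1/2} U_S^T S
             @ U_S.T @ S)[:c, :]
    return A_pca, S_pca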

