
Principal subspace and components

Assume that we have $ n$ $ d$-dimensional data vectors $ {\bf x}_1,{\bf x}_2,\ldots,{\bf x}_n$, which form the $ d\times n$ data matrix $ \mathbf{X}= [{\bf x}_1,{\bf x}_2,\ldots,{\bf x}_n]$. The matrix $ \mathbf{X}$ is decomposed into

$\displaystyle \mathbf{X}\approx \mathbf{A}\mathbf{S},$ (1)

where $ \mathbf{A}$ is a $ d\times c$ matrix, $ \mathbf{S}$ is a $ c\times n$ matrix and $ c\leq d \leq n$. Principal subspace methods [6,4] find $ \mathbf{A}$ and $ \mathbf{S}$ such that the reconstruction error

$\displaystyle C = \lVert \mathbf{X}- \mathbf{A}\mathbf{S}\rVert _F^2 = \sum_{i=1}^d \sum_{j=1}^n \left( x_{ij} - \sum_{k=1}^c a_{ik} s_{kj} \right)^2\, ,$ (2)

is minimized. Here $ F$ denotes the Frobenius norm, and $ x_{ij}$, $ a_{ik}$, and $ s_{kj}$ denote the elements of the matrices $ \mathbf{X}$, $ \mathbf{A}$, and $ \mathbf{S}$, respectively. Typically the row-wise mean is removed from $ \mathbf{X}$ as a preprocessing step.
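As an illustration, here is a minimal numerical sketch, assuming NumPy and a synthetic data matrix (the variable names mirror the notation above). It uses the truncated singular value decomposition, which by the Eckart-Young theorem yields the $ \mathbf{A}$ and $ \mathbf{S}$ minimizing the reconstruction error (2):

```python
import numpy as np

# Synthetic example: n = 100 data vectors of dimension d = 5.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 100))

# Preprocessing: remove the row-wise mean, as described above.
X = X - X.mean(axis=1, keepdims=True)

c = 2  # number of principal components, c <= d <= n

# Truncated SVD: keeping the c largest singular values gives the
# rank-c decomposition X ~= A S with minimal Frobenius error.
U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
A = U[:, :c]                     # d x c, orthonormal columns
S = np.diag(sigma[:c]) @ Vt[:c]  # c x n

C = np.linalg.norm(X - A @ S, "fro") ** 2  # reconstruction error (2)
print(f"reconstruction error C = {C:.4f}")
```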

Without any further constraints, there exist infinitely many ways to perform such a decomposition. However, the subspace spanned by the column vectors of the matrix $ \mathbf{A}$, called the principal subspace, is unique. In PCA, these vectors are mutually orthogonal and have unit length. Further, for each $ k=1,\dots ,c$, the first $ k$ vectors span the $ k$-dimensional principal subspace. This makes the solution practically unique; see [4,2,5] for details.
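Continuing the sketch above (same assumed NumPy setup), these properties can be checked directly:

```python
# Columns of A are mutually orthogonal with unit length: A^T A = I.
assert np.allclose(A.T @ A, np.eye(c))

# Nesting: for each k <= c, the first k columns span the k-dimensional
# principal subspace, so a rank-1 fit reuses the first column of the
# rank-2 fit up to sign (assuming distinct singular values).
A1 = np.linalg.svd(X, full_matrices=False)[0][:, :1]
assert np.allclose(np.abs(A1.T @ A[:, :1]), 1.0)
```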

There are many ways to determine the principal subspace and components [6,4,2]. We will discuss three common methods that can be adapted for the case of missing values.

