
Model structure

According to the general FA model, the data has been generated by the factors s through the mapping f:

 
\begin{displaymath}
\mathbf{x}(t) = \mathbf{f}(\mathbf{s}(t), \boldsymbol{\theta}) + \mathbf{e}(t) \ ,
\end{displaymath} (3)

where x is a data vector, s is a factor vector, $\boldsymbol{\theta}$ is a parameter vector and e is a noise vector. The factors and the noise are assumed to be independent and Gaussian:
\begin{displaymath}
s_{l}(t) \sim N(0,\sigma_{l}^{2}) \, , \qquad
e_{k}(t) \sim N(0,\xi_{k}^{2}) \ .
\end{displaymath} (4)

The linear mapping f used in FA is

 \begin{displaymath}\mathbf{f}(\mathbf{s}, \boldsymbol{\theta}) = \mathbf{A}\mathbf{s} + \mathbf{b}\ .
\end{displaymath} (5)
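
As a concrete illustration, the following NumPy sketch samples data from the linear FA model of Eqs. (3)-(5). The dimensions, parameter values and noise levels are arbitrary illustrative choices, not taken from the model above.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n, m, T = 10, 3, 1000             # data dim, factor dim, samples (hypothetical)

A = rng.standard_normal((n, m))   # mixing matrix A
b = rng.standard_normal(n)        # bias vector b
sigma = np.ones(m)                # factor std devs sigma_l, Eq. (4)
xi = 0.1 * np.ones(n)             # noise std devs xi_k, Eq. (4)

# Independent Gaussian factors and noise, Eq. (4)
s = sigma * rng.standard_normal((T, m))
e = xi * rng.standard_normal((T, n))

# Linear FA model, Eqs. (3) and (5): x(t) = A s(t) + b + e(t)
x = s @ A.T + b + e               # shape (T, n)
\end{verbatim}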

The model is similar to principal component analysis (PCA), except that FA includes the noise term and the factors have a Gaussian distribution. In NFA, the function f is allowed to be nonlinear. We use the method proposed in [6], where the MLP network

\begin{displaymath}
\mathbf{f}(\mathbf{s}, \boldsymbol{\theta}) = \mathbf{A}_{2} \tanh(\mathbf{A}_{1}\mathbf{s}+\mathbf{b}_{1})+\mathbf{b}_{2}
\end{displaymath} (6)

is used to model the nonlinearity. The parameter vector $\boldsymbol{\theta}$ contains the weight matrices $\mathbf{A}_{1}$, $\mathbf{A}_{2}$ and the bias vectors $\mathbf{b}_{1}$, $\mathbf{b}_{2}$.
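
A minimal sketch of the MLP mapping of Eq. (6); the layer sizes, in particular the hidden-layer width h, are hypothetical choices for illustration.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n, m, h = 10, 3, 20               # data dim, factor dim, hidden units (hypothetical)

A1 = rng.standard_normal((h, m))  # first-layer weight matrix A_1
b1 = rng.standard_normal(h)       # first-layer bias b_1
A2 = rng.standard_normal((n, h))  # second-layer weight matrix A_2
b2 = rng.standard_normal(n)       # second-layer bias b_2

def f(s):
    """MLP mapping of Eq. (6): f(s) = A2 tanh(A1 s + b1) + b2."""
    return A2 @ np.tanh(A1 @ s + b1) + b2

s_t = rng.standard_normal(m)                 # a factor vector s(t)
x_t = f(s_t) + 0.1 * rng.standard_normal(n)  # an observation via Eq. (3)
\end{verbatim}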

In NFA the data is modelled by a high-dimensional manifold created by the function f from a Gaussian prior distribution. The model can be compared to the self-organising map (SOM) [5], but the number of parameters scales more like in FA: the SOM scales exponentially as a function of the dimensionality of the underlying data manifold. A small number of parameters keeps the modelled manifold smooth. We find the parameter vector $\boldsymbol{\theta}$ using ensemble learning.
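
A back-of-the-envelope illustration of this scaling argument, assuming a SOM grid with k nodes per latent dimension and one n-dimensional prototype per node (k and the network sizes are hypothetical):

\begin{verbatim}
def nfa_params(n, m, h):
    # Parameters of the MLP (6): A1 (h x m), b1, A2 (n x h), b2
    return (h * m + h) + (n * h + n)

def som_params(n, m, k):
    # k**m grid nodes, each storing an n-dimensional prototype vector
    return n * k ** m

for m in (1, 2, 3, 4):            # dimensionality of the latent manifold
    print(m, nfa_params(n=10, m=m, h=20), som_params(n=10, m=m, k=10))
\end{verbatim}

With these numbers the MLP grows by only 20 parameters per added latent dimension, while the SOM grows tenfold.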

