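According to the general FA model, the data **x** have been generated by the factors **s** through a mapping **f**, with an additive noise term **e**:

$$ \mathbf{x}(t) = \mathbf{f}(\mathbf{s}(t)) + \mathbf{e}(t) $$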

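where the components of the factors and of the noise are Gaussian:

$$
\begin{aligned}
s_l(t) &\sim N(0, 1) \\
e_k(t) &\sim N(0, \sigma_k^2)
\end{aligned}
\qquad (4)
$$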
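
As a rough illustration of the generative view described above, the following sketch draws samples from such a model. It is a minimal sketch under stated assumptions, not the authors' implementation: the mapping is assumed to be linear with a hypothetical mixing matrix `A`, the factors are assumed to have unit variance, and the noise is assumed to be independent Gaussian with a hypothetical per-dimension standard deviation `noise_std`. The function name `sample_linear_fa` is likewise made up for this example.

```python
import random


def sample_linear_fa(A, noise_std, n_samples, seed=0):
    """Draw samples x(t) = A s(t) + e(t) with s_l(t) ~ N(0, 1).

    A is a hypothetical mixing matrix given as a list of rows, one row per
    observed dimension. noise_std[k] is the assumed noise standard
    deviation of observed dimension k.
    """
    rng = random.Random(seed)
    n_obs, n_factors = len(A), len(A[0])
    factors, data = [], []
    for _ in range(n_samples):
        # Factors: independent zero-mean, unit-variance Gaussians (assumed).
        s = [rng.gauss(0.0, 1.0) for _ in range(n_factors)]
        # Observation k: linear combination of the factors plus Gaussian noise.
        x = [
            sum(A[k][l] * s[l] for l in range(n_factors))
            + rng.gauss(0.0, noise_std[k])
            for k in range(n_obs)
        ]
        factors.append(s)
        data.append(x)
    return factors, data


# Two factors generating three observed dimensions (illustrative values only).
A = [[1.0, 0.0], [0.5, -1.0], [2.0, 0.3]]
S, X = sample_linear_fa(A, noise_std=[0.1, 0.1, 0.1], n_samples=5)
```

Replacing the linear combination inside the loop with a nonlinear function of `s` would give a nonlinear variant of the same sampling scheme.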

The linear mapping **f** used in FA is

The model is similar to principal component analysis except that FA includes the noise term and the factors have a Gaussian distribution. In NFA, the function

is used to model the nonlinearity. The parameter vector contains both

In NFA, the data are modelled by a high-dimensional manifold that the function **f** creates from a Gaussian prior distribution. NFA can be compared to the self-organising map (SOM) [5], but its number of parameters scales more like that of FA, whereas the number of parameters in the SOM grows exponentially with the dimensionality of the underlying data manifold. A small number of parameters keeps the modelled manifold smooth. We find the parameter vector using ensemble learning.
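
The scaling argument above can be made concrete with a small parameter count. The sketch below is only illustrative and rests on assumptions that are not part of the text: the SOM is taken to be a regular grid with `units_per_axis` units along each manifold dimension, each unit holding one weight vector of the data dimension, and the FA-like model is taken to be linear, with one mixing-matrix column per factor plus one noise variance per observed dimension. Both function names are hypothetical.

```python
def som_parameter_count(data_dim, manifold_dim, units_per_axis):
    """Weights in a regular SOM grid (assumed layout).

    The grid has units_per_axis ** manifold_dim units, and each unit stores
    a weight vector with data_dim components.
    """
    return data_dim * units_per_axis ** manifold_dim


def linear_fa_parameter_count(data_dim, n_factors):
    """Parameters of a linear FA-like model (assumed parameterisation).

    Counts the data_dim x n_factors mixing matrix plus one noise variance
    per observed dimension.
    """
    return data_dim * n_factors + data_dim


# Compare growth as the manifold dimensionality increases (illustrative sizes).
for d in range(1, 6):
    som = som_parameter_count(data_dim=20, manifold_dim=d, units_per_axis=10)
    fa = linear_fa_parameter_count(data_dim=20, n_factors=d)
    print(f"dimension {d}: SOM {som} parameters, FA-like {fa} parameters")
```

Under these assumptions the SOM count is multiplied by `units_per_axis` for every added manifold dimension, while the FA-like count grows only linearly.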