We start with a toy problem that is easy to visualise. A set of 1000
data points, shown on the left of Fig. 1, was generated by mapping a
normally distributed source *s* into a helical subspace. The
mappings onto the x-, y- and z-axes were *x* = sin(π*s*), *y* = cos(π*s*)
and *z* = *s*. Gaussian noise with standard deviation 0.05 was added
to all three data components.
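
As a concrete illustration, data of this kind can be generated along the following lines (a minimal sketch in Python; the random seed and variable names are our own choices, not part of the original setup):

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

n = 1000
s = rng.standard_normal(n)      # normally distributed source

# Map the one-dimensional source into the helical subspace
x = np.sin(np.pi * s)
y = np.cos(np.pi * s)
z = s
data = np.stack([x, y, z], axis=1)

# Add Gaussian noise with standard deviation 0.05 to all three components
data += 0.05 * rng.standard_normal(data.shape)
```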

Several different initialisations of MLP networks with different
numbers of hidden neurons were tested, and the network that produced
the lowest value of the cost function was chosen. The best network
had 16 hidden neurons and estimated the noise levels of the three
data components to be 0.052, 0.055 and 0.050. The outputs
of the network for each estimated value of the source
signal^{1} together with the original helical subspace
are shown on the right of Fig. 1. It is evident that the
network has learned to represent the underlying one-dimensional
subspace and has been able to separate the signal from noise.
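
The cost function used here comes from ensemble learning, which is beyond the scope of a short sketch. Purely to illustrate the outer model-selection loop, the following stand-in trains bottleneck autoencoders (scikit-learn's MLPRegressor, a swapped-in substitute for the generative MLPs used above) over several initialisations and hidden-layer sizes, keeping the network with the lowest final training loss; the candidate sizes and seeds are arbitrary:

```python
from sklearn.neural_network import MLPRegressor

best_net, best_loss = None, float("inf")

# Try several hidden-layer sizes and random initialisations,
# keeping the network with the lowest final cost.
for n_hidden in (8, 16, 32):
    for seed in range(5):
        net = MLPRegressor(
            hidden_layer_sizes=(n_hidden, 1, n_hidden),  # 1-unit bottleneck for the 1-D source
            activation="tanh",
            max_iter=5000,
            random_state=seed,
        )
        net.fit(data, data)           # autoencode: reconstruct the data from itself
        if net.loss_ < best_loss:
            best_net, best_loss = net, net.loss_

denoised = best_net.predict(data)     # outputs should lie close to the noiseless helix
```

The 1-unit bottleneck plays the role of the estimated source signal: forcing the reconstruction through it constrains the outputs to a one-dimensional curve, which is how the signal is separated from the noise.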