PCA can be generalized to form nonlinear curves. Whereas PCA constructs
a good projection of a data set onto a linear manifold, the
goal in constructing a principal curve is to project the set onto a
nonlinear manifold. Principal curves [Hastie and Stuetzle, 1989] are smooth
curves that are defined by the property that each point of the curve
is the average of all data points that project to it, i.e., for which
that point is the closest point on the curve. Intuitively speaking,
the curves pass through the ``center'' of the data set. Principal
curves are generalizations of principal components extracted using PCA
in the sense that a linear principal curve is a principal component;
the connections between the two methods are delineated more carefully
in the original article. Although the extracted structures are called
principal *curves*, the generalization to surfaces seems relatively
straightforward; the resulting algorithms will, however, become
computationally more intensive.
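
The defining property described above can be written compactly. The notation here is an assumption introduced for illustration: $f(\lambda)$ denotes the curve parameterized by $\lambda$, $X$ the random data vector, and $\lambda_f(x)$ the projection index, i.e., the parameter value of the point on the curve closest to $x$.

```latex
% Projection index: the parameter of the curve point closest to x
% (the largest such parameter if there are ties).
\lambda_f(x) = \sup \Bigl\{ \lambda : \| x - f(\lambda) \| = \inf_{\mu} \| x - f(\mu) \| \Bigr\}

% Self-consistency: each point of the curve is the average of the
% data that project to it.
f(\lambda) = E\bigl[ X \mid \lambda_f(X) = \lambda \bigr]
```

A curve satisfying the second equation for every $\lambda$ is a principal curve in this sense. When $f$ is restricted to straight lines, the condition singles out principal component directions, which is the sense in which a linear principal curve is a principal component.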

The concept of continuous principal curves may aid in understanding how principal components could be sensibly generalized. To be useful in practical computations, however, the curves must be discretized. It has turned out [Mulier and Cherkassky, 1995, Ritter et al., 1992] that discretized principal curves are essentially equivalent to SOMs, which were introduced before Hastie and Stuetzle [1989] proposed principal curves. It thus seems that the concept of principal curves is most useful in providing one possible viewpoint on the properties of the SOM algorithm.
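
To make the discretization concrete, the following is a minimal sketch, not the algorithm of any of the cited works. It represents the curve as an ordered chain of nodes and alternates a projection step (each data point is assigned to its closest node) with an averaging step (each node becomes a weighted mean of the data assigned to it and to its neighbors along the chain). The function name `fit_discrete_curve`, its parameters, the Gaussian neighborhood weighting, and the linearly shrinking width are all assumptions chosen for illustration; the update has the same form as a batch-type SOM update on a one-dimensional map.

```python
import numpy as np

def fit_discrete_curve(data, n_nodes=10, n_iter=30, width=2.0):
    """Fit an ordered chain of nodes to data (illustrative sketch only).

    data    : array of shape (n_points, dim)
    n_nodes : number of nodes discretizing the curve (hypothetical parameter)
    n_iter  : number of projection/averaging iterations
    width   : initial neighborhood width, measured in node indices
    """
    data = np.asarray(data, dtype=float)

    # Initialize the nodes on the first principal component line.
    mean = data.mean(axis=0)
    centered = data - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    t = np.linspace(scores.min(), scores.max(), n_nodes)
    nodes = mean + np.outer(t, vt[0])

    index = np.arange(n_nodes)
    for it in range(n_iter):
        # Shrink the neighborhood width linearly from `width` to 0.5.
        sigma = width + (0.5 - width) * it / max(n_iter - 1, 1)

        # Projection step: find the closest node for every data point.
        d2 = ((data[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
        winner = d2.argmin(axis=1)

        # Averaging step: each node becomes the weighted mean of the data,
        # weighted by chain distance between the node and each point's winner.
        h = np.exp(-0.5 * ((index[:, None] - winner[None, :]) / sigma) ** 2)
        nodes = (h @ data) / h.sum(axis=1, keepdims=True)
    return nodes

if __name__ == "__main__":
    # Usage example on synthetic data: a noisy half circle.
    rng = np.random.default_rng(0)
    theta = rng.uniform(0.0, np.pi, 300)
    points = np.column_stack([np.cos(theta), np.sin(theta)])
    points += 0.05 * rng.normal(size=points.shape)
    print(fit_discrete_curve(points))
```

With a very narrow neighborhood the averaging step reduces to taking the mean of the points that project to each node, which is a discrete counterpart of the averaging property that defines principal curves; the wider neighborhood used in early iterations acts as a smoother that keeps the chain ordered.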

Mon Mar 31 23:43:35 EET DST 1997