Nonlinear extensions of factor analysis (FA) can be divided into at least three categories: 1) using multiple clusters, each of which resembles FA locally; 2) mapping the data nonlinearly to a higher-dimensional feature space and then performing linear computations there; 3) finding the nonlinear shape of the data distribution directly. The model used in this thesis belongs to the third category.
Examples from each category are described in turn. From the first category, a mixture model is considered. The self-organising map falls somewhere between the first and the third category. Nonlinear Hebbian learning [54,53] and nonlinear component analysis (NCA), also known as kernel PCA, are examples of the second category. Examples from the third category include nonlinear factor analysis and principal curves [23]. Much of the previous work on nonlinear supervised learning can also be applied to nonlinear factor analysis, and it is therefore reviewed as background.
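To make the second category concrete, the following is a minimal sketch of kernel PCA in Python with NumPy. It is an illustration under stated assumptions, not the formulation used in the cited works: the Gaussian (RBF) kernel, the value of `gamma`, the function names `rbf_kernel` and `kernel_pca`, and the two-ring toy data are all chosen here for the example. The nonlinear feature mapping is never computed explicitly; it enters only through the kernel matrix, and the linear step is an ordinary eigendecomposition of that matrix after centring.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian kernel matrix; the kernel choice is an assumption of this sketch."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative distances caused by rounding.
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kernel_pca(X, n_components, gamma=1.0):
    """Project the training points onto the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)

    # Centre the data in feature space (implicitly, via the kernel matrix).
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one

    # Linear computation in feature space: eigendecomposition of the centred kernel.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Coordinates of the training points along each component.
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))


# Hypothetical toy data: two noisy concentric rings, which linear PCA cannot separate.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=100)
radii = np.where(np.arange(100) < 50, 1.0, 3.0)
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
X += 0.05 * rng.standard_normal(X.shape)

Z = kernel_pca(X, n_components=2, gamma=0.5)
```

Here `Z` holds one row per data point and one column per extracted component. The columns are uncorrelated and have zero mean, as in linear PCA, but they are nonlinear functions of the original coordinates.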