Rigorous mathematical treatment of the SOM algorithm has turned out to be extremely difficult in general (reviews have been provided by Kangas, 1994; and Kohonen, 1995c). In the case of a discrete data set and a fixed neighborhood kernel, however, there exists a potential function for the SOM, namely [Kohonen, 1991, Ritter and Schulten, 1988]

$$E = \sum_{t} \sum_{i} h_{c(t),\,i} \, \lVert x(t) - m_i \rVert^2 , \qquad (7)$$

where the index *c* = *c*(*t*) depends on the sample *x*(*t*) and the reference vectors *m_i*
(cf. Eq. 5).

The learning rule of the SOM, Equation 6, corresponds to a gradient descent step in minimizing the sample function

$$E_1 = \sum_{i} h_{ci} \, \lVert x(t) - m_i \rVert^2 ,$$

obtained by randomly selecting a sample *x*(*t*) at iteration
*t*. The learning rule then corresponds to a step in the stochastic
approximation of the minimum of Equation 7, as discussed by
Kohonen (1995c).
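To make the correspondence concrete, the following is a minimal NumPy sketch (function and variable names are illustrative, not from the original): with the winner index *c* held fixed, one application of the SOM learning rule equals a gradient descent step on the sample function (written `E_1` in the comments below), since the gradient of that function with respect to each reference vector is -2 h_ci (x - m_i).

```python
import numpy as np

def winner(x, M):
    # Best-matching unit c = argmin_i ||x - m_i|| (cf. Eq. 5)
    return int(np.argmin(np.linalg.norm(M - x, axis=1)))

def sample_energy(x, M, h):
    # Sample function E_1 = sum_i h[c, i] * ||x - m_i||^2
    c = winner(x, M)
    return float(np.sum(h[c] * np.sum((M - x) ** 2, axis=1)))

def som_step(x, M, h, alpha):
    # SOM learning rule (Eq. 6): m_i <- m_i + alpha * h[c, i] * (x - m_i).
    # With c held fixed this is gradient descent on E_1, because
    # dE_1/dm_i = -2 * h[c, i] * (x - m_i); the step is -(alpha/2) * gradient.
    c = winner(x, M)
    return M + alpha * h[c][:, None] * (x - M)
```

A finite-difference check of dE_1/dm_i against -2 h_ci (x - m_i) confirms the correspondence numerically, provided the perturbation is small enough that the winner index does not change.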

*Note:* In Equation 7 the index *c* is a function of
all the reference vectors, which implies that it may change when the
gradient descent step is taken. Locally, however, the gradient step is
valid as long as the index *c* does not change for any sample *x*(*t*).
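The local validity can be sketched as a short derivation (assuming the winner index *c* of Eq. 5 stays fixed in a neighborhood of the current reference vectors):

```latex
% Sample function with the winner index c treated as a constant:
%   E_1 = \sum_i h_{ci} \, \lVert x(t) - m_i \rVert^2
\frac{\partial E_1}{\partial m_i} = -2\, h_{ci} \bigl( x(t) - m_i \bigr),
\qquad
m_i \;\leftarrow\; m_i - \frac{\alpha}{2} \frac{\partial E_1}{\partial m_i}
  \;=\; m_i + \alpha\, h_{ci} \bigl( x(t) - m_i \bigr),
```

which is exactly the form of the SOM learning rule (Eq. 6) with learning-rate factor *α*. Once a step moves the reference vectors far enough that the argmin in Eq. 5 switches, the function being descended changes, which is why the gradient interpretation holds only locally.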
