Competitive learning is an adaptive process in which the neurons in a neural network gradually become sensitive to different input categories, sets of samples in a specific domain of the input space [Amari, 1980, Didday, 1970, Didday, 1976, Grossberg, 1976, Kohonen, 1982, Kohonen, 1984, Nass and Cooper, 1975, Pérez et al., 1975, Swindale, 1980, von der Malsburg, 1973]. A kind of division of labor emerges in the network when different neurons specialize to represent different types of inputs.
The specialization is enforced by competition among the neurons:
when an input arrives, the neuron that is best able to represent
it wins the competition and is allowed to learn it even better, as
will be described below.
If there exists an ordering between the neurons, i.e., the neurons are located on a discrete lattice, the competitive learning algorithm can be generalized to yield the self-organizing map: if not only the winning neuron but also its neighbors on the lattice are allowed to learn, neighboring neurons will gradually specialize to represent similar inputs, and the representations will become ordered on the map lattice. This is the essence of the SOM algorithm.
The neurons represent the inputs with reference vectors $\mathbf{m}_i$, the components of which correspond to synaptic weights. One reference vector is associated with each neuron, called a unit in a more abstract setting. The unit, indexed with $c$, whose reference vector is nearest to the input $\mathbf{x}$ is the winner of the competition:

$$\| \mathbf{x} - \mathbf{m}_c \| = \min_i \{ \| \mathbf{x} - \mathbf{m}_i \| \} \qquad (5)$$

Usually the Euclidean metric is used, although other choices are possible as well.
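For concreteness, winner selection by Equation 5 amounts to a nearest-neighbor search over the reference vectors. A minimal NumPy sketch, assuming the Euclidean metric and an array layout with one reference vector per row (the name find_winner is illustrative, not from the source):

```python
import numpy as np

def find_winner(x, m):
    """Return the index c of the winning unit (Equation 5).

    x : (d,) current input vector
    m : (n_units, d) reference vectors, one row per unit
    """
    # Euclidean distance from the input to every reference vector;
    # the unit with the nearest reference vector wins.
    distances = np.linalg.norm(m - x, axis=1)
    return int(np.argmin(distances))
```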
The winning unit and its neighbors adapt to represent the input even better by modifying their reference vectors towards the current input. The amount the units learn will be governed by a neighborhood kernel $h$, which is a decreasing function of the distance of the units from the winning unit on the map lattice. If the locations of units $i$ and $j$ on the map grid are denoted by the two-dimensional vectors $\mathbf{r}_i$ and $\mathbf{r}_j$, respectively, then $h_{ij}(t) = h(\| \mathbf{r}_i - \mathbf{r}_j \|; t)$, where $t$ denotes time.
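The text requires only that $h$ decrease with lattice distance; a Gaussian kernel with height $\alpha(t)$ and width $\sigma(t)$ is one common concrete choice (these symbols and the function below are illustrative assumptions, not from the source). Continuing the NumPy sketch above:

```python
def neighborhood(r_c, r, alpha, sigma):
    """Gaussian neighborhood kernel h_ci(t), one common choice.

    r_c   : (2,) lattice location of the winning unit
    r     : (n_units, 2) lattice locations of all units
    alpha : kernel height at time t (learning-rate factor)
    sigma : kernel width at time t
    """
    # h decreases with the squared lattice distance ||r_c - r_i||^2
    d2 = np.sum((r - r_c) ** 2, axis=1)
    return alpha * np.exp(-d2 / (2.0 * sigma ** 2))
```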
During the learning process at time $t$ the reference vectors are changed iteratively according to the following adaptation rule, where $\mathbf{x}(t)$ is the input at time $t$ and $c = c(\mathbf{x}(t))$ is the index of the winning unit:

$$\mathbf{m}_i(t+1) = \mathbf{m}_i(t) + h_{ci}(t)\,[\mathbf{x}(t) - \mathbf{m}_i(t)] \qquad (6)$$
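Continuing the sketch, the adaptation rule of Equation 6 becomes a single vectorized update over all units (the helper name adapt is illustrative):

```python
def adapt(m, x, h):
    """Adaptation rule of Equation 6, applied to all units at once:
    m_i(t+1) = m_i(t) + h_ci(t) [x(t) - m_i(t)].

    m : (n_units, d) reference vectors, modified in place
    x : (d,) current input x(t)
    h : (n_units,) kernel values h_ci(t) for the current winner c
    """
    m += h[:, np.newaxis] * (x - m)
```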
In practice the neighborhood kernel is chosen to be wide at the beginning of the learning process to guarantee global ordering of the map, and both its width and height decrease slowly during learning.
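Putting the pieces together, a complete training loop might look as follows; the rectangular grid, random initialization, and the linear decay schedules for the kernel height and width are illustrative assumptions, not prescribed by the text, and the loop reuses the helpers sketched above.

```python
def train_som(data, grid_shape=(10, 10), n_steps=10000,
              alpha0=0.5, sigma0=5.0, rng=None):
    """Train a SOM on data of shape (n_samples, d); returns (m, r)."""
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = grid_shape
    # Lattice locations r_i of the units on a rectangular grid
    r = np.array([(i, j) for i in range(rows) for j in range(cols)],
                 dtype=float)
    # Random initial reference vectors within the range of the data
    m = rng.uniform(data.min(axis=0), data.max(axis=0),
                    size=(rows * cols, data.shape[1]))
    for t in range(n_steps):
        frac = t / n_steps
        alpha = alpha0 * (1.0 - frac)        # kernel height decreases
        sigma = sigma0 * (1.0 - frac) + 1.0  # wide at first, then narrow
        x = data[rng.integers(len(data))]    # input x(t)
        c = find_winner(x, m)                # winner selection (Eq. 5)
        h = neighborhood(r[c], r, alpha, sigma)
        adapt(m, x, h)                       # adaptation (Eq. 6)
    return m, r
```

For example, m, r = train_som(np.random.rand(1000, 3)) would organize a 10 by 10 map of three-dimensional inputs under these assumed schedules.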
The learning process, consisting of winner selection by Equation 5 and adaptation of the synaptic weights by Equation 6, can be modeled with a neural network structure in which the neurons are coupled by inhibitory connections [Kaski and Kohonen, 1994, Kohonen, 1993].