Some local minima can be avoided by occasionally resetting the neuron
activities, a procedure called ``rebooting''. The time-independent
parameters and weights are left as they were, but the sources s and
u are set to their means:
s(t) = \bar{s}     (6.6)
u(t) = \bar{u}     (6.7)
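As a rough sketch (hypothetical function and variable names; assuming the source activities are stored as NumPy arrays over time steps and "mean" is the mean activity of each source), rebooting could look like:

```python
import numpy as np

def reboot(s, u, s_mean, u_mean):
    """Reset the source activities s and u to their means, leaving
    all time-independent parameters and weights untouched."""
    s[:] = s_mean  # broadcast each source's mean over all time steps
    u[:] = u_mean
    return s, u

# Example: 100 time steps, 5 sources each
s = np.random.randn(100, 5)
u = np.random.randn(100, 5)
s, u = reboot(s, u, s.mean(axis=0), u.mean(axis=0))
```

After the reset every time step carries the same (mean) activity, so the optimisation continues from a neutral state of the sources rather than from the activities that led into the local minimum.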
If plenty of data from the same data set is available, rebooting is also a good opportunity to change the data. It can be better to run more iterations with less data than the other way around, and switching the data samples from time to time still preserves some of the benefits of the larger data set. A network that performs well on new data has, by definition, good generalisation capability.
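One hypothetical way to switch the data samples from time to time, assuming the data fit in a NumPy array (a sketch of the idea, not the exact procedure used here):

```python
import numpy as np

def rotating_subsets(data, subset_size, n_rounds, seed=0):
    """Yield a different random subset of the data each round, so
    each round of iterations stays cheap while, over many rounds,
    most of the full data set is still visited."""
    rng = np.random.default_rng(seed)
    for _ in range(n_rounds):
        idx = rng.choice(len(data), size=subset_size, replace=False)
        yield data[idx]

# Example: 200 samples of dimension 5, trained on 50 at a time
data = np.arange(1000.0).reshape(200, 5)
for subset in rotating_subsets(data, subset_size=50, n_rounds=3):
    pass  # run some training iterations on this subset
```

Drawing each subset without replacement keeps the samples within one round distinct, while reseeding or continuing the generator across rounds varies which part of the data set the network sees.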