Temporal constraints on visual learning: a computational model.
Stone JV, Harper N.
Given a constant stream of perceptual stimuli, how can the underlying invariances associated with a given input be learned? One approach is to use generic truths about the spatiotemporal structure of the physical world as constraints on the types of quantities learned. The learning methodology employed here embodies one such truth: that perceptually salient properties (such as stereo disparity) tend to vary smoothly over time. Unfortunately, the units of an artificial neural network tend to encode superficial image properties, such as individual grey-level pixel values, which vary rapidly over time. However, if the states of units are constrained to vary slowly, then the network is forced to learn a smoothly varying function of the training data. We implemented this temporal-smoothness constraint in a backpropagation network which learned stereo disparity from random-dot stereograms. Temporal smoothness was formalised using regularisation theory, by modifying the standard cost function minimised during network training. The temporal-smoothness constraint was found to improve generalisation in a manner similar to other techniques, such as early stopping and weight decay. Unlike these techniques, however, the theoretical underpinnings of temporal smoothing are intimately related to fundamental characteristics of the physical world. Results are discussed in terms of regularisation theory and the physically realistic assumptions upon which temporal smoothing is based.
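The abstract does not give the exact form of the modified cost function, so the following is only a minimal sketch of the general idea rather than the paper's implementation: a temporal-smoothness penalty (here assumed to be the mean squared frame-to-frame change in the network's output, weighted by a hypothetical parameter lam) is added to the usual supervised error before gradient descent. A single linear output unit and synthetic data stand in for the backpropagation network and the random-dot stereograms used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy temporal sequence: T frames of N-dimensional input (standing in for
# stereogram-derived features) and a slowly varying target "disparity".
T, N = 200, 20
t = np.linspace(0.0, 2.0 * np.pi, T)
disparity = np.sin(t)                    # smoothly varying perceptual property
X = rng.normal(size=(T, N))              # rapidly varying "pixel-like" inputs
X[:, 0] += disparity                     # embed the slow signal in one noisy channel

lam = 0.5                                # hypothetical weight of the smoothness term
w = rng.normal(scale=0.1, size=N)        # single linear output unit

def cost_and_grad(w):
    y = X @ w                            # network output for each frame
    err = y - disparity                  # supervised disparity error
    dy = np.diff(y)                      # frame-to-frame change in output
    dX = np.diff(X, axis=0)
    # Standard squared error plus an assumed temporal-smoothness penalty on the output.
    cost = np.mean(err ** 2) + lam * np.mean(dy ** 2)
    grad = (2.0 / T) * (X.T @ err) + lam * (2.0 / (T - 1)) * (dX.T @ dy)
    return cost, grad

# Plain gradient descent on the regularised cost.
for _ in range(500):
    _, g = cost_and_grad(w)
    w -= 0.05 * g

final_cost, _ = cost_and_grad(w)
print(f"final regularised cost: {final_cost:.4f}")
```

Because the penalty discourages rapid changes in the output, gradient descent favours weights that track the slowly varying disparity-like signal rather than the rapidly varying pixel-like inputs, which is the intuition the abstract describes.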