A Theory of Probabilistic Representation In Spiking Networks
Tony Bell
Redwood Neuroscience Institute
Menlo Park, CA (tbell@rni.org)
Abstract
Neurons spike, but the probabilistic learning machines known as
artificial neural networks do not. If we want to create a theoretical
neuroscience, or to extend further the explanatory power of statistical
theory, we need models of how spikes represent probability
distributions, individually and collectively. When we view the linear
`integrate-and-fire' model neuron in this light, we find that every
spike represents a hyperplane in the space-time input space of the
synaptic events that caused it. A collection of spikes from several
neurons represents a set of possibly intersecting hyperplanes, and if
there are enough of them, the input can be reconstructed exactly by
inverting a huge matrix. This is simply `coarse coding' at the spike
level rather than at the neuron level.
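As a toy illustration (a numpy sketch, not a construction from the talk
itself): if spike i of a linear integrate-and-fire neuron fires exactly
when its weighted input reaches threshold, it contributes the hyperplane
constraint a_i . x = theta on the unknown space-time input vector x, and
enough such constraints pin x down as an invertible linear system. All
sizes, names, and the shared threshold below are illustrative assumptions.

    # Toy reconstruction: each spike i contributes one hyperplane constraint
    # a_i . x = theta; with >= d independent constraints, x is recovered by
    # solving the stacked linear system (here via least squares).
    import numpy as np

    rng = np.random.default_rng(0)
    d, n_spikes, theta = 50, 80, 1.0          # illustrative sizes and threshold
    x_true = rng.normal(size=d)               # hidden space-time input vector
    A = rng.normal(size=(n_spikes, d))        # row i: weights active at spike i
    A *= (theta / (A @ x_true))[:, None]      # make each hyperplane pass through x_true

    x_hat, *_ = np.linalg.lstsq(A, np.full(n_spikes, theta), rcond=None)
    print(np.allclose(x_hat, x_true))         # True: enough spikes fix the input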
If there are *not* enough spikes, we can (1) include the information in
the `silences' between the spikes: this leads us to a large-scale linear
programming problem which has a unique sparse solution.
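A minimal sketch of option (1), under illustrative assumptions rather
than the talk's own formulation: spikes give equalities a_i . x = theta,
silences give inequalities a_j . x <= theta, and minimising the L1 norm
of x, a standard linear-programming encoding of sparsity, selects a
sparse solution. The sizes, the scipy solver, and the way the toy data
are made consistent are all assumptions.

    # Option (1) as an LP: equality rows from spikes, inequality rows from
    # silences; minimise ||x||_1 via the split x = xp - xn with xp, xn >= 0.
    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(1)
    d, theta = 30, 1.0
    A_spk = rng.normal(size=(10, d))          # too few spikes to invert (10 < 30)
    A_sil = rng.normal(size=(40, d))          # constraints from the silences

    x_true = np.zeros(d)                      # sparse ground-truth input
    x_true[rng.choice(d, size=5, replace=False)] = rng.normal(size=5)
    A_spk *= (theta / (A_spk @ x_true))[:, None]              # hyperplanes through x_true
    A_sil *= (0.5 * theta / np.abs(A_sil @ x_true))[:, None]  # strictly sub-threshold

    c = np.ones(2 * d)                        # objective: sum(xp) + sum(xn) = ||x||_1
    res = linprog(c,
                  A_ub=np.hstack([A_sil, -A_sil]), b_ub=np.full(40, theta),
                  A_eq=np.hstack([A_spk, -A_spk]), b_eq=np.full(10, theta),
                  bounds=(0, None))           # feasible by construction (x_true satisfies all rows)
    x_hat = res.x[:d] - res.x[d:]             # close to x_true when L1 recovery succeeds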
And/or we can (2) assume parameterised model distributions on the
hyperplanes: this leads us to a very high-dimensional form of
approximate Bayesian inference related to Generalised Belief
Propagation.
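A deliberately simple caricature of option (2): assume Gaussian model
distributions, with prior x ~ N(0, I) and each spike read as
theta = a_i . x + noise, so the posterior is closed-form. The talk's
Generalised Belief Propagation setting is far more general; this sketch
only shows the overall shape of the inference.

    # Gaussian caricature of option (2): linear-Gaussian model, closed-form
    # posterior over the under-determined input x given the spike hyperplanes.
    import numpy as np

    rng = np.random.default_rng(2)
    d, n_spikes, theta, sigma = 30, 10, 1.0, 0.1   # illustrative sizes and noise
    A = rng.normal(size=(n_spikes, d))             # spike hyperplane normals

    # Posterior for prior x ~ N(0, I), likelihood theta = a_i . x + N(0, sigma^2):
    S = np.linalg.inv(np.eye(d) + (A.T @ A) / sigma**2)       # posterior covariance
    x_mean = S @ (A.T @ np.full(n_spikes, theta)) / sigma**2  # posterior mean
    print(x_mean[:5])                              # mean estimate of the input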
It is remarkable that such ideas from the forefront of machine
learning arise in trying to understand neural spike
coding. Furthermore, a more realistic neural model with synaptic
conductance inputs, which are nonlinear, leads to corresponding forms
of nonlinear programming and belief propagation on curved manifolds,
which (presumably) lie beyond current theory.
That's just to solve the representation problem (what spikes are
`saying' about their inputs). In the learning problem, because of the
need to estimate the gradient of the log partition function with
respect to the weights, an intriguing connection opens up with Hinton's
theory of Contrastive Divergence. Basically, in order to know how to
change our brains, we have to watch and dream at the same time.
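For concreteness, a minimal CD-1 sketch in Hinton's standard restricted
Boltzmann machine setting (not anything specific to spiking neurons):
the `watch' term is a data-driven correlation, the `dream' term comes
from a one-step Gibbs reconstruction, and their difference approximates
the intractable gradient of the log partition function. The sizes, the
toy data, and the omission of biases are simplifications.

    # CD-1 for a binary RBM (biases omitted for brevity): the weight update is
    # <v h>_data ("watch") minus <v h>_reconstruction ("dream").
    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    n_vis, n_hid, lr = 20, 8, 0.1
    W = 0.01 * rng.normal(size=(n_vis, n_hid))
    v_data = (rng.random((100, n_vis)) < 0.3).astype(float)   # toy binary data

    for _ in range(200):
        p_h = sigmoid(v_data @ W)                     # positive ("watch") phase
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_v = sigmoid(h @ W.T)                        # one-step Gibbs "dream"
        p_h_recon = sigmoid(p_v @ W)
        W += lr * (v_data.T @ p_h - p_v.T @ p_h_recon) / len(v_data)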
That's for the feedforward density estimation theory, which of course
is wrong because the brain is in a loop with the world. Some thoughts
on this latter case will be presented (if there's time).
tbell@rni.org or tony@salk.edu, 650-321-8282 x238