A Theory of Probabilistic Representation In Spiking Networks
Tony Bell
Redwood Neuroscience Institute
Menlo Park, CA (tbell@rni.org)
Abstract
Neurons spike, but the probabilistic learning machines known as
artificial neural networks do not. If we want to create a theoretical
neuroscience, or to extend further the explanatory power of statistical
theory, we need models of how spikes represent probability
distributions, individually and collectively. When we view the linear
`integrate-and-fire' model neuron in this light, we find that every
spike represents a hyperplane in the space-time input space of the
synaptic events that caused it. A collection of spikes from several
neurons represents a set of possibly intersecting hyperplanes, and if
there are enough of them, the input can be reconstructed exactly by
inverting a huge matrix. This is simply `coarse coding' at the spike
level rather than at the neuron level.
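As a toy illustration (a numpy sketch, not a construction from the talk
itself): if spike i of a linear integrate-and-fire neuron fires exactly
when its weighted input reaches threshold, it contributes the hyperplane
constraint a_i . x = theta on the unknown space-time input vector x, and
enough such constraints pin x down as an invertible linear system. All
sizes, names, and the shared threshold below are illustrative assumptions.

    # Toy reconstruction: each spike i contributes one hyperplane constraint
    # a_i . x = theta; with >= d independent constraints, x is recovered by
    # solving the stacked linear system (here via least squares).
    import numpy as np

    rng = np.random.default_rng(0)
    d, n_spikes, theta = 50, 80, 1.0          # illustrative sizes and threshold
    x_true = rng.normal(size=d)               # hidden space-time input vector
    A = rng.normal(size=(n_spikes, d))        # row i: weights active at spike i
    A *= (theta / (A @ x_true))[:, None]      # make each hyperplane pass through x_true

    x_hat, *_ = np.linalg.lstsq(A, np.full(n_spikes, theta), rcond=None)
    print(np.allclose(x_hat, x_true))         # True: enough spikes fix the input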
If there are *not* enough spikes, we can (1) include the information in
the `silences' between the spikes: this leads us to a large-scale linear
programming problem which has a unique sparse solution.
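A minimal sketch of option (1), under illustrative assumptions rather
than the talk's own formulation: spikes give equalities a_i . x = theta,
silences give inequalities a_j . x <= theta, and minimising the L1 norm
of x, a standard linear-programming encoding of sparsity, selects a
sparse solution. The sizes, the scipy solver, and the way the toy data
are made consistent are all assumptions.

    # Option (1) as an LP: equality rows from spikes, inequality rows from
    # silences; minimise ||x||_1 via the split x = xp - xn with xp, xn >= 0.
    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(1)
    d, theta = 30, 1.0
    A_spk = rng.normal(size=(10, d))          # too few spikes to invert (10 < 30)
    A_sil = rng.normal(size=(40, d))          # constraints from the silences

    x_true = np.zeros(d)                      # sparse ground-truth input
    x_true[rng.choice(d, size=5, replace=False)] = rng.normal(size=5)
    A_spk *= (theta / (A_spk @ x_true))[:, None]              # hyperplanes through x_true
    A_sil *= (0.5 * theta / np.abs(A_sil @ x_true))[:, None]  # strictly sub-threshold

    c = np.ones(2 * d)                        # objective: sum(xp) + sum(xn) = ||x||_1
    res = linprog(c,
                  A_ub=np.hstack([A_sil, -A_sil]), b_ub=np.full(40, theta),
                  A_eq=np.hstack([A_spk, -A_spk]), b_eq=np.full(10, theta),
                  bounds=(0, None))           # feasible by construction (x_true satisfies all rows)
    x_hat = res.x[:d] - res.x[d:]             # close to x_true when L1 recovery succeeds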
And/or we can (2) assume parameterised model distributions on the
hyperplanes: this leads us to a very high-dimensional form of
approximate Bayesian inference related to Generalised Belief
Propagation.
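A deliberately simple caricature of option (2): assume Gaussian model
distributions, with prior x ~ N(0, I) and each spike read as
theta = a_i . x + noise, so the posterior is closed-form. The talk's
Generalised Belief Propagation setting is far more general; this sketch
only shows the overall shape of the inference.

    # Gaussian caricature of option (2): linear-Gaussian model, closed-form
    # posterior over the under-determined input x given the spike hyperplanes.
    import numpy as np

    rng = np.random.default_rng(2)
    d, n_spikes, theta, sigma = 30, 10, 1.0, 0.1   # illustrative sizes and noise
    A = rng.normal(size=(n_spikes, d))             # spike hyperplane normals

    # Posterior for prior x ~ N(0, I), likelihood theta = a_i . x + N(0, sigma^2):
    S = np.linalg.inv(np.eye(d) + (A.T @ A) / sigma**2)       # posterior covariance
    x_mean = S @ (A.T @ np.full(n_spikes, theta)) / sigma**2  # posterior mean
    print(x_mean[:5])                              # mean estimate of the input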
It is remarkable that such ideas from the forefront of machine
learning arise in trying to understand neural spike
coding. Furthermore, a more realistic neural model with synaptic
conductance inputs, which are nonlinear, leads to corresponding forms
of nonlinear programming and belief propagation on curved manifolds,
which (presumably) lie beyond current theory.
That's just to solve the representation problem (what spikes are
`saying' about their inputs). In the learning problem, because of the
need to estimate the gradient of the log partition function with
respect to the weights, an intriguing connection opens up with Hinton's
theory of Contrastive Divergence. Basically, in order to know how to
change our brains, we have to watch and dream at the same time.
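For concreteness, a minimal CD-1 sketch in Hinton's standard restricted
Boltzmann machine setting (not anything specific to spiking neurons):
the `watch' term is a data-driven correlation, the `dream' term comes
from a one-step Gibbs reconstruction, and their difference approximates
the intractable gradient of the log partition function. The sizes, the
toy data, and the omission of biases are simplifications.

    # CD-1 for a binary RBM (biases omitted for brevity): the weight update is
    # <v h>_data ("watch") minus <v h>_reconstruction ("dream").
    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    n_vis, n_hid, lr = 20, 8, 0.1
    W = 0.01 * rng.normal(size=(n_vis, n_hid))
    v_data = (rng.random((100, n_vis)) < 0.3).astype(float)   # toy binary data

    for _ in range(200):
        p_h = sigmoid(v_data @ W)                     # positive ("watch") phase
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_v = sigmoid(h @ W.T)                        # one-step Gibbs "dream"
        p_h_recon = sigmoid(p_v @ W)
        W += lr * (v_data.T @ p_h - p_v.T @ p_h_recon) / len(v_data)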
That's for the feedforward density estimation theory, which of course
is wrong because the brain is in a loop with the world. Some thoughts
on this latter case will be presented (if there's time).
tbell@rni.org or tony@salk.edu, 650-321-8282 x238