The Neyman Seminar: 1011 Evans, 4:10-5:00 pm Wednesday, September 3, 2003

Weighing Evidence in the Absence of a Gold Standard

Phil Long

Genome Institute of Singapore

Abstract

Many propositions in biology can be supported by several different sorts of evidence. It is often useful to collect large numbers of such propositions, together with the evidence supporting them, into databases for use in other analyses. It then becomes necessary to decide automatically which propositions are supported well enough to be included, which can involve weighing evidence of varying strength. In some important cases, there is effectively no definitive source of evidence, no "gold standard," against which the others can be evaluated. Examples of problems with this property are (1) pairing up equivalent genes between species, and (2) predicting which pairs of proteins interact.
In this talk, I will describe research into methodology for problems of this sort. I will present a method derived analytically from a probabilistic model of the problem, along with an approximation needed to make it practical. I will then report simulation results on artificial data and an evaluation of the method through its application to mapping orthologs.
(This is joint work with K.R.K. Murthy, Vinsensius Vega, Nir Friedman, and Edison Liu.)