Weighing Evidence in the Absence of a Gold Standard
Phil Long
Genome Institute of Singapore
Abstract
Many propositions in biology can be supported by several different
kinds of evidence. It is often useful to collect large numbers of such
propositions, along with the evidence supporting them, into databases
for use in other analyses. Building such a database requires deciding
automatically which propositions are supported well enough to be
included, which can involve weighing evidence of varying strength. In
some important cases, there is effectively no definitive source of
evidence, no "gold standard," that can be used for evaluating the
others. Two examples of such problems are (1) pairing up equivalent
genes (orthologs) between species and (2) predicting which pairs of
proteins interact.
In this talk, I will describe research into methodology for problems of
this sort. I will present a method derived analytically from a
probabilistic model of the problem, along with an approximation needed
to make it practical. I will then report simulation results on
artificial data and an evaluation of the method through its application
to mapping orthologs.
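As a rough illustration of what weighing evidence without a gold
standard can look like, the sketch below fits a generic latent-class
model by EM. It is not the method presented in the talk, which this
abstract does not specify. It assumes that each proposition has a
hidden true/false label, that each evidence source gives a binary
vote, and that sources are conditionally independent given the label.
The function names, the three sources, and their error rates are all
hypothetical and are used only to generate artificial data.

```python
import math
import random


def _clip(p, eps=1e-6):
    """Keep a probability strictly inside (0, 1) so logs stay finite."""
    return min(max(p, eps), 1.0 - eps)


def em_latent_truth(X, n_iter=50):
    """Estimate P(proposition is true) from binary evidence, no labels.

    X[i][j] is 1 if evidence source j supports proposition i, else 0.
    Returns (posteriors, prior, sensitivities, false_positive_rates).
    Assumption: sources are conditionally independent given the truth.
    """
    n, m = len(X), len(X[0])
    # Initialise with the fraction of sources voting "yes" (soft majority).
    q = [sum(row) / m for row in X]
    prior, sens, fpr = 0.5, [0.5] * m, [0.5] * m
    for _ in range(n_iter):
        # M-step: re-estimate the prior and each source's reliability.
        total = sum(q)
        prior = _clip(total / n)
        sens, fpr = [], []
        for j in range(m):
            s = sum(q[i] * X[i][j] for i in range(n)) / max(total, 1e-9)
            f = sum((1 - q[i]) * X[i][j] for i in range(n)) / max(n - total, 1e-9)
            sens.append(_clip(s))
            fpr.append(_clip(f))
        # E-step: posterior that each proposition is true, in log space.
        for i in range(n):
            log1, log0 = math.log(prior), math.log(1 - prior)
            for j in range(m):
                if X[i][j]:
                    log1 += math.log(sens[j])
                    log0 += math.log(fpr[j])
                else:
                    log1 += math.log(1 - sens[j])
                    log0 += math.log(1 - fpr[j])
            q[i] = 1.0 / (1.0 + math.exp(log0 - log1))
    return q, prior, sens, fpr


def simulate(n, seed=0):
    """Artificial data from three hypothetical sources of varying strength."""
    rng = random.Random(seed)
    true_sens = [0.9, 0.8, 0.7]   # assumed P(vote = 1 | true)
    true_fpr = [0.05, 0.1, 0.2]   # assumed P(vote = 1 | false)
    X, truth = [], []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        rates = true_sens if z else true_fpr
        X.append([1 if rng.random() < r else 0 for r in rates])
        truth.append(z)
    return X, truth
```

Calling `em_latent_truth(simulate(2000)[0])` returns a posterior for each
proposition together with estimated reliabilities for each source. The
hidden labels from `simulate` are used only to check the result
afterwards, never during fitting.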
(This is joint work with K.R.K. Murthy, Vinsensius Vega, Nir Friedman,
and Edison Liu.)