Speed Group 
Microarray Page


Always log spot intensities and ratios

Why? Because it

  • Makes variation of intensities and ratios of intensities more independent of absolute magnitude.
  • Makes normalization additive.
  • Evens out highly skewed distributions.
  • Gives a more realistic sense of variation.
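As a rough illustration of the second point, the sketch below applies an assumed multiplicative bias to one channel and shows that, on the log scale, it becomes the same constant offset for every spot. The spot values, the bias factor k, and the helper name log_ratio are made up for illustration and are not taken from any data set.

```python
import math

def log_ratio(r, g):
    """Return log2(R/G) for positive intensities r and g."""
    return math.log2(r) - math.log2(g)

# Hypothetical (red, green) spot intensities; not real data.
spots = [(1200.0, 600.0), (5000.0, 5000.0), (300.0, 1200.0)]
k = 1.5  # assumed multiplicative bias in the red channel

raw = [log_ratio(r, g) for r, g in spots]
biased = [log_ratio(k * r, g) for r, g in spots]

# On the log scale the bias is the same constant, log2(k), for every spot,
# so it can be removed by a single subtraction.
shifts = [b - a for a, b in zip(raw, biased)]
corrected = [b - math.log2(k) for b in biased]
```

Every entry of shifts equals log2(k), whatever the spot's ratio, which is what "additive" means here.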
Comments:

Base-2 logs are the easiest for people to work with (intensities are typically numbers between 0 and 65535, i.e. 16-bit values, so their base-2 logs are at most 16).

In a context like this, a single standard deviation (SD) can be given the usual interpretation only if
  (a) the distribution is approximately normal, and
  (b) variation is independent of magnitude.
Neither of these is true for unlogged intensities or ratios.
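Condition (b) can be illustrated with a small simulation. This is only a sketch under an idealized assumption, namely that noise is purely multiplicative (lognormal); real microarray data need not behave this way. The intensity levels, the noise level sigma, and the function name simulate are all hypothetical.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def simulate(true_level, n=5000, sigma=0.3):
    """Hypothetical spots: a true level times lognormal (multiplicative) noise."""
    return [true_level * random.lognormvariate(0.0, sigma) for _ in range(n)]

low = simulate(100.0)     # assumed faint spots
high = simulate(10000.0)  # assumed bright spots

# Unlogged: the SD grows with the magnitude of the intensity.
sd_raw_low = statistics.stdev(low)
sd_raw_high = statistics.stdev(high)

# Logged (base 2): the SD is about the same at both levels.
sd_log_low = statistics.stdev([math.log2(x) for x in low])
sd_log_high = statistics.stdev([math.log2(x) for x in high])
```

Under this assumption the unlogged SDs differ by roughly the ratio of the two intensity levels, while the logged SDs nearly agree, so only on the log scale does one SD summarize the variation at both levels.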

Compare the histograms for unlogged and logged intensities and ratios (Figure 1).

Figure 1

Variation of log intensities is still not constant: in most microarray data it decreases noticeably as magnitude increases (Figure 2).

When we have two sets of numbers such as R and G varying over a large range, it is useful to compare log R with log G by plotting their difference, log(R/G), against their average, (1/2) log(R*G). Doing this, we might see something unexpected. By contrast, plotting R against G is typically much less revealing and can give a quite unrealistic sense of concordance (Figure 2).
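The two plotting coordinates described above can be computed as follows. This is a minimal sketch: the function name diff_and_average is hypothetical, base 2 is assumed, and both intensities are assumed to be positive.

```python
import math

def diff_and_average(r, g):
    """Return (log2(R/G), (1/2) * log2(R*G)) for positive intensities r and g."""
    log_r = math.log2(r)
    log_g = math.log2(g)
    difference = log_r - log_g        # log2(R/G)
    average = 0.5 * (log_r + log_g)   # (1/2) * log2(R*G)
    return difference, average

# Hypothetical spot with R = 1024 and G = 256.
d, a = diff_and_average(1024.0, 256.0)
```

Plotting the first value against the second for every spot gives the difference-versus-average display; any plotting tool can be used for that step.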

Figure 2

 
 

Further comments. As noted above, log ratios still have a spread that tends to decrease as average log intensity increases. This suggests that a different transformation of the ratio might achieve a more constant scatter (at least as a function of average log intensity). This is indeed the case: a ratio of 4th or 5th roots does seem to have a more constant scatter (about 1). However, we feel this is a small gain compared to the simplicity lost if logs are not used (adding/subtracting logs corresponds to multiplying/dividing). For example, normalization is no longer straightforward. We will stick with logs until a more compelling reason comes along.
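One way to see why normalization is less straightforward with roots is sketched below. It reads "a ratio of 4th roots" as (R/G)^(1/4), which is an interpretation, and it reuses made-up spot values and an assumed bias factor k. The same channel bias shifts every log ratio by one constant, but shifts each root ratio by a different amount.

```python
import math

def root_ratio(r, g, p=4):
    """Ratio of p-th roots, R**(1/p) / G**(1/p), written as (R/G)**(1/p)."""
    return (r / g) ** (1.0 / p)

# Hypothetical (red, green) spot intensities; not real data.
spots = [(1200.0, 600.0), (5000.0, 5000.0), (300.0, 1200.0)]
k = 1.5  # assumed multiplicative bias in the red channel

# Change caused by the bias on each scale, spot by spot.
log_shifts = [math.log2(k * r / g) - math.log2(r / g) for r, g in spots]
root_shifts = [root_ratio(k * r, g) - root_ratio(r, g) for r, g in spots]

log_spread = max(log_shifts) - min(log_shifts)     # essentially zero
root_spread = max(root_shifts) - min(root_shifts)  # clearly nonzero
```

On the root scale the bias is still a common multiplicative factor, k^(1/4), so it can be divided out, but it is not an additive offset that can be subtracted.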
 



last updated March 07, 2000
zarray@stat.berkeley.edu



Contact Terry Speed's microarray data analysis group