Software Demo Session

Monday, August 16th


HDBStat! : An integrated software suite for statistical analysis of high dimensional biology data
David Allison and Prinal Trivedi
University of Alabama at Birmingham

HDBStat!, which stands for High-Dimensional Biology-Statistics, is a software package designed for the analysis of high dimensional biology data. HDBStat! was initially developed for the analysis of microarray gene expression data, but it can also be used for some applications in proteomics and other areas of genomics. HDBStat! allows researchers to analyze complex microarray data using data preprocessing methods such as Quantile-Quantile normalization, Chip-Mean normalization, Linear Model Normalization, and various log transformations; quality control methods such as Deleted Residuals; hypothesis testing methods such as Equal and Unequal Variance t-tests, Bootstrap, and Chebby Checker; multiplicity control methods such as Bonferroni, Sidak, False Discovery Rate, and Mix-o-matic; and estimation methods such as Empirical Bayes Estimates.  These methods take into account non-normal data and small sample sizes.  Other features of HDBStat! include a platform-independent Java implementation and a flexible, easy-to-use interface.  This software is freely available to academic institutions and non-profit organizations at http://www.soph.uab.edu/ssg_content.asp?id=1164.
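
As a rough illustration of what the multiplicity control methods named above do, the following sketch adjusts a list of per-gene p-values with the Bonferroni and Sidak corrections and with a false discovery rate procedure. It is not HDBStat! code: the function names are hypothetical, the Sidak version is the single-step form, and the false discovery rate step assumes the Benjamini-Hochberg procedure, which may differ from what the package implements.

```python
def bonferroni(pvals):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def sidak(pvals):
    """Single-step Sidak adjustment: 1 - (1 - p)^m."""
    m = len(pvals)
    return [1.0 - (1.0 - p) ** m for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up p-values for four hypothetical genes.
    pvals = [0.01, 0.04, 0.03, 0.005]
    print("Bonferroni:", bonferroni(pvals))
    print("Sidak:     ", sidak(pvals))
    print("BH (FDR):  ", benjamini_hochberg(pvals))
```

With thousands of genes tested at once, the Bonferroni and Sidak adjustments control the chance of any false positive and become very conservative, while a false discovery rate procedure controls the expected proportion of false positives among the genes declared significant.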


The Statistical Reality Engine: A Collaborative Environment for Data Analysis
Richard Boyce and A.J. Rossini

Virtual reality has underperformed its promise for data analysis.  We present a prototype of a system based on commodity tools, systems, and hardware.  This prototype, while incomplete, has been used for needs assessment for our second-generation data analysis environment.  It can be easily deployed on a single machine, in a cluster of machines, or in a fully immersive VR environment, and it employs both command-line functionality and low-cost I/O devices, including joysticks, gamepads, and "gaming gloves", for interfacing.  While the system fails to provide some critical functionality, it is usable as a simple extension of the RGL OpenGL package for R.  One of the target applications is collaborative data analysis in the bioinformatics domain.


The TM4 System for DNA Microarray Analysis
John Quackenbush
The Institute for Genomic Research

TM4 is a comprehensive, open-source, platform-independent system for collecting, managing, and analyzing microarray data.  The system consists of four primary applications as well as a set of ancillary and other derived tools. MADAM is a Java-based data entry and management system with an intuitive graphical user interface built on top of a MIAME-compliant MySQL database. MADAM is the first freely available database system capable of exporting data in MAGE-ML. Spotfinder is an image processing tool written in platform-independent C/C++ that provides a range of quality control assessments for microarray images. MIDAS is a data normalization and filtering tool designed with a graphical scripting interface that allows the creation of a complex analysis pipeline. MeV is a data mining and statistical analysis tool that provides access to more than twenty sophisticated algorithms and allows users to interact with and compare the results from a range of individual analyses. All software is provided with source code under the Artistic License with a well-defined API, and many groups have contributed analysis modules or used modules to construct still other tools. The goal of this demo will be to highlight the capabilities of the various tools, with a focus on MeV.