Scientific Workflows: Towards a New Synthesis for Information Integration

Bertram Ludaescher

We argue that the notion of scientific information integration (SII) should be broadened to include the emerging paradigm of scientific workflows. To the domain scientist, SII often means an "end-to-end" process consisting of data acquisition, transformation, analysis, visualization, and other steps. SII therefore requires a new paradigm that combines a data-oriented approach with a suitable process-oriented workflow modeling approach. Our experience in several collaborative projects indicates that scientists have a real need for this extended kind of information integration. To this end, we introduce concepts such as models of computation and provenance, actor- and flow-oriented programming, higher-order components, adapters, and hybrid types. We then outline a particular combination of these, yielding a promising new synthesis of process and data modeling called Collection-Oriented Modeling and Design.