Sketching Techniques for Tracking Distributed Data Streams
Minos Garofalakis
There is growing interest in algorithms for querying and analyzing
continuous data streams (that is, data that is seen only once and in a
fixed order) with limited memory and CPU-time resources. Such streams
arise naturally in emerging large-scale event monitoring applications;
for instance, network-operations monitoring in large ISPs, where usage
information from numerous sites needs to be continuously collected and
analyzed for interesting trends. In addition to memory- and
time-efficiency concerns, the inherently distributed nature of such
applications also raises important communication-efficiency issues,
making it critical to carefully optimize the use of the underlying
network infrastructure.
This talk will provide a short overview of key small-space pseudo-random
sketching techniques for effective query processing over streaming data,
focusing on some of my recent work on communication-efficient continuous
query tracking over distributed data streams.