Institutional Repository [SANDBOX]
Technical University of Crete

Geometric monitoring of heterogeneous streams

Keren Daniel, Sagy Guy, Abboud Amir, Ben-David David, Schuster Assaf, Sharfman Izchak, Deligiannakis Antonios


URI: http://purl.tuc.gr/dl/dias/1130B174-AF1E-4AD0-9BD6-85220ED02BC3
Year: 2014
Type of Item: Peer-Reviewed Journal Publication
Bibliographic Citation: D. Keren, G. Sagy, A. Abboud, D. Ben-David, A. Schuster, I. Sharfman and A. Deligiannakis, "Geometric monitoring of heterogeneous streams," IEEE Trans. Knowl. Data Eng., vol. 26, no. 8, pp. 1890-1903, Aug. 2014. doi:10.1109/TKDE.2013.180 https://doi.org/10.1109/TKDE.2013.180

Summary

Interest in stream monitoring is shifting toward the distributed case. In many applications the data is high volume, dynamic, and distributed, making it infeasible to collect the distinct streams to a central node for processing. Often, the monitoring problem consists of determining whether the value of a global function, defined on the union of all streams, crossed a certain threshold. We wish to reduce communication by transforming the global monitoring to the testing of local constraints, checked independently at the nodes. Geometric monitoring (GM) proved useful for constructing such local constraints for general functions. Alas, in GM the constraints at all nodes share an identical structure and are thus unsuitable for handling heterogeneous streams. Therefore, we propose a general approach for monitoring heterogeneous streams (HGM), which defines constraints tailored to fit the data distributions at the nodes. While we prove that optimally selecting the constraints is NP-hard, we provide a practical solution, which reduces the running time by hierarchically clustering nodes with similar data distributions and then solving simpler optimization problems. We also present a method for efficiently recovering from local violations at the nodes. Experiments yield an improvement of over an order of magnitude in communication relative to GM.
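
The idea of replacing a global threshold test with independent local tests can be illustrated with a short sketch. The code below is not taken from the paper: it assumes the standard GM setting in which the global vector is the average of the nodes' local vectors, each node tracks a "drift vector" (the last synchronized estimate plus its own local change), and a node stays silent as long as the ball whose diameter runs from the estimate to its drift vector lies entirely on one side of the threshold. The monitored function (squared Euclidean norm), the function names, and the numbers are all hypothetical choices made for illustration. HGM, as the summary states, replaces such identically shaped constraints with ones tailored to each node's data distribution, which this sketch does not attempt.

```python
import math

def drift_vector(estimate, local_now, local_at_sync):
    """Last synchronized global estimate plus this node's local change."""
    return [e + (n - s) for e, n, s in zip(estimate, local_now, local_at_sync)]

def ball_max_sq_norm(estimate, drift):
    """Largest value of ||x||^2 over the ball whose diameter is the
    segment from `estimate` to `drift` (hypothetical monitored function)."""
    center = [(e + u) / 2 for e, u in zip(estimate, drift)]
    radius = math.dist(estimate, drift) / 2
    return (math.hypot(*center) + radius) ** 2

def local_constraint_holds(estimate, drift, threshold):
    """Local test: the whole ball must stay at or below the threshold.
    If it holds at every node, the global average also stays below it,
    so no communication is needed."""
    return ball_max_sq_norm(estimate, drift) <= threshold

# Illustrative values: estimate from the last sync, threshold on ||x||^2.
estimate = [1.0, 1.0]
threshold = 9.0

quiet_node = drift_vector(estimate, [1.5, 1.0], [1.0, 1.0])
noisy_node = drift_vector(estimate, [4.0, 1.0], [1.0, 1.0])

print(local_constraint_holds(estimate, quiet_node, threshold))
print(local_constraint_holds(estimate, noisy_node, threshold))
```

In this sketch the first node's ball stays well inside the threshold region, so it sends nothing, while the second node's ball crosses the threshold and would report a local violation that has to be resolved.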
