Ιδρυματικό Αποθετήριο [SANDBOX]
Πολυτεχνείο Κρήτης
EN  |  EL

Αναζήτηση

Πλοήγηση

Ο Χώρος μου

Holistic aggregates in a networked world: distributed tracking of approximate quantiles

Cormode, Graham, 1977-, Muthukrishnan, S, Garofalakis Minos, Rastogi Rajeev

Πλήρης Εγγραφή


URI: http://purl.tuc.gr/dl/dias/494394BF-9D81-47F8-AF0F-EAA0B9094476
Έτος 2005
Τύπος Πλήρης Δημοσίευση σε Συνέδριο
Άδεια Χρήσης
Λεπτομέρειες
Βιβλιογραφική Αναφορά G. Cormode, M. Garofalakis, S. Muthukrishnan and R. Rastogi, "Holistic aggregates in a networked world: distributed tracking of approximate quantiles", in ACM SIGMOD International Conference on Management of Data, June 2005, pp. 25-36.
Εμφανίζεται στις Συλλογές

Περίληψη

While traditional database systems optimize for performance onone-shot queries, emerging large-scale monitoring applications requirecontinuous tracking of complex aggregates and data-distributionsummaries over collections of physically-distributed streams.Thus, effective solutions have to be simultaneously space efficient(at each remote site), communication efficient (across the underlyingcommunication network), and provide continuous, guaranteedqualityestimates. In this paper, we propose novel algorithmic solutionsfor the problem of continuously tracking complex holistic aggregatesin such a distributed-streams setting — our primary focusis on approximate quantile summaries, but our approach is morebroadly applicable and can handle other holistic-aggregate functions(e.g., “heavy-hitters” queries). We present the first knowndistributed-tracking schemes for maintaining accurate quantile estimateswith provable approximation guarantees, while simultaneouslyoptimizing the storage space at each remote site as well asthe communication cost across the network. In a nutshell, our algorithmsemploy a combination of local tracking at remote sites andsimple prediction models for local site behavior in order to producehighly communication- and space-efficient solutions. We performextensive experiments with real and synthetic data to explore thevarious tradeoffs and understand the role of prediction models inour schemes. The results clearly validate our approach, revealingsignificant savings over naive solutions as well as our analyticalworst-case guarantees.

Υπηρεσίες

Στατιστικά