Ιδρυματικό Αποθετήριο [SANDBOX]
Πολυτεχνείο Κρήτης
EN  |  EL

Αναζήτηση

Πλοήγηση

Ο Χώρος μου

Sharing aggregate computation for distributed queries

Huebsch Ryan, Garofalakis Minos, Hellerstein, Joseph, 1952-, Stoica, Ion, 1965-

Απλή Εγγραφή


URIhttp://purl.tuc.gr/dl/dias/8167266C-005C-4C48-A9EB-DC0302CB7517-
Αναγνωριστικόhttp://www.cs.berkeley.edu/~istoica/papers/2007/ryan-sigmod07.pdf-
Αναγνωριστικόhttps://doi.org/10.1145/1247480.1247535-
Γλώσσαen-
Μέγεθος12 pagesen
ΤίτλοςSharing aggregate computation for distributed queriesen
ΔημιουργόςHuebsch Ryanen
ΔημιουργόςGarofalakis Minosen
ΔημιουργόςΓαροφαλακης Μινωςel
ΔημιουργόςHellerstein, Joseph, 1952-en
ΔημιουργόςStoica, Ion, 1965-en
ΕκδότηςAssociation for Computing Machineryen
ΠερίληψηAn emerging challenge in modern distributed querying is to effi- ciently process multiple continuous aggregation queries simultaneously. Processing each query independently may be infeasible, so multi-query optimizations are critical for sharing work across queries. The challenge is to identify overlapping computations that may not be obvious in the queries themselves. In this paper, we reveal new opportunities for sharing work in the context of distributed aggregation queries that vary in their selection predicates. We identify settings in which a large set of q such queries can be answered by executing k ≪ q different queries. The k queries are revealed by analyzing a boolean matrix capturing the connection between data and the queries that they satisfy, in a manner akin to familiar techniques like Gaussian elimination. Indeed, we identify a class of linear aggregate functions (including SUM, COUNT and AVERAGE), and show that the sharing potential for such queries can be optimally recovered using standard matrix decompositions from computational linear algebra. For some other typical aggregation functions (including MIN and MAX) we find that optimal sharing maps to the NP-hard set basis problem. However, for those scenarios, we present a family of heuristic algorithms and demonstrate that they perform well for moderate-sized matrices. We also present a dynamic distributed system architecture to exploit sharing opportunities, and experimentally evaluate the benefits of our techniques via a novel, flexible random workload generator we develop for this setting.en
ΤύποςΠλήρης Δημοσίευση σε Συνέδριοel
ΤύποςConference Full Paperen
Άδεια Χρήσηςhttp://creativecommons.org/licenses/by/4.0/en
Ημερομηνία2015-11-30-
Ημερομηνία Δημοσίευσης2007-
Θεματική ΚατηγορίαAlgorithmsen
Θεματική ΚατηγορίαDesignen
Θεματική ΚατηγορίαMeasurementen
Βιβλιογραφική ΑναφοράR. Huebsch, M. Garofalakis, J. M. Hellerstein and I. Stoica, "Sharing aggregate computation for distributed queries", in ACM SIGMOD International Conference on Management of Data, 2007. doi: 10.1145/1247480.1247535en

Υπηρεσίες

Στατιστικά