Optimal histograms for limiting worst-case error propagation in the size of join results.

Christodoulakis Stavros, Yiannis E. Ioannidis

Πλήρης Εγγραφή

URI:

http://purl.tuc.gr/dl/dias/A0E77513-A9A3-47D9-9553-601A84B1B522

Έτος

1993

Τύπος

Δημοσίευση σε Περιοδικό με Κριτές

Άδεια Χρήσης

Λεπτομέρειες

Βιβλιογραφική Αναφορά

Y. Ioannidis , S. Christodoulakis , " Optimal histograms for limiting worst-case error propagation in the size of join results",ACM Trans. on Datab. Syst. , vol. 18 ,no. 4 ,pp. 709-748,1993 .doi : 10.1145/169725.169708 https://doi.org/10.1145/169725.169708

Εμφανίζεται στις Συλλογές

Δημοσιεύσεις σε Περιοδικά στην Κοινότητα Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών

Δημοσιεύσεις σε Περιοδικά στην Κοινότητα Εργαστήριο Διανεμημένων Πληροφοριακών Συστημάτων και Εφαρμογών

Περίληψη

Many current relational database systems use some form of histograms to approximate the frequency distribution of values in the attributes of relations and on this basis estimate query result sizes and access plan costs. The errors that exist in the histogram approximations directly or transitively affect many estimates derived by the database system. We identify the class of serial histograms and its subclass of end-btased histograms; the latter is of particular interest because such histograms are used in several database systems. We concentrate on equality join queries without function symbols where each relation is joined on the same attribute(s) for all joins in which it participates. Join queries of this restricted type are called t-cllque queries. We show that the optimal histogram for reducing the worst-case error in the result size of such a query is always serial. For queries with one join and no function symbols (all of which are vacuously t-clique queries), we present results on finding the optimal serial histogram and the optimal end-biased histogram based on the query characteristics and the frequency distributions of values in the join attributes of the query relations. Finally, we prove that for t-clique queries with a very large number of joins, h~gh-bzased h zstograms (which form a subclass of end-biased histograms) are always optimal. To construct a histogram for the join attribute(s) of a relation, the values in the attribute(s) must first be sorted based on their frequency and then assigned into buckets according to the optimality results above.

Αναζήτηση

Πλοήγηση

Ο Χώρος μου

Optimal histograms for limiting worst-case error propagation in the size of join results.

Christodoulakis Stavros, Yiannis E. Ioannidis

Περίληψη

Διαθέσιμα αρχεία

Υπηρεσίες

Εξαγωγή

Κοινοποίηση

Στατιστικά

Μεταδεδομένων & Περιεχομένου σε METS:

Μεταδεδομένων σε Μορφότυπο: