Querying probabilistic information extraction

Wang Daisy Zhe, Franklin Michael J., Garofalakis Minos, Hellerstein, Joseph, 1952-

Full Record

URI:

http://purl.tuc.gr/dl/dias/9AC9C1B2-435D-4E49-971E-4D808FF2D7C6

Year

2010

Type

Conference Full Paper

License

Details

Bibliographic Citation

D. Z. Wang, M. J. Franklin, M. Garofalakis and J. M. Hellerstein, "Querying probabilistic information extraction", in 36th International Conference on Very Large Data Bases, 2010. https://doi.org/10.14778/1920841.1920974

Appears in Collections

Conference Publications in the Community School of Electrical and Computer Engineering

Conference Publications in the Community Software Technology and Network Applications Laboratory

Abstract

Recently, there has been increasing interest in extending relational query processing to include data obtained from unstructured sources. A common approach is to use stand-alone Information Extraction (IE) techniques to identify and label entities within blocks of text; the resulting entities are then imported into a standard database and processed using relational queries. This two-part approach, however, suffers from two main drawbacks. First, IE is inherently probabilistic, but traditional query processing does not properly handle probabilistic data, resulting in reduced answer quality. Second, performance inefficiencies arise due to the separation of IE from query processing. In this paper, we address these two problems by building on an in-database implementation of a leading IE model: Conditional Random Fields using the Viterbi inference algorithm. We develop two different query approaches on top of this implementation. The first uses deterministic queries over maximum-likelihood extractions, with optimizations to push the relational operators into the Viterbi algorithm. The second extends the Viterbi algorithm to produce a set of possible extraction “worlds”, from which we compute top-k probabilistic query answers. We describe these approaches and explore the trade-offs of efficiency and effectiveness between them using two datasets.
