Institutional Repository [SANDBOX]
Technical University of Crete

Probabilistic declarative information extraction

Wang Daisy Zhe, Michelakis Eirinaios, Franklin Michael J., Garofalakis Minos, Hellerstein, Joseph, 1952-


URI: http://purl.tuc.gr/dl/dias/C8A10160-E770-48D3-8503-C457D55AADE8
Year: 2010
Type: Full conference paper
Bibliographic Citation: D. Z. Wang, E. Michelakis, M. J. Franklin, M. Garofalakis and J. M. Hellerstein, "Probabilistic declarative information extraction", in 26th IEEE International Conference on Data Engineering, 2010.
Abstract

Unstructured text represents a large fraction of the world's data. It often contains snippets of structured information (e.g., people's names and zip codes). Information Extraction (IE) techniques identify such structured information in text. In recent years, database research has pursued IE on two fronts: declarative languages and systems for managing IE tasks, and probabilistic databases for querying the output of IE. In this paper, we make the first step to merge these two directions, without loss of statistical robustness, by implementing a state-of-the-art statistical IE model, Conditional Random Fields (CRF), in the setting of a Probabilistic Database that treats statistical models as first-class data objects. We show that the Viterbi algorithm for CRF inference can be specified declaratively in recursive SQL. We also show the performance benefits relative to a standalone open-source Viterbi implementation. This work opens up the optimization opportunities for queries involving both inference and relational operators over IE models.
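
For readers unfamiliar with the Viterbi algorithm mentioned in the abstract, the following is a minimal sketch of the dynamic program for a linear-chain model. It is written in plain Python and is not the recursive SQL formulation from the paper. The label set, the `toy_emit` and `toy_trans` scoring functions, and the sample sentence are all hypothetical and hand-picked for illustration; a real CRF would derive these scores from trained feature weights.

```python
from typing import Callable, List, Sequence, Tuple


def viterbi(
    tokens: Sequence[str],
    labels: Sequence[str],
    emit: Callable[[str, str], float],
    trans: Callable[[str, str], float],
) -> Tuple[float, List[str]]:
    """Return (best_score, best_label_sequence) under additive scores."""
    if not tokens:
        return 0.0, []
    # score[y] = best score of any labeling of the tokens seen so far that ends in y
    score = {y: emit(tokens[0], y) for y in labels}
    back = []  # back[i][y] = best previous label when token i+1 is labeled y
    for tok in tokens[1:]:
        new_score, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: score[p] + trans(p, y))
            new_score[y] = score[prev] + trans(prev, y) + emit(tok, y)
            pointers[y] = prev
        score = new_score
        back.append(pointers)
    # Follow the back-pointers from the best final label to recover the path.
    last = max(labels, key=lambda y: score[y])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path


# Hypothetical hand-written scores (assumption: NOT learned CRF weights).
LABELS = ["NAME", "ZIP", "OTHER"]


def toy_emit(token: str, label: str) -> float:
    if token.isdigit():
        return 2.0 if label == "ZIP" else 0.0
    if token[0].isupper():
        return 2.0 if label == "NAME" else 0.0
    return 1.0 if label == "OTHER" else 0.0


def toy_trans(prev: str, cur: str) -> float:
    # Mildly discourage two consecutive ZIP labels.
    return -1.0 if prev == cur == "ZIP" else 0.0


best_score, best_path = viterbi(["Daisy", "lives", "in", "94720"], LABELS, toy_emit, toy_trans)
print(best_score, best_path)  # 6.0 ['NAME', 'OTHER', 'OTHER', 'ZIP']
```

Each step depends only on the best scores from the previous token position, which is the kind of recurrence a recursive query can express.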
