URI | http://purl.tuc.gr/dl/dias/9FBD290C-DC9B-419A-9A80-E532CB9521E0 | - |
Identifier | http://dl.acm.org/citation.cfm?id=1989323.1989378 | - |
Identifier | https://doi.org/10.1145/1989323.1989378 | - |
Language | en | - |
Extent | 12 pages | en |
Title | Hybrid in-database inference for declarative information extraction | en |
Creator | Wang Daisy Zhe | en |
Creator | Franklin Michael J. | en |
Creator | Garofalakis Minos | en |
Creator | Γαροφαλακης Μινως | el |
Creator | Hellerstein, Joseph, 1952- | en |
Creator | Wick Michael L. | en |
Publisher | Association for Computing Machinery | en |
Content Summary | In the database community, work on information extraction (IE)
has centered on two themes: how to effectively manage IE tasks,
and how to manage the uncertainties that arise in the IE process
in a scalable manner. Recent work has proposed a probabilistic
database (PDB) based declarative IE system that supports a leading
statistical IE model, and an associated inference algorithm to
answer top-k-style queries over the probabilistic IE outcome. Still,
the broader problem of effectively supporting general probabilistic
inference inside a PDB-based declarative IE system remains
open. In this paper, we explore the in-database implementations of
a wide variety of inference algorithms suited to IE, including two
Markov chain Monte Carlo algorithms as well as the Viterbi and sum-product algorithms.
We describe the rules for choosing appropriate inference
algorithms based on the model, the query and the text, considering
the trade-off between accuracy and runtime. Based on these rules,
we describe a hybrid approach to optimize the execution of a single
probabilistic IE query to employ different inference algorithms
appropriate for different records. We show that our techniques can
achieve up to 10-fold speedups compared to the non-hybrid solutions
proposed in the literature. | en |
Type of Item | Πλήρης Δημοσίευση σε Συνέδριο | el |
Type of Item | Conference Full Paper | en |
License | http://creativecommons.org/licenses/by/4.0/ | en |
Date of Item | 2015-11-30 | - |
Date of Publication | 2011 | - |
Subject | Database management | en |
Subject | Mathematics of computing | en |
Bibliographic Citation | D. Z. Wang, M. J. Franklin, M. Garofalakis, J. M. Hellerstein and M. L. Wick, "Hybrid in-database inference for declarative information extraction", in ACM SIGMOD International Conference on Management of Data, 2011, pp. 517-528. doi: 10.1145/1989323.1989378 | en |