<efrbr:recordSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:efrbr="http://vfrbr.info/efrbr/1.1" xmlns:efrbr-work="http://vfrbr.info/efrbr/1.1/work" xmlns:efrbr-expression="http://vfrbr.info/efrbr/1.1/expression" xmlns:efrbr-manifestation="http://vfrbr.info/efrbr/1.1/manifestation" xmlns:efrbr-person="http://vfrbr.info/efrbr/1.1/person" xmlns:efrbr-corporateBody="http://vfrbr.info/efrbr/1.1/corporateBody" xmlns:efrbr-concept="http://vfrbr.info/efrbr/1.1/concept" xmlns:efrbr-structure="http://vfrbr.info/efrbr/1.1/structure" xmlns:efrbr-responsible="http://vfrbr.info/efrbr/1.1/responsible" xmlns:efrbr-subject="http://vfrbr.info/efrbr/1.1/subject" xmlns:efrbr-other="http://vfrbr.info/efrbr/1.1/other" xsi:schemaLocation="http://vfrbr.info/efrbr/1.1 http://vfrbr.info/schemas/1.1/efrbr.xsd"><efrbr:entities><efrbr-work:work identifier="http://purl.tuc.gr/dl/dias/5507F55A-78BD-4114-A824-3CB697C32ED1"><efrbr-work:titleOfTheWork>SPARTAN: A model-based semantic compression system for massive data tables</efrbr-work:titleOfTheWork></efrbr-work:work><efrbr-expression:expression identifier="http://purl.tuc.gr/dl/dias/5507F55A-78BD-4114-A824-3CB697C32ED1"><efrbr-expression:titleOfTheExpression>SPARTAN: A model-based semantic compression system for massive data tables</efrbr-expression:titleOfTheExpression><efrbr-expression:formOfExpression vocabulary="DIAS:TYPES">
            Πλήρης Δημοσίευση σε Συνέδριο
            Conference Full Paper
         </efrbr-expression:formOfExpression><efrbr-expression:dateOfExpression type="issued">2015-12-01</efrbr-expression:dateOfExpression><efrbr-expression:dateOfExpression type="published">2001</efrbr-expression:dateOfExpression><efrbr-expression:languageOfExpression vocabulary="iso639-1">en</efrbr-expression:languageOfExpression><efrbr-expression:summarizationOfContent>While a variety of lossy compression schemes have been developed for certain
forms of digital data (e.g., images, audio, video), the area of lossy
compression techniques for arbitrary data tables has been left relatively unexplored.
Nevertheless, such techniques are clearly motivated by the ever-increasing
data collection rates of modern enterprises and the need for effective,
guaranteed-quality approximate answers to queries over massive
relational data sets. In this paper, we propose SPARTAN, a system that
takes advantage of attribute semantics and data-mining models to perform
lossy compression of massive data tables. SPARTAN is based on the
novel idea of exploiting predictive data correlations and prescribed error
tolerances for individual attributes to construct concise and accurate Classification
and Regression Tree (CaRT) models for entire columns of a table.
More precisely, SPARTAN selects a certain subset of attributes for
which no values are explicitly stored in the compressed table; instead, concise
CaRTs that predict these values (within the prescribed error bounds)
are maintained. To restrict the huge search space and construction cost
of possible CaRT predictors, SPARTAN employs sophisticated learning
techniques and novel combinatorial optimization algorithms. Our experimentation
with several real-life data sets offers convincing evidence of the
effectiveness of SPARTAN’s model-based approach – SPARTAN is
able to consistently yield substantially better compression ratios than existing
semantic or syntactic compression tools (e.g., gzip) while utilizing only
small data samples for model inference.</efrbr-expression:summarizationOfContent><efrbr-expression:useRestrictionsOnTheExpression type="creative-commons">http://creativecommons.org/licenses/by/4.0/</efrbr-expression:useRestrictionsOnTheExpression><efrbr-expression:note type="page range">283-294</efrbr-expression:note><efrbr-expression:note type="conference name">International Conference on Management of Data (SIGMOD 2001)</efrbr-expression:note><efrbr-expression:note type="proceedings title">Proceedings of ACM SIGMOD'2001</efrbr-expression:note></efrbr-expression:expression><efrbr-person:person identifier="1FC2784A-9F5B-440F-B197-AD27EB09E5D4"><efrbr-person:nameOfPerson vocabulary="">
            Babu Shivnath
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-person:person identifier="http://users.isc.tuc.gr/~mgarofalakis"><efrbr-person:nameOfPerson vocabulary="TUC:LDAP">
            Garofalakis Minos
            Γαροφαλακης Μινως
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-person:person identifier="8D70FCDA-5C54-4CA8-A3F4-A6973CF0E8EF"><efrbr-person:nameOfPerson vocabulary="">
            Rastogi Rajeev
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-corporateBody:corporateBody identifier="http://www.acm.org/"><efrbr-corporateBody:nameOfTheCorporateBody vocabulary="S/R:PUBLISHERS">
            Association for Computing Machinery
         </efrbr-corporateBody:nameOfTheCorporateBody></efrbr-corporateBody:corporateBody><efrbr-concept:concept identifier="A7E74B8E-EE45-4BFA-A649-E0C4262C290B"><efrbr-concept:termForTheConcept>
            Data management
         </efrbr-concept:termForTheConcept></efrbr-concept:concept><efrbr-concept:concept identifier="117C38FE-B8E4-4D05-A2EB-D25C78941465"><efrbr-concept:termForTheConcept>
            Data mining
         </efrbr-concept:termForTheConcept></efrbr-concept:concept></efrbr:entities><efrbr:relationships><efrbr-structure:structureRelations><efrbr-structure:realizedThrough sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/5507F55A-78BD-4114-A824-3CB697C32ED1" targetEntity="expression" targetURI="http://purl.tuc.gr/dl/dias/5507F55A-78BD-4114-A824-3CB697C32ED1"/></efrbr-structure:structureRelations><efrbr-responsible:responsibleRelations><efrbr-responsible:createdBy sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/5507F55A-78BD-4114-A824-3CB697C32ED1" targetEntity="person" targetURI="1FC2784A-9F5B-440F-B197-AD27EB09E5D4"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/5507F55A-78BD-4114-A824-3CB697C32ED1" targetEntity="person" targetURI="1FC2784A-9F5B-440F-B197-AD27EB09E5D4" role="author"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/5507F55A-78BD-4114-A824-3CB697C32ED1" targetEntity="person" targetURI="http://users.isc.tuc.gr/~mgarofalakis" role="author"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/5507F55A-78BD-4114-A824-3CB697C32ED1" targetEntity="person" targetURI="8D70FCDA-5C54-4CA8-A3F4-A6973CF0E8EF" role="author"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/5507F55A-78BD-4114-A824-3CB697C32ED1" targetEntity="corporateBody" targetURI="http://www.acm.org/" role="publisher"/></efrbr-responsible:responsibleRelations><efrbr-subject:subjectRelations><efrbr-subject:hasSubject sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/5507F55A-78BD-4114-A824-3CB697C32ED1" targetEntity="concept" targetURI="A7E74B8E-EE45-4BFA-A649-E0C4262C290B"/><efrbr-subject:hasSubject sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/5507F55A-78BD-4114-A824-3CB697C32ED1" targetEntity="concept" 
targetURI="117C38FE-B8E4-4D05-A2EB-D25C78941465"/></efrbr-subject:subjectRelations><efrbr-other:otherRelations/></efrbr:relationships></efrbr:recordSet>