Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/13985
DC Field | Value | Language
dc.contributor.author | Herodotou, Herodotos | -
dc.contributor.author | Babu, Shivnath | -
dc.contributor.author | Lim, Harold | -
dc.date.accessioned | 2019-05-31T10:40:19Z | -
dc.date.available | 2019-05-31T10:40:19Z | -
dc.date.issued | 2012-07 | -
dc.identifier.citation | Journal Proceedings of the VLDB Endowment, 2012, vol.5, no.11, pp. 1196-1207 | en_US
dc.identifier.issn | 21508097 | -
dc.description.abstract | There is a growing trend of performing analysis on large datasets using workflows composed of MapReduce jobs connected through producer-consumer relationships based on data. This trend has spurred the development of a number of interfaces (ranging from program-based to query-based interfaces) for generating MapReduce workflows. Studies have shown that the gap in performance can be quite large between optimized and unoptimized workflows. However, automatic cost-based optimization of MapReduce workflows remains a challenge due to the multitude of interfaces, large size of the execution plan space, and the frequent unavailability of all types of information needed for optimization. We introduce a comprehensive plan space for MapReduce workflows generated by popular workflow generators. We then propose Stubby, a cost-based optimizer that searches selectively through the subspace of the full plan space that can be enumerated correctly and costed based on the information available in any given setting. Stubby enumerates the plan space based on plan-to-plan transformations and an efficient search algorithm. Stubby is designed to be extensible to new interfaces and new types of optimizations, which is a desirable feature given how rapidly MapReduce systems are evolving. Stubby's efficiency and effectiveness have been evaluated using representative workflows from many domains. © 2012 VLDB Endowment. | en_US
dc.format | pdf | en_US
dc.language.iso | en | en_US
dc.relation.ispartof | Journal Proceedings of the VLDB Endowment | en_US
dc.rights | © VLDB Endowment | en_US
dc.subject | Cost-based optimization | en_US
dc.subject | Desirable features | en_US
dc.subject | Execution plans | en_US
dc.subject | Large datasets | en_US
dc.subject | Optimizers | en_US
dc.subject | Search Algorithms | en_US
dc.subject | Space-based | en_US
dc.subject | Transformation based | en_US
dc.title | Stubby: a transformation-based optimizer for MapReduce workflows | en_US
dc.type | Article | en_US
dc.collaboration | Duke University | en_US
dc.subject.category | Electrical Engineering - Electronic Engineering - Information Engineering | en_US
dc.country | United States | en_US
dc.subject.field | Engineering and Technology | en_US
dc.publication | Peer Reviewed | en_US
dc.identifier.doi | 10.14778/2350229.2350239 | en_US
dc.identifier.scopus | 2-s2.0-84873155673 | en
dc.identifier.url | https://api.elsevier.com/content/abstract/scopus_id/84873155673 | en
dc.contributor.orcid | #NODATA# | en
dc.contributor.orcid | #NODATA# | en
dc.contributor.orcid | #NODATA# | en
dc.relation.issue | 11 | en_US
dc.relation.volume | 5 | en_US
cut.common.academicyear | 2011-2012 | en_US
dc.identifier.spage | 1196 | en_US
dc.identifier.epage | 1207 | en_US
item.languageiso639-1 | en | -
item.openairecristype | http://purl.org/coar/resource_type/c_6501 | -
item.fulltext | No Fulltext | -
item.grantfulltext | none | -
item.openairetype | article | -
item.cerifentitytype | Publications | -
crisitem.author.dept | Department of Electrical Engineering, Computer Engineering and Informatics | -
crisitem.author.faculty | Faculty of Engineering and Technology | -
crisitem.author.orcid | 0000-0002-8717-1691 | -
crisitem.author.parentorg | Faculty of Engineering and Technology | -
Appears in Collections: Άρθρα/Articles

Scopus citations: 61 (checked on Mar 14, 2024)

Page views: 335 (last week: 2; last month: 6; checked on Jul 25, 2024)

Items in KTISIS are protected by copyright, with all rights reserved, unless otherwise indicated.