Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/13985
Title: Stubby: a transformation-based optimizer for MapReduce workflows
Authors: Herodotou, Herodotos; Babu, Shivnath; Lim, Harold
Major Field of Science: Engineering and Technology
Field Category: Electrical Engineering - Electronic Engineering - Information Engineering
Keywords: Cost-based optimization;Desirable features;Execution plans;Large datasets;Optimizers;Search Algorithms;Space-based;Transformation based
Issue Date: Jul-2012
Source: Proceedings of the VLDB Endowment, 2012, vol. 5, no. 11, pp. 1196-1207
Volume: 5
Issue: 11
Start page: 1196
End page: 1207
Journal: Proceedings of the VLDB Endowment
Abstract: There is a growing trend of performing analysis on large datasets using workflows composed of MapReduce jobs connected through producer-consumer relationships based on data. This trend has spurred the development of a number of interfaces, ranging from program-based to query-based interfaces, for generating MapReduce workflows. Studies have shown that the gap in performance can be quite large between optimized and unoptimized workflows. However, automatic cost-based optimization of MapReduce workflows remains a challenge due to the multitude of interfaces, large size of the execution plan space, and the frequent unavailability of all types of information needed for optimization. We introduce a comprehensive plan space for MapReduce workflows generated by popular workflow generators. We then propose Stubby, a cost-based optimizer that searches selectively through the subspace of the full plan space that can be enumerated correctly and costed based on the information available in any given setting. Stubby enumerates the plan space based on plan-to-plan transformations and an efficient search algorithm. Stubby is designed to be extensible to new interfaces and new types of optimizations, which is a desirable feature given how rapidly MapReduce systems are evolving. Stubby's efficiency and effectiveness have been evaluated using representative workflows from many domains. © 2012 VLDB Endowment.
ISSN: 2150-8097
DOI: 10.14778/2350229.2350239
Rights: © VLDB Endowment
Type: Article
Affiliation: Duke University
Publication Type: Peer Reviewed
Appears in Collections: Άρθρα/Articles

Items in KTISIS are protected by copyright, with all rights reserved, unless otherwise indicated.