A Survey on Automatic Parameter Tuning for Big Data Processing Systems

Herodotou, Herodotos; Chen, Yuxing; Lu, Jiaheng

doi:10.1145/3381027

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/19324

DC Field	Value	Language
dc.contributor.author	Herodotou, Herodotos	-
dc.contributor.author	Chen, Yuxing	-
dc.contributor.author	Lu, Jiaheng	-
dc.date.accessioned	2020-11-05T08:55:05Z	-
dc.date.available	2020-11-05T08:55:05Z	-
dc.date.issued	2020-06	-
dc.identifier.citation	ACM Computing Surveys, 2020, vol. 53, no. 2, article no. 3381027	en_US
dc.identifier.issn	1557-7341	-
dc.identifier.uri	https://hdl.handle.net/20.500.14279/19324	-
dc.description.abstract	Big data processing systems (e.g., Hadoop, Spark, Storm) contain a vast number of configuration parameters controlling parallelism, I/O behavior, memory settings, and compression. Improper parameter settings can cause significant performance degradation and stability issues. However, regular users and even expert administrators grapple with understanding and tuning them to achieve good performance. We investigate existing approaches on parameter tuning for both batch and stream data processing systems and classify them into six categories: rule-based, cost modeling, simulation-based, experiment-driven, machine learning, and adaptive tuning. We summarize the pros and cons of each approach and raise some open research problems for automatic parameter tuning.	en_US
dc.format	pdf	en_US
dc.language.iso	en	en_US
dc.relation.ispartof	ACM Computing Surveys	en_US
dc.rights	© owner/author(s).	en_US
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	MapReduce	en_US
dc.subject	Parameter tuning	en_US
dc.subject	Self-tuning	en_US
dc.subject	Spark	en_US
dc.subject	Storm	en_US
dc.subject	Stream	en_US
dc.title	A Survey on Automatic Parameter Tuning for Big Data Processing Systems	en_US
dc.type	Article	en_US
dc.collaboration	Cyprus University of Technology	en_US
dc.collaboration	University of Helsinki	en_US
dc.subject.category	Computer and Information Sciences	en_US
dc.journals	Open Access	en_US
dc.country	Cyprus	en_US
dc.country	Finland	en_US
dc.subject.field	Natural Sciences	en_US
dc.publication	Peer Reviewed	en_US
dc.identifier.doi	10.1145/3381027	en_US
dc.relation.issue	2	en_US
dc.relation.volume	53	en_US
cut.common.academicyear	2019-2020	en_US
item.fulltext	With Fulltext	-
item.openairecristype	http://purl.org/coar/resource_type/c_6501	-
item.openairetype	article	-
item.grantfulltext	open	-
item.languageiso639-1	en	-
item.cerifentitytype	Publications	-
crisitem.journal.journalissn	1557-7341	-
crisitem.journal.publisher	Association for Computing Machinery	-
crisitem.author.dept	Department of Electrical Engineering, Computer Engineering and Informatics	-
crisitem.author.faculty	Faculty of Engineering and Technology	-
crisitem.author.orcid	0000-0002-8717-1691	-
crisitem.author.parentorg	Faculty of Engineering and Technology	-
Appears in Collections:	Άρθρα/Articles

Files in This Item:

File	Description	Size	Format
3381027.pdf	Fulltext	900.91 kB	Adobe PDF

This item is licensed under a Creative Commons License