Exploring System and Machine Learning Performance Interactions when Tuning Distributed Data Stream Applications

Odysseos, Lambros; Herodotou, Herodotos

doi:10.1109/ICDEW55742.2022.00008

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/29886

DC Field	Value	Language
dc.contributor.author	Odysseos, Lambros	-
dc.contributor.author	Herodotou, Herodotos	-
dc.date.accessioned	2023-07-17T10:05:06Z	-
dc.date.available	2023-07-17T10:05:06Z	-
dc.date.issued	2022-05-09	-
dc.identifier.citation	38th IEEE International Conference on Data Engineering Workshops, ICDEW 2022, Virtual, Kuala Lumpur, Malaysia, 9 - 11 May 2022	en_US
dc.identifier.isbn	9781665481045	-
dc.identifier.uri	https://hdl.handle.net/20.500.14279/29886	-
dc.description.abstract	Deploying machine learning (ML) applications over distributed stream processing engines (DSPEs) such as Apache Spark Streaming is a complex procedure that requires extensive tuning along two dimensions. First, DSPEs have a vast array of system configuration parameters (such as degree of parallelism, memory buffer sizes, etc.) that need to be optimized to achieve the desired levels of latency and/or throughput. Second, each ML model has its own set of hyper-parameters that need to be tuned as they significantly impact the overall prediction accuracy of the trained model. These two forms of tuning have been studied extensively in the literature but only in isolation from each other. This position paper identifies the necessity for a combined system and ML model tuning approach based on a thorough experimental study. In particular, experimental results have revealed unexpected and complex interactions between the choices of system configuration and hyper-parameters, and their impact on both application and model performance. These findings open up new research directions in the field of self-managing stream processing systems.	en_US
dc.language.iso	en	en_US
dc.relation.ispartof	Proceedings - 2022 IEEE 38th International Conference on Data Engineering Workshops, ICDEW 2022	en_US
dc.rights	© Elsevier B.V.	en_US
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	hyper-parameter tuning	en_US
dc.subject	machine learning	en_US
dc.subject	stream processing	en_US
dc.subject	system parameter tuning	en_US
dc.title	Exploring System and Machine Learning Performance Interactions when Tuning Distributed Data Stream Applications	en_US
dc.type	Conference Papers	en_US
dc.collaboration	Cyprus University of Technology	en_US
dc.subject.category	Electrical Engineering - Electronic Engineering - Information Engineering	en_US
dc.country	Cyprus	en_US
dc.subject.field	Engineering and Technology	en_US
dc.identifier.doi	10.1109/ICDEW55742.2022.00008	en_US
dc.identifier.scopus	2-s2.0-85134883801	-
dc.identifier.url	https://api.elsevier.com/content/abstract/scopus_id/85134883801	-
cut.common.academicyear	2021-2022	en_US
dc.identifier.spage	24	en_US
dc.identifier.epage	29	en_US
item.openairecristype	http://purl.org/coar/resource_type/c_c94f	-
item.grantfulltext	none	-
item.cerifentitytype	Publications	-
item.fulltext	No Fulltext	-
item.languageiso639-1	en	-
item.openairetype	conferenceObject	-
crisitem.author.dept	Department of Electrical Engineering, Computer Engineering and Informatics	-
crisitem.author.faculty	Faculty of Engineering and Technology	-
crisitem.author.orcid	0000-0002-8717-1691	-
crisitem.author.parentorg	Faculty of Engineering and Technology	-
Appears in Collections:	Conference papers or poster or presentation

Scopus citations: 1 (checked on Mar 14, 2024)
Page views: 103 (last week: 0; last month: 7; checked on May 21, 2024)

This item is licensed under a Creative Commons License
