Exploring System and Machine Learning Performance Interactions when Tuning Distributed Data Stream Applications

Odysseos, Lambros; Herodotou, Herodotos

doi:10.1109/ICDEW55742.2022.00008

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/29886

Title:	Exploring System and Machine Learning Performance Interactions when Tuning Distributed Data Stream Applications
Authors:	Odysseos, Lambros; Herodotou, Herodotos
Major Field of Science:	Engineering and Technology
Field Category:	Electrical Engineering - Electronic Engineering - Information Engineering
Keywords:	hyper-parameter tuning;machine learning;stream processing;system parameter tuning
Issue Date:	9-May-2022
Source:	38th IEEE International Conference on Data Engineering Workshops, ICDEW 2022, Virtual, Kuala Lumpur, Malaysia, 9-11 May 2022
Start page:	24
End page:	29
Journal:	Proceedings - 2022 IEEE 38th International Conference on Data Engineering Workshops, ICDEW 2022
Abstract:	Deploying machine learning (ML) applications over distributed stream processing engines (DSPEs) such as Apache Spark Streaming is a complex procedure that requires extensive tuning along two dimensions. First, DSPEs have a vast array of system configuration parameters (such as degree of parallelism, memory buffer sizes, etc.) that need to be optimized to achieve the desired levels of latency and/or throughput. Second, each ML model has its own set of hyper-parameters that need to be tuned as they significantly impact the overall prediction accuracy of the trained model. These two forms of tuning have been studied extensively in the literature but only in isolation from each other. This position paper identifies the necessity for a combined system and ML model tuning approach based on a thorough experimental study. In particular, experimental results have revealed unexpected and complex interactions between the choices of system configuration and hyper-parameters, and their impact on both application and model performance. These findings open up new research directions in the field of self-managing stream processing systems.
URI:	https://hdl.handle.net/20.500.14279/29886
ISBN:	9781665481045
DOI:	10.1109/ICDEW55742.2022.00008
Rights:	© Elsevier B.V. Attribution-NonCommercial-NoDerivatives 4.0 International
Type:	Conference Papers
Affiliation:	Cyprus University of Technology
Appears in Collections:	Conference papers or poster or presentation

This item is licensed under a Creative Commons License
