Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/29886
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Odysseos, Lambros | - |
| dc.contributor.author | Herodotou, Herodotos | - |
| dc.date.accessioned | 2023-07-17T10:05:06Z | - |
| dc.date.available | 2023-07-17T10:05:06Z | - |
| dc.date.issued | 2022-05-09 | - |
| dc.identifier.citation | 38th IEEE International Conference on Data Engineering Workshops, ICDEW 2022, Virtual, Kuala Lumpur, Malaysia, 9 - 11 May 2022 | en_US |
| dc.identifier.isbn | 9781665481045 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14279/29886 | - |
| dc.description.abstract | Deploying machine learning (ML) applications over distributed stream processing engines (DSPEs) such as Apache Spark Streaming is a complex procedure that requires extensive tuning along two dimensions. First, DSPEs have a vast array of system configuration parameters (such as degree of parallelism, memory buffer sizes, etc.) that need to be optimized to achieve the desired levels of latency and/or throughput. Second, each ML model has its own set of hyper-parameters that need to be tuned as they significantly impact the overall prediction accuracy of the trained model. These two forms of tuning have been studied extensively in the literature but only in isolation from each other. This position paper identifies the necessity for a combined system and ML model tuning approach based on a thorough experimental study. In particular, experimental results have revealed unexpected and complex interactions between the choices of system configuration and hyper-parameters, and their impact on both application and model performance. These findings open up new research directions in the field of self-managing stream processing systems. | en_US |
| dc.language.iso | en | en_US |
| dc.relation.ispartof | Proceedings - 2022 IEEE 38th International Conference on Data Engineering Workshops, ICDEW 2022 | en_US |
| dc.rights | © Elsevier B.V. | en_US |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
| dc.subject | hyper-parameter tuning | en_US |
| dc.subject | machine learning | en_US |
| dc.subject | stream processing | en_US |
| dc.subject | system parameter tuning | en_US |
| dc.title | Exploring System and Machine Learning Performance Interactions when Tuning Distributed Data Stream Applications | en_US |
| dc.type | Conference Papers | en_US |
| dc.collaboration | Cyprus University of Technology | en_US |
| dc.subject.category | Electrical Engineering - Electronic Engineering - Information Engineering | en_US |
| dc.country | Cyprus | en_US |
| dc.subject.field | Engineering and Technology | en_US |
| dc.identifier.doi | 10.1109/ICDEW55742.2022.00008 | en_US |
| dc.identifier.scopus | 2-s2.0-85134883801 | - |
| dc.identifier.url | https://api.elsevier.com/content/abstract/scopus_id/85134883801 | - |
| cut.common.academicyear | 2021-2022 | en_US |
| dc.identifier.spage | 24 | en_US |
| dc.identifier.epage | 29 | en_US |
| item.openairecristype | http://purl.org/coar/resource_type/c_c94f | - |
| item.grantfulltext | none | - |
| item.cerifentitytype | Publications | - |
| item.fulltext | No Fulltext | - |
| item.languageiso639-1 | en | - |
| item.openairetype | conferenceObject | - |
| crisitem.author.dept | Department of Electrical Engineering, Computer Engineering and Informatics | - |
| crisitem.author.faculty | Faculty of Engineering and Technology | - |
| crisitem.author.orcid | 0000-0002-8717-1691 | - |
| crisitem.author.parentorg | Faculty of Engineering and Technology | - |
Appears in Collections: Δημοσιεύσεις σε συνέδρια / Conference papers or poster or presentation
This item is licensed under a Creative Commons License.