Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/29866
DC Field | Value | Language
dc.contributor.author | Odysseos, Lambros | -
dc.contributor.author | Herodotou, Herodotos | -
dc.date.accessioned | 2023-07-14T09:04:36Z | -
dc.date.available | 2023-07-14T09:04:36Z | -
dc.date.issued | 2023-05-17 | -
dc.identifier.citation | Distributed and Parallel Databases, 2023 | en_US
dc.identifier.issn | 09268782 | -
dc.identifier.uri | https://hdl.handle.net/20.500.14279/29866 | -
dc.description.abstract | The growing need to identify patterns in data and automate decisions based on them in near-real time has stimulated the development of new machine learning (ML) applications processing continuous data streams. However, the deployment of ML applications over distributed stream processing engines (DSPEs) such as Apache Spark Streaming is a complex procedure that requires extensive tuning along two dimensions. First, DSPEs have a plethora of system configuration parameters, like degree of parallelism, memory buffer sizes, etc., that have a direct impact on application throughput and/or latency, and need to be optimized. Second, ML models have their own set of hyperparameters that require tuning as they can affect the overall prediction accuracy of the trained model significantly. These two forms of tuning have been studied extensively in the literature but only in isolation from each other. This manuscript presents a comprehensive experimental study that combines system configuration and hyperparameter tuning of ML applications over DSPEs. The experimental results reveal unexpected and complex interactions between the choices of system configurations and hyperparameters, and their impact on both application and model performance. These insights motivate the need for new combined system and ML model tuning approaches, and open up new research directions in the field of self-managing distributed stream processing systems. | en_US
dc.format | pdf | en_US
dc.language.iso | en | en_US
dc.rights | © The Author(s) | en_US
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | *
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | *
dc.subject | Hyper-parameter tuning | en_US
dc.subject | Machine learning | en_US
dc.subject | Stream processing | en_US
dc.subject | System parameter tuning | en_US
dc.title | On combining system and machine learning performance tuning for distributed data stream applications | en_US
dc.type | Article | en_US
dc.collaboration | Cyprus University of Technology | en_US
dc.subject.category | Electrical Engineering - Electronic Engineering - Information Engineering | en_US
dc.journals | Open Access | en_US
dc.country | Cyprus | en_US
dc.subject.field | Engineering and Technology | en_US
dc.publication | Peer Reviewed | en_US
dc.identifier.doi | 10.1007/s10619-023-07434-0 | en_US
dc.identifier.scopus | 2-s2.0-85159710158 | -
dc.identifier.url | https://api.elsevier.com/content/abstract/scopus_id/85159710158 | -
cut.common.academicyear | 2022-2023 | en_US
item.openairetype | article | -
item.cerifentitytype | Publications | -
item.languageiso639-1 | en | -
item.fulltext | With Fulltext | -
item.openairecristype | http://purl.org/coar/resource_type/c_6501 | -
item.grantfulltext | open | -
crisitem.author.dept | Department of Electrical Engineering, Computer Engineering and Informatics | -
crisitem.author.faculty | Faculty of Engineering and Technology | -
crisitem.author.orcid | 0000-0002-8717-1691 | -
crisitem.author.parentorg | Faculty of Engineering and Technology | -
Appears in Collections: Άρθρα/Articles
Files in This Item:
File | Description | Size | Format
herodotou 1.pdf | Full text | 3.29 MB | Adobe PDF

Page view(s): 153 (last week: 1; last month: 10), checked on Aug 29, 2024

Download(s) 20: 140, checked on Aug 29, 2024

This item is licensed under a Creative Commons License.