Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/30768
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Lucas Filho, Edson Ramiro | - |
| dc.contributor.author | Lun, Yang | - |
| dc.contributor.author | Kebo, Fu | - |
| dc.contributor.author | Herodotou, Herodotos | - |
| dc.date.accessioned | 2023-11-09T10:01:47Z | - |
| dc.date.available | 2023-11-09T10:01:47Z | - |
| dc.date.issued | 2023-08-10 | - |
| dc.identifier.citation | 1st Workshop on AI for Systems, AI4Sys 2023, Orlando, Florida, 20 June 2023 | en_US |
| dc.identifier.isbn | 9798400701610 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14279/30768 | - |
| dc.description.abstract | Modern data storage systems optimize data access by distributing data across multiple storage tiers and caches, based on numerous tiering and caching policies. The policies' decisions, and in particular the ones related to data prefetching, can severely impact the performance of the entire storage system. In recent years, various machine learning algorithms have been employed to model access patterns in complex data storage workloads. Even though data storage systems handle a constantly changing stream of file requests, current approaches continue to train their models offline in a batch-based approach. In this paper, we investigate the use of streaming machine learning to support data prefetching decisions in data storage systems as it introduces various advantages such as high training efficiency, high prediction accuracy, and high adaptability to changing workload patterns. After extracting a representative set of features in an online fashion, streaming machine learning models can be trained and tested while the system is running. To validate our methodology, we present one streaming classification model to predict the next file offset to be read in a file. We assess the model's performance using production traces provided by Huawei Technologies and demonstrate that streaming machine learning is a feasible approach with low memory consumption and minimal training delay, facilitating accurate predictions in real-time. | en_US |
| dc.language.iso | en | en_US |
| dc.rights | © ACM | en_US |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
| dc.subject | caching policies | en_US |
| dc.subject | data prefetching | en_US |
| dc.subject | multi-tiered storage systems | en_US |
| dc.subject | streaming machine learning | en_US |
| dc.subject | tiering policies | en_US |
| dc.title | Streaming Machine Learning for Supporting Data Prefetching in Modern Data Storage Systems | en_US |
| dc.type | Conference Papers | en_US |
| dc.collaboration | Cyprus University of Technology | en_US |
| dc.collaboration | Huawei Technologies Co. Ltd | en_US |
| dc.subject.category | ENGINEERING AND TECHNOLOGY | en_US |
| dc.subject.category | Electrical Engineering - Electronic Engineering - Information Engineering | en_US |
| dc.country | Cyprus | en_US |
| dc.country | China | en_US |
| dc.subject.field | Engineering and Technology | en_US |
| dc.relation.conference | AI4Sys 2023 - Proceedings of the 1st Workshop on AI for Systems | en_US |
| dc.identifier.doi | 10.1145/3588982.3603608 | en_US |
| dc.identifier.scopus | 2-s2.0-85171568004 | - |
| dc.identifier.url | https://api.elsevier.com/content/abstract/scopus_id/85171568004 | - |
| cut.common.academicyear | 2022-2023 | en_US |
| item.grantfulltext | none | - |
| item.openairecristype | http://purl.org/coar/resource_type/c_c94f | - |
| item.fulltext | No Fulltext | - |
| item.languageiso639-1 | en | - |
| item.cerifentitytype | Publications | - |
| item.openairetype | conferenceObject | - |
| crisitem.author.dept | Department of Electrical Engineering, Computer Engineering and Informatics | - |
| crisitem.author.faculty | Faculty of Engineering and Technology | - |
| crisitem.author.orcid | 0000-0002-8717-1691 | - |
| crisitem.author.parentorg | Faculty of Engineering and Technology | - |
Appears in Collections: Δημοσιεύσεις σε συνέδρια / Conference papers or poster or presentation
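The abstract describes training and testing a streaming classification model while the system is running, in order to predict the next file offset to be read. The record does not include the paper's features or model, so the following is only a minimal sketch of that test-then-train idea under stated assumptions: the class `NextOffsetPredictor`, the `history` parameter, and the use of recent offset deltas as the only feature are all hypothetical choices made for illustration, and a simple frequency table stands in for whatever classifier the authors used.

```python
from collections import Counter, defaultdict, deque


class NextOffsetPredictor:
    """Toy online classifier (an assumption, not the paper's model).

    The class label is the next offset delta; the feature is the tuple of
    the last `history` deltas seen for one file.
    """

    def __init__(self, history=2):
        self.counts = defaultdict(Counter)    # context -> counts of next deltas
        self.recent = deque(maxlen=history)   # most recent offset deltas
        self.last_offset = None
        self.correct = 0                      # predictions that matched
        self.tested = 0                       # predictions that were made

    def _context(self):
        # A context exists only once `history` deltas have been observed.
        if len(self.recent) == self.recent.maxlen:
            return tuple(self.recent)
        return None

    def predict_next_offset(self):
        """Return the most likely next offset, or None without evidence."""
        ctx = self._context()
        if ctx is None or ctx not in self.counts:
            return None
        delta = self.counts[ctx].most_common(1)[0][0]
        return self.last_offset + delta

    def observe(self, offset):
        """Test-then-train: score the prediction first, then update."""
        if self.last_offset is not None:
            predicted = self.predict_next_offset()
            if predicted is not None:
                self.tested += 1
                self.correct += int(predicted == offset)
            delta = offset - self.last_offset
            ctx = self._context()
            if ctx is not None:
                self.counts[ctx][delta] += 1
            self.recent.append(delta)
        self.last_offset = offset


if __name__ == "__main__":
    predictor = NextOffsetPredictor(history=2)
    # Hypothetical sequential read pattern with a 4 KiB stride.
    for off in [0, 4096, 8192, 12288, 16384, 20480]:
        predictor.observe(off)
    print(predictor.correct, predictor.tested, predictor.predict_next_offset())
```

Each request is scored before the model is updated with it, so accuracy is measured only on data the model has not yet seen, and no separate offline training pass is needed.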
This item is licensed under a Creative Commons License.