Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/19036
Title: AutoCache: Employing machine learning to automate caching in distributed file systems
Authors: Herodotou, Herodotos 
Major Field of Science: Natural Sciences
Field Category: Computer and Information Sciences
Keywords: Automated caching;Distributed file systems
Issue Date: 1-Jul-2019
Source: IEEE 35th International Conference on Data Engineering Workshops, 2019, 8-12 April, Macao, China
Conference: IEEE International Conference on Data Engineering Workshops 
Abstract: The use of computational platforms such as Hadoop and Spark is growing rapidly as a successful paradigm for processing large-scale data residing in distributed file systems like HDFS. Increasing memory sizes have recently led to the introduction of caching and in-memory file systems. However, these systems lack any automated caching mechanisms for storing data in memory. This paper presents AutoCache, a caching framework that automates the decisions for when and which files to store in, or remove from, the cache for increasing system performance. The decisions are based on machine learning models that track and predict file access patterns from evolving data processing workloads. Our evaluation using real-world workload traces from a Facebook production cluster compares our approach with several other policies and showcases significant benefits in terms of both workload performance and cluster efficiency.
URI: https://hdl.handle.net/20.500.14279/19036
ISBN: 978-1-7281-0890-2
Rights: © IEEE; Attribution-NonCommercial-NoDerivatives 4.0 International
Type: Conference Papers
Affiliation: Cyprus University of Technology
Publication Type: Peer Reviewed
Appears in Collections: Conference papers or poster or presentation

This item is licensed under a Creative Commons License.