Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.14279/19036
Title: | AutoCache: Employing machine learning to automate caching in distributed file systems |
Authors: | Herodotou, Herodotos |
Major Field of Science: | Natural Sciences |
Field Category: | Computer and Information Sciences |
Keywords: | Automated caching; Distributed file systems |
Issue Date: | 1-Jul-2019 |
Source: | IEEE 35th International Conference on Data Engineering Workshops, 2019, 8-12 April, Macao, China |
Conference: | IEEE International Conference on Data Engineering Workshops |
Abstract: | The use of computational platforms such as Hadoop and Spark is growing rapidly as a successful paradigm for processing large-scale data residing in distributed file systems like HDFS. Increasing memory sizes have recently led to the introduction of caching and in-memory file systems. However, these systems lack any automated caching mechanisms for storing data in memory. This paper presents AutoCache, a caching framework that automates the decisions for when and which files to store in, or remove from, the cache for increasing system performance. The decisions are based on machine learning models that track and predict file access patterns from evolving data processing workloads. Our evaluation using real-world workload traces from a Facebook production cluster compares our approach with several other policies and showcases significant benefits in terms of both workload performance and cluster efficiency. |
URI: | https://hdl.handle.net/20.500.14279/19036 |
ISSN: | 978-1-7281-0890-2 |
Rights: | © IEEE Attribution-NonCommercial-NoDerivatives 4.0 International |
Type: | Conference Papers |
Affiliation: | Cyprus University of Technology |
Publication Type: | Peer Reviewed |
Appears in Collections: | Conference papers or poster or presentation |
This item is licensed under a Creative Commons License