Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.14279/23024
Title: | Trident: Task Scheduling over Tiered Storage Systems in Big Data Platforms |
Authors: | Herodotou, Herodotos; Kakoulli, Elena |
Major Field of Science: | Natural Sciences |
Field Category: | Computer and Information Sciences |
Keywords: | Tiered storage system;Tiered storage;Pruning algorithms;Storage tiers;Spark;Hadoop |
Issue Date: | May-2021 |
Source: | Proceedings of the VLDB Endowment, 2021, vol. 14, no. 9, pp. 1570-1582 |
Volume: | 14 |
Issue: | 9 |
Start page: | 1570 |
End page: | 1582 |
Link: | http://vldb.org/pvldb/vol14/p1570-herodotou.pdf |
Journal: | Proceedings of the VLDB Endowment |
Abstract: | The recent advancements in storage technologies have popularized the use of tiered storage systems in data-intensive compute clusters. The Hadoop Distributed File System (HDFS), for example, now supports storing data in memory, SSDs, and HDDs, while OctopusFS and hatS offer fine-grained storage tiering solutions. However, the task schedulers of big data platforms (such as Hadoop and Spark) will assign tasks to available resources only based on data locality information, and completely ignore the fact that local data is now stored on a variety of storage media with different performance characteristics. This paper presents Trident, a principled task scheduling approach that is designed to make optimal task assignment decisions based on both locality and storage tier information. Trident formulates task scheduling as a minimum cost maximum matching problem in a bipartite graph and uses a standard solver for finding the optimal solution. In addition, Trident utilizes two novel pruning algorithms for bounding the size of the graph, while still guaranteeing optimality. Trident is implemented in both Spark and Hadoop, and evaluated extensively using a realistic workload derived from Facebook traces as well as an industry-validated benchmark, demonstrating significant benefits in terms of application performance and cluster efficiency. |
URI: | https://hdl.handle.net/20.500.14279/23024 |
ISSN: | 21508097 |
DOI: | 10.14778/3461535.3461545 |
Rights: | This work is licensed under the Creative Commons BY-NC-ND 4.0 International License. Attribution-NonCommercial-NoDerivatives 4.0 International |
Type: | Article |
Affiliation: | Cyprus University of Technology |
Publication Type: | Peer Reviewed |
Appears in Collections: | Άρθρα/Articles |
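The abstract describes task scheduling as a minimum cost maximum matching problem in a bipartite graph of tasks and available resources. The sketch below illustrates that formulation on a toy instance only. The per-tier costs, the remote-read cost, the node names, and the helper names `assignment_cost` and `min_cost_matching` are all assumptions made for illustration and are not taken from the paper. A brute-force search stands in for the "standard solver" the abstract mentions, and the pruning algorithms are not reproduced.

```python
from itertools import combinations, permutations

# Hypothetical per-tier read costs (illustrative assumptions, not values from the paper).
TIER_COST = {"memory": 1, "ssd": 2, "hdd": 3}
REMOTE_COST = 4  # assumed cost when the node holds no replica of the task's input


def assignment_cost(replicas, node):
    """Cost of running a task on `node`, given {node: tier} for its input block."""
    return TIER_COST[replicas[node]] if node in replicas else REMOTE_COST


def min_cost_matching(tasks, slots):
    """Brute-force minimum cost maximum matching of tasks to free slots.

    tasks: {task_name: {node: tier}} describing where each input replica lives.
    slots: list of node names, one entry per free slot.
    Returns (total_cost, {task_name: slot_index}).
    """
    names = list(tasks)
    # Every task may run on every slot, so a maximum matching has this size.
    k = min(len(names), len(slots))
    best_cost, best = None, None
    for chosen in combinations(names, k):
        for perm in permutations(range(len(slots)), k):
            cost = sum(
                assignment_cost(tasks[t], slots[s]) for t, s in zip(chosen, perm)
            )
            if best_cost is None or cost < best_cost:
                best_cost, best = cost, dict(zip(chosen, perm))
    return best_cost, best


# Toy instance: three tasks, three single-slot nodes.
tasks = {
    "t1": {"n1": "hdd", "n2": "memory"},
    "t2": {"n1": "ssd"},
    "t3": {"n2": "hdd", "n3": "ssd"},
}
slots = ["n1", "n2", "n3"]

cost, assignment = min_cost_matching(tasks, slots)
for task, slot in sorted(assignment.items()):
    print(task, "->", slots[slot])
print("total cost:", cost)
```

In this toy instance, t1 has a local replica on both n1 and n2, so locality information alone does not separate the two placements. Adding the tier costs sends t1 to n2, where its replica is in memory, and leaves n1 for t2. Brute force is exponential in the number of tasks, so it is only suitable for an illustration of this size.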
Files in This Item:
File | Description | Size | Format
---|---|---|---
p1570-herodotou.pdf | Fulltext | 5.54 MB | Adobe PDF