Trident: Task Scheduling over Tiered Storage Systems in Big Data Platforms

Herodotou, Herodotos; Kakoulli, Elena

doi:10.14778/3461535.3461545

Trident: Task Scheduling over Tiered Storage Systems in Big Data Platforms

Journal

Proceedings of the VLDB Endowment

Date Issued

May 2021

Author(s)

Herodotou, Herodotos

Kakoulli, Elena

DOI

10.14778/3461535.3461545

Abstract

The recent advancements in storage technologies have popularized the use of tiered storage systems in data-intensive compute
clusters. The Hadoop Distributed File System (HDFS), for example, now supports storing data in memory, SSDs, and HDDs, while
OctopusFS and hatS offer fine-grained storage tiering solutions.
However, the task schedulers of big data platforms (such as Hadoop
and Spark) will assign tasks to available resources only based on
data locality information, and completely ignore the fact that local
data is now stored on a variety of storage media with different
performance characteristics. This paper presents Trident, a principled task scheduling approach that is designed to make optimal
task assignment decisions based on both locality and storage tier
information. Trident formulates task scheduling as a minimum
cost maximum matching problem in a bipartite graph and uses a
standard solver for finding the optimal solution. In addition, Trident utilizes two novel pruning algorithms for bounding the size
of the graph, while still guaranteeing optimality. Trident is implemented in both Spark and Hadoop, and evaluated extensively using
a realistic workload derived from Facebook traces as well as an
industry-validated benchmark, demonstrating significant benefits
in terms of application performance and cluster efficiency.

Subjects

Tiered storage system...

Tiered storage

Pruning algorithms

Storage tiers

Spark

Hadoop

File(s)

Name

p1570-herodotou.pdf

Size

5.41 MB

Format

Adobe PDF

Checksum (MD5)

953d40e861b6da8ec7640dd0c42da7f3

Trident: Task Scheduling over Tiered Storage Systems in Big Data Platforms

Explore by

Useful Links

Copyright Policies

Deposit your work to Ktisis