Network-Aware Resource Management for Scalable Data Analytics Frameworks

Renner, T., Thamsen, L. and Kao, O. (2015) Network-Aware Resource Management for Scalable Data Analytics Frameworks. In: 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 29 Oct - 01 Nov 2015, pp. 2793-2800. ISBN 9781479999262 (doi: 10.1109/BigData.2015.7364083)

Abstract

Sharing cluster resources between multiple frameworks, applications and datasets is important for organizations doing large-scale data analytics. It improves cluster utilization, avoids standalone clusters running only a single framework, and allows data scientists to choose the best framework for each analysis task. Current systems for cluster resource management like YARN or Mesos achieve resource sharing using containers. Analytics frameworks execute their tasks in these containers. However, container placement is currently based predominantly on available computing capabilities in terms of cores and memory, and neglects the network topology and data locations. In this paper, we propose a container placement approach that (a) takes the network topology into account to prevent network congestion in the core network and (b) places containers close to input data to improve data locality and reduce remote disk reads in distributed file systems. The main advantage of introducing topology- and data-awareness at the level of container placement is that multiple application frameworks benefit from the improvements. We present a prototype integrated with Hadoop YARN and an evaluation with workloads consisting of different applications and datasets using Apache Flink. Our evaluation on a 64-core cluster, in which nodes are connected through a fat-tree topology, shows promising results with speedups of up to 67% for network-intensive workloads.
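
The abstract does not give the placement algorithm itself, so the following is only a minimal sketch of the general idea of topology- and data-aware placement, not the authors' method or any YARN API. All names (`Node`, `place_container`, `block_hosts`) and the weights (2 for blocks on the same node, 1 for blocks in the same rack) are assumptions made for illustration. The sketch filters nodes by free cores and memory, then prefers the node that holds, or shares a rack with, the most input blocks.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical view of a cluster node (not a YARN type)."""
    name: str
    rack: str
    free_cores: int
    free_memory_mb: int


def place_container(nodes: List[Node],
                    block_hosts: Dict[str, int],
                    cores: int,
                    memory_mb: int) -> Optional[Node]:
    """Pick a node for one container.

    block_hosts maps a host name to the number of input blocks stored
    on that host. Weights below are illustrative assumptions.
    """
    rack_of = {n.name: n.rack for n in nodes}

    def score(node: Node) -> int:
        # Blocks readable from the local disk of this node.
        node_local = block_hosts.get(node.name, 0)
        # Blocks on other hosts in the same rack: reading them stays
        # below the rack switch and avoids the core network.
        rack_local = sum(
            count for host, count in block_hosts.items()
            if host != node.name and rack_of.get(host) == node.rack
        )
        return 2 * node_local + rack_local

    # Resource check first, as in a purely capacity-based scheduler.
    candidates = [
        n for n in nodes
        if n.free_cores >= cores and n.free_memory_mb >= memory_mb
    ]
    if not candidates:
        return None
    return max(candidates, key=score)
```

In this sketch, a node with no free cores is skipped even if it stores most of the input, and a node in the same rack is preferred over a node in a different rack that holds fewer blocks.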

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Thamsen, Dr Lauritz
Authors: Renner, T., Thamsen, L., and Kao, O.
College/School: College of Science and Engineering > School of Computing Science
Publisher: IEEE
ISBN: 9781479999262
