Scheduling Recurring Distributed Dataflow Jobs Based on Resource Utilization and Interference

Thamsen, L., Rabier, B., Schmidt, F., Renner, T. and Kao, O. (2017) Scheduling Recurring Distributed Dataflow Jobs Based on Resource Utilization and Interference. In: 2017 IEEE International Congress on Big Data (BigData Congress), Honolulu, HI, USA, 25-30 June 2017, pp. 145-152. ISBN 9781538619964 (doi: 10.1109/BigDataCongress.2017.28)

Text: 268138.pdf - Accepted Version

Resource management systems like YARN or Mesos enable users to share cluster infrastructures by running analytics jobs in temporarily reserved containers. These containers are typically not isolated, which allows high overall resource utilization despite the often fluctuating resource usage of individual analytics jobs. However, some combinations of jobs utilize the resources better and interfere less with each other than others when running on the same nodes. This paper presents an approach for improving resource utilization and job throughput when scheduling recurring data analysis jobs in shared cluster environments. Using a reinforcement learning algorithm, the scheduler continuously learns which jobs are best executed simultaneously on the cluster. Our evaluation of an implementation built on Hadoop YARN shows that this approach can increase resource utilization and decrease job runtimes. While interference between jobs can be avoided, co-locations of jobs with complementary resource usage are not yet always fully recognized. However, with a better measure of co-location goodness, our solution can be used to automatically adapt the scheduling to workloads with recurring batch jobs.
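
The idea of learning which recurring jobs run well together can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes a simple epsilon-greedy learner over job pairs, and the class `CoLocationLearner`, the function `co_location_reward`, the reward formula, the job names, and all numeric values are hypothetical choices made only for illustration.

```python
import random
from collections import defaultdict


class CoLocationLearner:
    """Hypothetical epsilon-greedy learner over pairs of recurring jobs."""

    def __init__(self, epsilon=0.1, step_size=0.2, seed=None):
        self.epsilon = epsilon      # probability of exploring a random candidate
        self.step_size = step_size  # weight given to the newest observation
        self.preference = defaultdict(float)  # learned score per unordered job pair
        self.rng = random.Random(seed)

    @staticmethod
    def _pair(job_a, job_b):
        # Co-location is symmetric, so store each pair in sorted order.
        return tuple(sorted((job_a, job_b)))

    def update(self, job_a, job_b, reward):
        """Move the pair's score towards the observed reward."""
        key = self._pair(job_a, job_b)
        self.preference[key] += self.step_size * (reward - self.preference[key])

    def choose(self, running_job, queued_jobs):
        """Pick a queued job to co-locate with the running job."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(queued_jobs)  # explore
        return max(
            queued_jobs,
            key=lambda job: self.preference[self._pair(running_job, job)],
        )  # exploit the best-known pairing


def co_location_reward(cpu_util, io_util, slowdown):
    """Assumed goodness measure: mean utilization minus a slowdown penalty."""
    return (cpu_util + io_util) / 2.0 - max(0.0, slowdown - 1.0)


# Made-up measurements for three hypothetical recurring jobs.
learner = CoLocationLearner(epsilon=0.0, seed=0)
learner.update("kmeans", "wordcount", co_location_reward(0.9, 0.8, 1.05))
learner.update("kmeans", "sort", co_location_reward(0.6, 0.4, 1.6))
choice = learner.choose("kmeans", ["sort", "wordcount"])
```

With exploration switched off (`epsilon=0.0`), the learner picks the pairing with the higher learned score. The reward function stands in for the "measure of co-location goodness" that the abstract names as the part still needing improvement, so its exact form here should be read as a placeholder.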

Item Type: Conference Proceedings
Additional Information: This work has been supported through grants by the German Science Foundation (DFG) as FOR 1306 Stratosphere and by the German Ministry for Education and Research (BMBF) as Berlin Big Data Center BBDC (funding mark 01IS14013A).
Glasgow Author(s) Enlighten ID: Thamsen, Dr Lauritz
Authors: Thamsen, L., Rabier, B., Schmidt, F., Renner, T., and Kao, O.
College/School: College of Science and Engineering > School of Computing Science
Published Online: 11 September 2017
Copyright Holders: Copyright © 2017 IEEE
Publisher Policy: Reproduced in accordance with the publisher copyright policy