Rafiki: Task-level Capacity Planning in Distributed Stream Processing Systems

Pfister, B. J. J., Lickefett, W. S., Nitschke, J., Paul, S., Geldenhuys, M. K., Scheinert, D., Gontarska, K. and Thamsen, L. (2022) Rafiki: Task-level Capacity Planning in Distributed Stream Processing Systems. In: 27th International European Conference on Parallel and Distributed Computing (Euro-Par 2021), 30 August – 3 September 2021, pp. 352-363. ISBN 9783031061554 (doi: 10.1007/978-3-031-06156-1_28)

Text
268174.pdf - Accepted Version

483kB

Abstract

Distributed Stream Processing is a valuable paradigm for reliably processing vast amounts of data at high throughput rates with low end-to-end latencies. Most systems of this type offer a fine-grained level of control to parallelize the computation of individual tasks within a streaming job. Adjusting the parallelism of tasks has a direct impact on the overall level of throughput a job can provide as well as the amount of resources required to provide an adequate level of service. However, finding optimal parallelism configurations that fall within the expected Quality of Service requirements is no small feat to accomplish. In this paper we present Rafiki, an approach to automatically determine optimal parallelism configurations for Distributed Stream Processing jobs. Here we conduct a number of proactive profiling runs to gather information about the processing capacities of individual tasks, thereby making the selection of specific utilization targets possible. Understanding the capacity information enables users to adequately provision resources so that streaming jobs can deliver the desired level of service at a reduced operational cost with predictable recovery times. We implemented Rafiki prototypically together with Apache Flink where we demonstrate its usefulness experimentally.
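The relationship the abstract describes between per-task processing capacity, a chosen utilization target, and the resulting parallelism can be illustrated with a short sketch. It is not taken from the paper and is not Rafiki's implementation. It assumes that each task's capacity is known as records per second per parallel instance (for example from profiling runs), that capacity scales linearly with parallelism, and that every task sees the same input rate. The function names, task names, and numbers are hypothetical.

```python
import math


def required_parallelism(target_rate, capacity_per_instance, utilization_target):
    """Smallest parallelism at which each instance stays at or below the
    utilization target. Assumes capacity scales linearly with parallelism
    (an illustrative simplification)."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    if not 0 < utilization_target <= 1:
        raise ValueError("utilization_target must be in (0, 1]")
    usable_capacity = capacity_per_instance * utilization_target
    return max(1, math.ceil(target_rate / usable_capacity))


def plan_job(task_capacities, target_rate, utilization_target):
    """Map each task name to a parallelism. Assumes every task receives the
    same input rate, i.e. no task filters or multiplies records."""
    return {
        task: required_parallelism(target_rate, capacity, utilization_target)
        for task, capacity in task_capacities.items()
    }


# Hypothetical profiled capacities in records per second per instance.
capacities = {"source": 50_000, "map": 20_000, "sink": 40_000}
plan = plan_job(capacities, target_rate=60_000, utilization_target=0.8)
print(plan)
```

A lower utilization target leaves more headroom per instance, which is what allows a job to catch up after a failure, at the cost of more provisioned resources.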

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Thamsen, Dr Lauritz
Authors: Pfister, B. J. J., Lickefett, W. S., Nitschke, J., Paul, S., Geldenhuys, M. K., Scheinert, D., Gontarska, K., and Thamsen, L.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
College/School: College of Science and Engineering > School of Computing Science
ISBN: 9783031061554
Copyright Holders: Copyright © 2022 Springer Nature Switzerland AG
Publisher Policy: Reproduced in accordance with the copyright policy of the publisher