Pfister, B. J. J., Lickefett, W. S., Nitschke, J., Paul, S., Geldenhuys, M. K., Scheinert, D., Gontarska, K. and Thamsen, L. (2022) Rafiki: Task-level Capacity Planning in Distributed Stream Processing Systems. In: 27th International European Conference on Parallel and Distributed Computing (Euro-Par 2021), 30 August – 3 September 2021, pp. 352-363. ISBN 9783031061554 (doi: 10.1007/978-3-031-06156-1_28)
Text: 268174.pdf - Accepted Version (483kB)
Abstract
Distributed Stream Processing is a valuable paradigm for reliably processing vast amounts of data at high throughput rates with low end-to-end latencies. Most systems of this type offer a fine-grained level of control to parallelize the computation of individual tasks within a streaming job. Adjusting the parallelism of tasks has a direct impact on the overall level of throughput a job can provide as well as the amount of resources required to provide an adequate level of service. However, finding optimal parallelism configurations that fall within the expected Quality of Service requirements is no small feat. In this paper, we present Rafiki, an approach to automatically determine optimal parallelism configurations for Distributed Stream Processing jobs. We conduct a number of proactive profiling runs to gather information about the processing capacities of individual tasks, thereby making the selection of specific utilization targets possible. Understanding the capacity information enables users to adequately provision resources so that streaming jobs can deliver the desired level of service at a reduced operational cost with predictable recovery times. We implemented a prototype of Rafiki together with Apache Flink and demonstrate its usefulness experimentally.
| Item Type: | Conference Proceedings |
|---|---|
| Status: | Published |
| Refereed: | Yes |
| Glasgow Author(s) Enlighten ID: | Thamsen, Dr Lauritz |
| Authors: | Pfister, B. J. J., Lickefett, W. S., Nitschke, J., Paul, S., Geldenhuys, M. K., Scheinert, D., Gontarska, K., and Thamsen, L. |
| Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software |
| College/School: | College of Science and Engineering > School of Computing Science |
| ISBN: | 9783031061554 |
| Copyright Holders: | Copyright © 2022 Springer Nature Switzerland AG |
| Publisher Policy: | Reproduced in accordance with the copyright policy of the publisher |