ODIN: Overcoming Dynamic Interference in iNference Pipelines

Noor Soomro, P., Papadopoulou, N. and Pericàs, M. (2023) ODIN: Overcoming Dynamic Interference in iNference Pipelines. In: 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, 28 Aug - 01 Sep 2023, pp. 169-186. ISBN 9783031396977 (doi: 10.1007/978-3-031-39698-4_12)

320537.pdf - Accepted Version
Restricted to Repository staff only until 24 August 2024.


Abstract

As an increasing number of businesses become powered by machine learning, inference becomes a core operation and is increasingly offered as a service. In this context, the inference task must meet certain service-level objectives (SLOs), such as high throughput and low latency. However, these targets can be compromised by interference caused by long- or short-lived co-located tasks. Prior work focuses on the generic problem of co-scheduling to mitigate the effect of interference on the performance-critical task. In this work, we focus on inference pipelines and propose ODIN, a technique that mitigates the effect of interference on the performance of the inference task through online scheduling of the pipeline stages. Our technique detects interference online and automatically re-balances the pipeline stages to mitigate the performance degradation of the inference task. We demonstrate that ODIN successfully mitigates the effect of interference, sustaining the latency and throughput of CNN inference, and outperforms least-loaded scheduling (LLS), a common technique for interference mitigation. Additionally, it is effective in maintaining service-level objectives for inference, and it scales to large network models executing on multiple processing elements.

Item Type: Conference Proceedings
Additional Information: This work has received funding from the project PRIDE from the Swedish Foundation for Strategic Research with reference number CHI19-0048. The computations were enabled by resources provided by the Swedish National Infrastructure for Computing (SNIC) at NSC, partially funded by the Swedish Research Council through grant agreement no. 2018-05973.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Papadopoulou, Dr Nikela
Authors: Noor Soomro, P., Papadopoulou, N., and Pericàs, M.
College/School: College of Science and Engineering > School of Computing Science
ISSN: 0302-9743
ISBN: 9783031396977
Copyright Holders: Copyright © 2023 The Authors under exclusive license to Springer Nature Switzerland AG 2023
First Published: First published in Proceedings of Euro-Par 2023: Parallel Processing. Euro-Par 2023. Lecture Notes in Computer Science, vol. 14100
Publisher Policy: Reproduced in accordance with the copyright policy of the publisher
