Learning Dependencies in Distributed Cloud Applications to Identify and Localize Anomalies

Scheinert, D., Acker, A., Thamsen, L., Geldenhuys, M. K. and Kao, O. (2021) Learning Dependencies in Distributed Cloud Applications to Identify and Localize Anomalies. In: 2nd Workshop on Cloud Intelligence (CloudIntelligence) at the 43rd International Conference on Software Engineering (ICSE), 29 May 2021, pp. 7-12. ISBN 9781665445634 (doi: 10.1109/CloudIntelligence52565.2021.00011)

Text: 268158.pdf - Accepted Version



Operation and maintenance of large distributed cloud applications can quickly become unmanageably complex, putting human operators under immense stress when problems occur. Utilizing machine learning for identification and localization of anomalies in such systems supports human experts and enables fast mitigation. However, due to the various interdependencies of system components, anomalies not only affect their origin but also propagate through the distributed system. Taking this into account, we present Arvalus and its variant D-Arvalus, a neural graph transformation method that models system components as nodes and their dependencies and placement as edges to improve the identification and localization of anomalies. Given a series of metric KPIs, our method predicts the most likely system state (either normal or an anomaly class) and performs localization when an anomaly is detected. In our experiments, we simulate a distributed cloud application deployment and synthetically inject anomalies. The evaluation shows that Arvalus generally achieves good prediction performance and reveals the advantage of D-Arvalus, which incorporates information about system component dependencies.
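
The graph view described in the abstract can be illustrated with a small sketch. This is not the authors' implementation and contains no neural model: it only shows, under simplifying assumptions, how components might be held as nodes with dependency or placement edges, and how localization could be read off per-component state predictions. The names ComponentGraph, localize and NORMAL, as well as the component and anomaly labels, are hypothetical.

```python
from dataclasses import dataclass, field

NORMAL = "normal"  # hypothetical label for the non-anomalous state


@dataclass
class ComponentGraph:
    """System components as nodes; directed edges model dependency or placement."""
    edges: dict[str, set[str]] = field(default_factory=dict)

    def add_edge(self, src: str, dst: str) -> None:
        # Register both endpoints so isolated targets still appear as nodes.
        self.edges.setdefault(src, set()).add(dst)
        self.edges.setdefault(dst, set())

    def neighbors(self, node: str) -> set[str]:
        return self.edges.get(node, set())


def localize(predictions: dict[str, str]) -> dict[str, str]:
    """Return only the components whose predicted state is an anomaly class."""
    return {node: label for node, label in predictions.items() if label != NORMAL}


# Hypothetical deployment: a frontend depends on a backend, which depends on a database.
graph = ComponentGraph()
graph.add_edge("frontend", "backend")
graph.add_edge("backend", "db")

# Hypothetical per-component predictions, standing in for a trained model's output.
predictions = {"frontend": NORMAL, "backend": "memory_leak", "db": NORMAL}
anomalous = localize(predictions)
```

Here the per-component predictions are supplied by hand; in the method described above they would come from a model that takes the metric KPIs and, for D-Arvalus, the dependency information into account.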

Item Type: Conference Proceedings
Additional Information: This work has been supported through grants by the German Ministry for Education and Research (BMBF) as BIFOLD (funding mark 01IS18025A).
Glasgow Author(s) Enlighten ID: Thamsen, Dr Lauritz
Authors: Scheinert, D., Acker, A., Thamsen, L., Geldenhuys, M. K., and Kao, O.
College/School: College of Science and Engineering > School of Computing Science
Published Online: 07 September 2021
Copyright Holders: Copyright © 2021 IEEE
First Published: First published in 2021 IEEE/ACM International Workshop on Cloud Intelligence (CloudIntelligence): 7-12
Publisher Policy: Reproduced in accordance with the publisher copyright policy
