Savva, F., Anagnostopoulos, C., Triantafillou, P. and Kolomvatsos, K. (2020) Large-scale data exploration using explanatory regression functions. ACM Transactions on Knowledge Discovery from Data, 14(6), 76. (doi: 10.1145/3410448)
220291.pdf - Accepted Version (3MB)
Abstract
Analysts wishing to explore multivariate data spaces typically issue queries involving selection operators, i.e., range or equality predicates, which define data subspaces of potential interest. They then use aggregation functions, the results of which determine a subspace’s interestingness for further exploration and deeper analysis. However, Aggregate Query (AQ) results are scalars and convey limited information about, and little explanation of, the queried subspaces. Analysts have no way of identifying how these results are derived or how they change with respect to query (input) parameter values. We address this shortcoming by contributing a novel explanation mechanism, based on machine learning, that helps analysts explore and understand data subspaces. We explain AQ results using functions that are obtained by solving a three-fold joint optimization problem and that take the form of explainable piecewise-linear regression functions. A key feature of the proposed solution is that the explanation functions are estimated from previously executed queries. These queries provide a coarse-grained overview of the underlying aggregate function (generating the AQ results) to be learned. Explanations for future, previously unseen AQs can be computed without accessing the underlying data and can be used to further explore the queried data subspaces without issuing more queries to the backend analytics engine. We evaluate the explanation accuracy and efficiency through theoretically grounded metrics over real-world and synthetic datasets and query workloads.
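
The idea of learning a piecewise-linear explanation from past query results, and then answering unseen queries without touching the data, can be illustrated with a minimal sketch. This is not the paper's three-fold joint optimization: it swaps in a plain per-interval least-squares fit over fixed breakpoints. The query parameter (a selection radius `theta`), the breakpoints, the synthetic workload, and the names `fit_piecewise_linear` and `explain` are all assumptions made for illustration.

```python
import numpy as np

def fit_piecewise_linear(radii, answers, breakpoints):
    """Fit one least-squares line per interval [b_k, b_{k+1}) of the query parameter.

    Simplified stand-in: the breakpoints are fixed by hand here,
    whereas the paper learns its functions via a joint optimization.
    """
    segments = []
    for lo, hi in zip(breakpoints[:-1], breakpoints[1:]):
        mask = (radii >= lo) & (radii < hi)
        # Design matrix with columns [theta, 1] for slope and intercept.
        A = np.column_stack([radii[mask], np.ones(mask.sum())])
        slope, intercept = np.linalg.lstsq(A, answers[mask], rcond=None)[0]
        segments.append((lo, hi, slope, intercept))
    return segments

def explain(segments, theta):
    """Estimate the aggregate for an unseen parameter value, using only the fitted lines."""
    for lo, hi, slope, intercept in segments:
        if lo <= theta < hi:
            return slope * theta + intercept
    raise ValueError("theta outside the fitted range")

# Hypothetical log of past queries: (radius, aggregate result) pairs.
# The synthetic aggregate has slope 2 below theta = 1 and slope 5 above it.
past_radii = np.linspace(0.0, 1.95, 40)
past_answers = np.where(past_radii < 1.0, 2.0 * past_radii, 5.0 * past_radii - 3.0)

segments = fit_piecewise_linear(past_radii, past_answers, breakpoints=[0.0, 1.0, 2.0])

# Unseen queries are answered from the fitted segments alone.
estimate_small = explain(segments, 0.5)
estimate_large = explain(segments, 1.5)
```

The fitted slopes show how the aggregate responds to the query parameter within each interval, which is the kind of information a single scalar result cannot convey.
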
Item Type: | Articles |
---|---|
Status: | Published |
Refereed: | Yes |
Glasgow Author(s) Enlighten ID: | Kolomvatsos, Dr Kostas and Anagnostopoulos, Dr Christos and Triantafillou, Professor Peter and Savva, Mr Fotis |
Authors: | Savva, F., Anagnostopoulos, C., Triantafillou, P., and Kolomvatsos, K. |
College/School: | College of Science and Engineering > School of Computing Science |
Journal Name: | ACM Transactions on Knowledge Discovery from Data |
Publisher: | Association for Computing Machinery |
ISSN: | 1556-4681 |
ISSN (Online): | 1556-472X |
Copyright Holders: | Copyright © 2020 Association for Computing Machinery |
First Published: | First published in ACM Transactions on Knowledge Discovery from Data 14(6):76 |
Publisher Policy: | Reproduced in accordance with the copyright policy of the publisher |