Statistical structures for internet-scale data management

Ntarmos, N., Triantafillou, P. and Weikum, G. (2009) Statistical structures for internet-scale data management. VLDB Journal, 18(6), pp. 1279-1312. (doi: 10.1007/s00778-009-0140-7)

76009.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial.



Efficient query processing in traditional database management systems relies on statistics on base data. For centralized systems, there is a rich body of research results on such statistics, from simple aggregates to more elaborate synopses such as sketches and histograms. For Internet-scale distributed systems, on the other hand, statistics management still poses major challenges. With the work in this paper we aim to endow peer-to-peer data management over structured overlays with the power associated with such statistical information, with emphasis on meeting the scalability challenge. To this end, we first contribute efficient, accurate, and decentralized algorithms that can compute key aggregates such as Count, CountDistinct, Sum, and Average. We then show how to construct several types of histograms, such as simple Equi-Width, Average-Shifted Equi-Width, and Equi-Depth histograms. We present a full-fledged open-source implementation of these tools for distributed statistical synopses, and report on a comprehensive experimental evaluation of our contributions in terms of efficiency, accuracy, and scalability.
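
For readers unfamiliar with the histogram types named in the abstract, the following is a minimal, centralized sketch of the difference between an Equi-Width and an Equi-Depth histogram. It is not the paper's decentralized algorithm; the function names `equi_width` and `equi_depth` and the sample data are illustrative assumptions only.

```python
from typing import List, Sequence


def equi_width(values: Sequence[float], buckets: int) -> List[int]:
    """Counts per bucket when [min, max] is split into equal-width ranges."""
    lo, hi = min(values), max(values)
    # Fall back to width 1.0 when all values are equal (hi == lo).
    width = (hi - lo) / buckets or 1.0
    counts = [0] * buckets
    for v in values:
        # Clamp so that the maximum value lands in the last bucket.
        idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    return counts


def equi_depth(values: Sequence[float], buckets: int) -> List[float]:
    """Upper bucket boundaries chosen so each bucket holds roughly
    the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    bounds = []
    for i in range(1, buckets + 1):
        # Index of the last element belonging to bucket i.
        idx = max(0, (i * n) // buckets - 1)
        bounds.append(ordered[idx])
    return bounds


# Hypothetical sample data, skewed toward small values.
data = [1, 2, 2, 3, 3, 3, 4, 10]
width_counts = equi_width(data, 3)
depth_bounds = equi_depth(data, 3)
```

On this sample, the equi-width counts are `[6, 1, 1]` (fixed ranges, uneven counts), while the equi-depth boundaries are `[2, 3, 10]` (uneven ranges, roughly even counts). This illustrates why equi-depth histograms tend to summarize skewed data more evenly.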

Item Type: Articles
Glasgow Author(s) Enlighten ID: Triantafillou, Professor Peter and Ntarmos, Dr Nikos
Authors: Ntarmos, N., Triantafillou, P., and Weikum, G.
College/School: College of Science and Engineering > School of Computing Science
Journal Name: VLDB Journal
ISSN (Online): 0949-877X
Published Online: 05 March 2009
Copyright Holders: Copyright © 2009 Springer-Verlag
First Published: First published in VLDB Journal 18(6):1279-1312
Publisher Policy: Reproduced under a Creative Commons License
