One is enough: distributed filtering for duplicate elimination

Koloniari, G., Ntarmos, N., Pitoura, E. and Souravlias, D. (2011) One is enough: distributed filtering for duplicate elimination. In: 20th ACM International Conference on Information and Knowledge Management, Glasgow, UK, 24-28 Oct 2011.

Full text not currently available from Enlighten.

Abstract

The growth of online services has created the need for duplicate elimination in high-volume streams of events. The sheer volume of data in applications such as pay-per-click clickstream processing, RSS feed syndication and notification services in social sites such as Twitter and Facebook makes traditional centralized solutions hard to scale. In this paper, we propose an approach based on distributed filtering. To this end, we introduce a suite of distributed Bloom filters that exploit different ways of partitioning the event space. To address the continuous nature of event delivery, the filters are extended to support sliding window semantics. Moreover, we examine locality-related tradeoffs and propose a tree-based architecture to allow for duplicate elimination across geographic locations. We cast the design space and present experimental results that demonstrate the pros and cons of our various solutions in different settings.
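
The general idea of filtering duplicates with Bloom filters spread over several nodes can be illustrated with a minimal sketch. It is not the paper's design: the class names (`BloomFilter`, `PartitionedDeduplicator`), the routing of each event to one node by a hash of its identifier, and the filter sizes are all assumptions chosen for illustration, and sliding windows and the tree-based architecture are not modelled.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: no false negatives, a small chance of false positives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str) -> list[int]:
        # Double hashing: derive all bit positions from two halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step, never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add_if_new(self, key: str) -> bool:
        """Insert key; return True if at least one of its bits was still unset."""
        is_new = False
        for pos in self._positions(key):
            byte, mask = pos // 8, 1 << (pos % 8)
            if not self.bits[byte] & mask:
                is_new = True
                self.bits[byte] |= mask
        return is_new


class PartitionedDeduplicator:
    """Hypothetical partitioning scheme: each event id is owned by exactly one node."""

    def __init__(self, num_nodes: int, bits_per_node: int = 1 << 16,
                 num_hashes: int = 4) -> None:
        # Each list entry stands in for the filter held by one remote node.
        self.nodes = [BloomFilter(bits_per_node, num_hashes) for _ in range(num_nodes)]

    def node_for(self, event_id: str) -> int:
        # Routing hash is independent of the hashes used inside the filters.
        digest = hashlib.blake2b(event_id.encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big") % len(self.nodes)

    def accept(self, event_id: str) -> bool:
        """Return True if the event should be delivered, False if it looks like a duplicate."""
        return self.nodes[self.node_for(event_id)].add_if_new(event_id)
```

With `dedup = PartitionedDeduplicator(num_nodes=4)`, the first `dedup.accept("click-1")` returns `True` and a repeated call with the same identifier returns `False`. Because a Bloom filter can report false positives, a small fraction of genuinely new events may also be dropped; the rate depends on the assumed filter size and number of hash functions.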

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Ntarmos, Dr Nikos
Authors: Koloniari, G., Ntarmos, N., Pitoura, E., and Souravlias, D.
College/School: College of Science and Engineering > School of Computing Science
