Ntarmos, N., Patlakas, I. and Triantafillou, P. (2014) Rank join queries in NoSQL databases. Proceedings of the VLDB Endowment, 7(7), pp. 493-504.
Text: 89359.pdf - Published Version. Available under License Creative Commons Attribution Non-commercial No Derivatives. 595kB
Publisher's URL: http://www.vldb.org/pvldb/vol7.html
Abstract
Rank (i.e., top-k) join queries play a key role in modern analytics tasks. However, despite their importance and unlike centralized settings, they have been completely overlooked in cloud NoSQL settings. We attempt to fill this gap: We contribute a suite of solutions and study their performance comprehensively. Baseline solutions are offered using SQL-like languages (like Hive and Pig), based on MapReduce jobs. We first provide solutions that are based on specialized indices, which may themselves be accessed using either MapReduce or coordinator-based strategies. The first index-based solution is based on inverted indices, which are accessed with MapReduce jobs. The second index-based solution adapts a popular centralized rank-join algorithm. We further contribute a novel statistical structure comprising histograms and Bloom filters, which forms the basis for the third index-based solution. We provide (i) MapReduce algorithms showing how to build these indices and statistical structures, (ii) algorithms to allow for online updates to these indices, and (iii) query processing algorithms utilizing them. We implemented all algorithms in Hadoop (HDFS) and HBase and tested them on TPC-H datasets of various scales, utilizing different queries on tables of various sizes and different score-attribute distributions. We ported our implementations to Amazon EC2 and "in-house" lab clusters of various scales. We provide performance results for three metrics: query execution time, network bandwidth consumption, and dollar-cost for query execution.
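For readers unfamiliar with the term, the following is a minimal in-memory sketch of what a rank (top-k) join computes. It is not any of the paper's algorithms: it is a naive hash join followed by a top-k selection, and the tuple layout `(join_key, row_id, score)`, the function name `naive_rank_join`, and the use of a sum as the score aggregate are all assumptions made purely for illustration.

```python
import heapq
from collections import defaultdict


def naive_rank_join(left, right, k):
    """Return the k joined pairs with the highest combined score.

    left, right: iterables of (join_key, row_id, score) tuples
    (a hypothetical layout chosen for this sketch).
    The combined score is the sum of the two input scores
    (an assumed aggregate function).
    """
    # Hash the right-hand relation on its join key.
    right_by_key = defaultdict(list)
    for key, row_id, score in right:
        right_by_key[key].append((row_id, score))

    # Lazily enumerate every join result; nlargest keeps only the k best.
    joined = (
        (l_score + r_score, l_id, r_id)
        for key, l_id, l_score in left
        for r_id, r_score in right_by_key.get(key, ())
    )
    return heapq.nlargest(k, joined)


# Toy data: integer scores keep the arithmetic exact.
left = [("a", "l1", 90), ("b", "l2", 80), ("a", "l3", 40)]
right = [("a", "r1", 70), ("b", "r2", 95), ("c", "r3", 100)]
print(naive_rank_join(left, right, 2))
# prints [(175, 'l2', 'r2'), (160, 'l1', 'r1')]
```

This sketch materializes every join result before ranking, which is the kind of cost that index-based approaches such as those summarized in the abstract aim to avoid.
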
Item Type: | Articles |
---|---|
Status: | Published |
Refereed: | Yes |
Glasgow Author(s) Enlighten ID: | Triantafillou, Professor Peter and Ntarmos, Dr Nikos |
Authors: | Ntarmos, N., Patlakas, I., and Triantafillou, P. |
College/School: | College of Science and Engineering > School of Computing Science |
Journal Name: | Proceedings of the VLDB Endowment |
Journal Abbr.: | PVLDB |
Publisher: | VLDB Endowment Inc. |
ISSN: | 1047-7349 |
ISSN (Online): | 1047-7349 |
Copyright Holders: | Copyright © 2014 VLDB Endowment |
First Published: | First published in Proceedings of the VLDB Endowment 7(7):493-504 |
Publisher Policy: | Reproduced under a Creative Commons License |