Implicit Parallelism Through Deep Language Embedding

Alexandrov, A., Kunft, A., Katsifodimos, A., Schüler, F., Thamsen, L., Kao, O., Herb, T. and Markl, V. (2015) Implicit Parallelism Through Deep Language Embedding. In: International Conference on Management of Data (SIGMOD/PODS '15), Melbourne, Australia, 31 May - 04 Jun 2015, pp. 47-61. ISBN 9781450327589 (doi: 10.1145/2723372.2750543)

Abstract

The appeal of MapReduce has spawned a family of systems that implement or extend it. To enable parallel collection processing with User-Defined Functions (UDFs), these systems expose extensions of the MapReduce programming model as library-based dataflow APIs that are tightly coupled to their underlying runtime engine. Expressing data analysis algorithms with complex data and control flow structure using such APIs reveals a number of limitations that impede programmer productivity. In this paper we show that designing data analysis languages and APIs from a runtime engine point of view bloats the APIs with low-level primitives and reduces programmer productivity. Instead, we argue that an approach based on deeply embedding the APIs in a host language can address the shortcomings of current data analysis languages. To demonstrate this, we propose a language for complex data analysis embedded in Scala, which (i) allows for declarative specification of dataflows and (ii) hides the notion of data-parallelism and distributed runtime behind a suitable intermediate representation. We describe a compiler pipeline that facilitates efficient data-parallel processing without imposing runtime engine-bound syntactic or semantic restrictions on the structure of the input programs. We present a series of experiments with two state-of-the-art systems that demonstrate the optimization potential of our approach.

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Thamsen, Dr Lauritz
Authors: Alexandrov, A., Kunft, A., Katsifodimos, A., Schüler, F., Thamsen, L., Kao, O., Herb, T., and Markl, V.
College/School: College of Science and Engineering > School of Computing Science
Publisher: ACM
ISBN: 9781450327589