High-level programming of stencil computations on multi-GPU systems using the SkelCL library

Steuwer, M., Haidl, M., Breuer, S. and Gorlatch, S. (2014) High-level programming of stencil computations on multi-GPU systems using the SkelCL library. Parallel Processing Letters, 24(3), 1441005. (doi:10.1142/S0129626414410059)

Text: 148974.pdf - Accepted Version (1MB)

Abstract

The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually tuned code written with low-level approaches such as OpenCL and CUDA. This makes the development of stencil applications a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach, which combines high-level programming abstractions with competitive performance on multi-GPU systems. SkelCL extends the OpenCL standard with three high-level features: 1) pre-implemented parallel patterns (a.k.a. skeletons); 2) container data types for vectors and matrices; 3) an automatic data (re)distribution mechanism. We introduce two new SkelCL skeletons that specifically target stencil computations, MapOverlap and Stencil. We describe their use in particular application examples, discuss their efficient parallel implementation, and report experimental results on systems with multiple GPUs. Our evaluation of three real-world applications shows that stencil code written with SkelCL is considerably shorter than hand-tuned OpenCL code and performs competitively with it.
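As a rough illustration of the kind of computation the abstract refers to, the following is a minimal sequential sketch of a one-dimensional stencil in plain C++. It is not SkelCL code and does not reproduce the library's API: the names `map_overlap_1d` and `three_point_sum`, the `radius` parameter, and the choice of padding out-of-range neighbours with a `neutral` value are all assumptions made for this example. Each output element is computed from the corresponding input element and its neighbours within `radius` positions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SkelCL): sequential 1D stencil.
// For every index i, gathers the window input[i - radius .. i + radius],
// substitutes `neutral` for positions outside the input, and applies f.
template <typename T, typename F>
std::vector<T> map_overlap_1d(const std::vector<T>& input, std::size_t radius,
                              T neutral, F f) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(input.size());
    const std::ptrdiff_t r = static_cast<std::ptrdiff_t>(radius);
    std::vector<T> output(input.size());
    std::vector<T> window(2 * radius + 1);
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        for (std::ptrdiff_t k = -r; k <= r; ++k) {
            const std::ptrdiff_t j = i + k;  // neighbour index, may be out of range
            window[static_cast<std::size_t>(k + r)] =
                (j >= 0 && j < n) ? input[static_cast<std::size_t>(j)] : neutral;
        }
        output[static_cast<std::size_t>(i)] = f(window);
    }
    return output;
}

// Illustrative use: each element becomes the sum of itself and its two
// direct neighbours, with 0 assumed beyond the borders.
inline std::vector<int> three_point_sum(const std::vector<int>& v) {
    return map_overlap_1d(v, 1, 0, [](const std::vector<int>& w) {
        assert(w.size() == 3);  // window holds left, centre, right
        return w[0] + w[1] + w[2];
    });
}
```

Under these assumptions, `three_point_sum({1, 2, 3, 4})` yields `{3, 6, 9, 7}`. A parallel multi-GPU version would additionally have to exchange the border elements between devices, which is the part the abstract says SkelCL handles automatically.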

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Steuwer, Dr Michel
Authors: Steuwer, M., Haidl, M., Breuer, S., and Gorlatch, S.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
College/School: College of Science and Engineering > School of Computing Science
Journal Name: Parallel Processing Letters
Publisher: World Scientific Publishing
ISSN: 0129-6264
ISSN (Online): 1793-642X
Copyright Holders: Copyright © 2014 World Scientific Publishing Company
First Published: First published in Parallel Processing Letters 24(3): 1441005
Publisher Policy: Reproduced in accordance with the publisher copyright policy
