Achieving High-Performance the Functional Way: A Functional Pearl on Expressing High-Performance Optimizations as Rewrite Strategies

Hagedorn, B., Lenfers, J., Koehler, T., Qin, X., Gorlatch, S. and Steuwer, M. (2020) Achieving High-Performance the Functional Way: A Functional Pearl on Expressing High-Performance Optimizations as Rewrite Strategies. In: 25th ACM SIGPLAN International Conference on Functional Programming (ICFP 2020), Online Only, 24-26 Aug 2020, p. 92. (doi: 10.1145/3408974)

220121.pdf - Published Version
Available under License Creative Commons Attribution.

853kB

Abstract

Optimizing programs to run efficiently on modern parallel hardware is hard but crucial for many applications. The predominantly used imperative languages, such as C or OpenCL, force the programmer to intertwine the code describing functionality and optimizations. This results in a portability nightmare that is particularly problematic given the accelerating trend towards specialized hardware devices to further increase efficiency. Many emerging DSLs used in performance-demanding domains such as deep learning or high-performance image processing attempt to simplify or even fully automate the optimization process. Using a high-level, often functional, language, programmers focus on describing functionality in a declarative way. In some systems such as Halide or TVM, a separate schedule specifies how the program should be optimized. Unfortunately, these schedules are not written in well-defined programming languages. Instead, they are implemented as a set of ad-hoc predefined APIs that the compiler writers have exposed. In this functional pearl, we show how to employ functional programming techniques to solve this challenge with elegance. We present two functional languages that work together, each addressing a separate concern. RISE is a functional language for expressing computations using well-known functional data-parallel patterns. ELEVATE is a functional language for describing optimization strategies. A high-level RISE program is transformed into a low-level form using optimization strategies written in ELEVATE. From the rewritten low-level program, high-performance parallel code is automatically generated. In contrast to existing high-performance domain-specific systems with scheduling APIs, in our approach programmers are not restricted to a set of built-in operations and optimizations but freely define their own computational patterns in RISE and optimization strategies in ELEVATE in a composable and reusable way.
We show how our holistic functional approach achieves competitive performance with the state-of-the-art imperative systems Halide and TVM.

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Koehler, Mr Thomas and Steuwer, Dr Michel and Qin, Ms Xueying
Authors: Hagedorn, B., Lenfers, J., Koehler, T., Qin, X., Gorlatch, S., and Steuwer, M.
College/School: College of Science and Engineering > School of Computing Science
ISSN: 2475-1421
Copyright Holders: Copyright © 2020 The Authors
Publisher Policy: Reproduced under a Creative Commons licence
