Reducing FPGA Memory Footprint of Stencil Codes through Automatic Extraction of Memory Patterns

Szafarczyk, R., Nabi, S. W. and Vanderbauwhede, W. (2023) Reducing FPGA Memory Footprint of Stencil Codes through Automatic Extraction of Memory Patterns. In: 32nd International Conference on Field-Programmable Logic and Applications (FPL 2022), Belfast, United Kingdom, 29 August - 2 September 2022, pp. 148-152. ISBN 9781665473903 (doi: 10.1109/FPL57034.2022.00033)

Text: 278313.pdf (Accepted Version), available under License Creative Commons Attribution.



FPGAs are attractive for scientific high-performance computing due to their potential for high performance-per-Watt. Stencil codes in scientific applications are difficult to optimize on FPGAs because of redundant, non-contiguous memory accesses to relatively low-bandwidth DRAM. In this paper, we present an algorithm to aggressively reduce on-chip block RAM (BRAM) and off-chip DRAM utilisation of stencil codes running on FPGAs. The algorithm extracts memory accesses from computational pipelines and removes all redundant intermediate arrays, including those used for stencil buffering, by trading DRAM accesses for computation. The algorithm is based on rewrite rules applied to a strict functional representation derived from Fortran code, and it generates provably correct, optimized code. Typical FPGA implementations store the stencil window in on-chip shift registers implemented in BRAMs; we use only DRAM and optimize the memory accesses instead. Our approach dramatically reduces BRAM usage so that the domain size is limited only by available DRAM. We report a drop of 78% and 18% in BRAM usage for 3-D and 2-D stencil codes, respectively, compared to a manual implementation using shift registers, while staying competitive in performance or even improving performance-per-Watt.
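The general idea of removing an intermediate array by recomputation can be illustrated with a minimal sketch. This is not the paper's algorithm or its rewrite rules: the function names, the squaring stage, and the 3-point stencil are all hypothetical choices made only for illustration. The first version materialises an intermediate array `v`; the second produces the same result by recomputing the first stage at each stencil point, so `v` is never stored.

```python
def smooth_with_intermediate(u):
    """Two-stage pipeline that stores the intermediate array v (hypothetical example)."""
    v = [x * x for x in u]                  # stage 1: pointwise map
    return [v[i - 1] + v[i] + v[i + 1]      # stage 2: 3-point stencil over v
            for i in range(1, len(v) - 1)]


def smooth_fused(u):
    """Same result without v: stage 1 is recomputed at every stencil point."""
    def f(x):
        return x * x
    return [f(u[i - 1]) + f(u[i]) + f(u[i + 1])
            for i in range(1, len(u) - 1)]


u = [1, 2, 3, 4, 5]
# Both versions agree; the fused one trades extra computation for less storage.
assert smooth_with_intermediate(u) == smooth_fused(u)
```

In this toy case the fused version evaluates `f` three times per output element instead of once, which is the kind of extra computation that is accepted in exchange for dropping the intermediate storage.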

Item Type: Conference Proceedings
Glasgow Author(s) Enlighten ID: Szafarczyk, Mr Robert and Vanderbauwhede, Professor Wim and Nabi, Dr Syed Waqar
Authors: Szafarczyk, R., Nabi, S. W., and Vanderbauwhede, W.
College/School: College of Science and Engineering > School of Computing Science
Copyright Holders: Copyright © 2022 IEEE
First Published: First published in 2022 32nd International Conference on Field-Programmable Logic and Applications (FPL)
Publisher Policy: Reproduced in accordance with the publisher copyright policy

Project Code: 168656
Project Name: Exploiting Parallelism through Type Transformations for Hybrid Manycore Systems
Principal Investigator: Wim Vanderbauwhede
Funder's Name: Engineering and Physical Sciences Research Council (EPSRC)
Funder Ref: EP/L00058X/1
Lead Dept: Computing Science