author: | Thijs Wiefferink |
title: | Parallel Prefix Sum: A Case Study for GPGPU Program Verification |
keywords: | GPU programming, Verification, OpenCL, Parallel algorithms |
topics: | Case studies and Applications |
committee: |
Marieke Huisman
, Saeed Darabi |
started: | February 2014 |
end: | June 2015 |
Abstract
The Prefix Sum is an algorithm used as a building block for various other algorithms, for example radix sort, quicksort and lexically comparing strings. Implementing the Prefix Sum algorithm on the CPU is trivial, but a parallel approach with OpenCL is more complicated. An implementation in OpenCL has been made, and optimized to minimize branch divergence by comparing two different storage solutions. These storage solutions have been benchmarked in order to show the difference in execution time. In addition to performance, verification of the algorithm is important. The two aspects that need to be verified are the absence of data races and functional correctness. This paper describes a specification that covers the absence of data races, by using Permission-Based Separation Logic. The first part of this specification (the up-sweep phase of the algorithm) has been verified using the VerCors tool.