Conference papers

Multi-target C++ implementation of parallel skeletons

Wilfried Kirschenmann 1, 2 Laurent Plagne 1 Stéphane Vialle 2, 3
2 ALGORILLE - Algorithms for the Grid
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: This paper presents the design of an efficient multi-target (CPU+GPU) implementation of the Parallel_for skeleton. Emerging massively parallel architectures promise very high performance at low cost. However, these architectures change faster than ever, so optimizing code for them becomes a very complex and time-consuming task. We have identified data storage as the main difference between the CPU and GPU implementations of a code. We introduce an abstract data layout in order to adapt the data storage to the target. Based on this layout, the Parallel_for skeleton allows the same program to be compiled and executed on both CPU and GPU. Once compiled, the program runs close to the hardware limits.
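
The idea of separating the kernel from the data storage can be illustrated with a minimal C++ sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `AosLayout`, `SoaLayout`, `parallel_for_sketch`, `saxpy` and `run_demo` are hypothetical, the choice of array-of-structures versus structure-of-arrays is only one plausible reading of "data storage", and the loop is a sequential CPU stand-in rather than a real multi-target skeleton.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout 1: array of structures (fields of one element are adjacent).
struct Point { float x, y; };

class AosLayout {
public:
    explicit AosLayout(std::size_t n) : data_(n) {}
    std::size_t size() const { return data_.size(); }
    float& x(std::size_t i) { assert(i < data_.size()); return data_[i].x; }
    float& y(std::size_t i) { assert(i < data_.size()); return data_[i].y; }
private:
    std::vector<Point> data_;
};

// Hypothetical layout 2: structure of arrays (each field is stored contiguously).
class SoaLayout {
public:
    explicit SoaLayout(std::size_t n) : xs_(n), ys_(n) {}
    std::size_t size() const { return xs_.size(); }
    float& x(std::size_t i) { assert(i < xs_.size()); return xs_[i]; }
    float& y(std::size_t i) { assert(i < ys_.size()); return ys_[i]; }
private:
    std::vector<float> xs_, ys_;
};

// Sequential stand-in for a parallel-for skeleton (assumption: a real
// implementation would dispatch the iterations to CPU threads or a GPU).
template <typename Body>
void parallel_for_sketch(std::size_t begin, std::size_t end, Body body) {
    for (std::size_t i = begin; i < end; ++i) {
        body(i);
    }
}

// Layout-agnostic kernel: y[i] = a * x[i] + y[i].
// It only uses the accessor interface, never the underlying storage.
template <typename Layout>
void saxpy(float a, Layout& v) {
    parallel_for_sketch(0, v.size(), [&](std::size_t i) {
        v.y(i) = a * v.x(i) + v.y(i);
    });
}

// Fills a container with x = 1 and y = 2, runs the kernel with a = 3,
// and returns the sum of the resulting y values.
template <typename Layout>
float run_demo(std::size_t n) {
    Layout v(n);
    for (std::size_t i = 0; i < n; ++i) {
        v.x(i) = 1.0f;
        v.y(i) = 2.0f;
    }
    saxpy(3.0f, v);
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum += v.y(i);
    }
    return sum;
}
```

Calling `run_demo<AosLayout>(4)` and `run_demo<SoaLayout>(4)` runs the same kernel source over both storage schemes; only the template argument changes, which is the kind of decoupling the abstract describes.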
Contributor: Sébastien van Luchene
Submitted on: Monday, November 30, 2009 - 6:32:37 PM
Last modification on: Monday, December 6, 2021 - 6:08:02 PM

Wilfried Kirschenmann, Laurent Plagne, Stéphane Vialle. Multi-target C++ implementation of parallel skeletons. 8th workshop on Parallel/High-Performance Object-Oriented Scientific Computing - POOSC'09, Jul 2009, Genova, Italy. pp.1-10, ⟨10.1145/1595655.1595662⟩. ⟨hal-00437542⟩


