Julia for HPC

Many researchers face the same problem when developing numerical applications. We prototype in a high-level language, appealing for its ease of programming, readability, convenient plotting and informative debugger. When it comes to production runs, we tend to translate the prototypes into lower-level compiled languages to gain runtime performance and parallelisation capabilities, at the cost of losing many attractive features of high-level languages.

Our contribution, presented at JuliaCon 2019 (Baltimore, MD, USA), illustrates how Julia solves “the two-language problem”. We replace our MATLAB prototype and the CUDA C + MPI production code with a single Julia code that serves both prototyping and production tasks. We showcase the port to Julia of a massively parallel multi-GPU hydro-mechanical stencil-based solver in 3-D. The iterative solver can be applied to a wide range of coupled differential equations.
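To make "stencil-based iterative solver" concrete, here is a minimal sketch in plain Julia. It is not the hydro-mechanical solver from the talk: it assumes a much simpler problem (explicit 3-D diffusion on a single CPU, with fixed boundary values), and every name in it (diffusion_step!, run_diffusion, T, T2, D) is hypothetical.

```julia
# Illustrative only: explicit 3-D diffusion, NOT the hydro-mechanical solver.
# All function and variable names are hypothetical.

function diffusion_step!(T2, T, D, dt, dx, dy, dz)
    nx, ny, nz = size(T)
    # 7-point stencil on interior points; boundary values are left untouched.
    # The innermost loop runs over i because Julia arrays are column-major.
    @inbounds for k in 2:nz-1, j in 2:ny-1, i in 2:nx-1
        T2[i, j, k] = T[i, j, k] + dt * D * (
            (T[i+1, j, k] - 2 * T[i, j, k] + T[i-1, j, k]) / dx^2 +
            (T[i, j+1, k] - 2 * T[i, j, k] + T[i, j-1, k]) / dy^2 +
            (T[i, j, k+1] - 2 * T[i, j, k] + T[i, j, k-1]) / dz^2)
    end
    return T2
end

function run_diffusion(; n = 16, nt = 50, D = 1.0)
    dx = dy = dz = 1.0 / (n - 1)
    dt = dx^2 / D / 8.1            # below the explicit stability limit dx^2 / (6D)
    T = zeros(n, n, n)
    T[n ÷ 2, n ÷ 2, n ÷ 2] = 1.0   # initial point perturbation
    T2 = copy(T)
    for _ in 1:nt
        diffusion_step!(T2, T, D, dt, dx, dy, dz)
        T, T2 = T2, T              # swap buffers for the next iteration
    end
    return T
end
```

Calling run_diffusion(n = 32, nt = 100) returns the field after 100 iterations. In a multi-GPU setting, the same update would run on each device's local block, with boundary layers exchanged between neighbouring processes after every iteration.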

Figure 1. Weak scaling of the parallel GPU MPI hydro-mechanical solver. We report the parallel efficiency for the Julia and the CUDA C implementations on 1024 and 5200 GPUs (full machine) respectively, on the Piz Daint hybrid Cray XC50 at the Swiss National Supercomputing Centre, CSCS.

We report close-to-optimal weak scaling on 1024 NVIDIA Tesla P100 GPUs on the hybrid Cray XC50 “Piz Daint” supercomputer at the Swiss National Supercomputing Centre, CSCS (Figure 1). We compare these results, obtained with our Julia prototype, to a reference scaling obtained with the multi-GPU production solver written in CUDA C + MPI, which achieved high performance and nearly ideal parallel efficiency on up to 5120 NVIDIA Tesla P100 GPUs on “Piz Daint”. Soon in press.
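For readers unfamiliar with the metric, the sketch below shows how weak-scaling parallel efficiency is commonly computed, assuming the usual definition: the problem size per GPU is held fixed, so the ideal time per iteration stays constant as GPUs are added, and efficiency is the reference time divided by the time on N GPUs. The function name and the timings are placeholders for illustration, not measurements from Piz Daint.

```julia
# Weak scaling: fixed problem size per GPU, so the ideal time per iteration
# is constant. Efficiency is reference time divided by time on N GPUs.
weak_scaling_efficiency(t_ref, t_n) = t_ref / t_n

# Placeholder timings in seconds per iteration, keyed by GPU count.
# These are made-up values, NOT results from the runs described above.
t_per_iter = Dict(1 => 1.00, 8 => 1.02, 64 => 1.04)

# Efficiency relative to the single-GPU run.
eff = Dict(n => weak_scaling_efficiency(t_per_iter[1], t) for (n, t) in t_per_iter)
```

An efficiency of 1.0 means the time per iteration did not grow at all with the number of GPUs; values slightly below 1.0 reflect communication and synchronisation overhead.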