M2P 2023

An Adaptive Pencil Decomposition Library for Modern GPU Systems

  • Fatica, Massimiliano (NVIDIA)
  • Romero, Josh (NVIDIA)
  • Costa, Pedro (TU Delft)

We have developed cuDecomp, a new transpose library with both C and Fortran interfaces that automatically determines the optimal domain decomposition and communication backend. In addition to transpose operations, the library supports simple halo exchanges. We assessed its performance in two Navier-Stokes solvers that require fast Fourier transforms (FFTs): a tri-periodic pseudo-spectral solver for isotropic turbulence, and CaNS [3], a finite-difference solver for canonical turbulent flows whose Poisson solver uses FFTs together with a tridiagonal solver.
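
To make the idea of a pencil decomposition concrete, the following is a minimal Python sketch of the search space such a library has to consider. It is not the cuDecomp API: the function names (process_grids, split, x_pencil_shape) are hypothetical and chosen for illustration only, and the abstract does not describe how the library actually ranks candidates. The sketch assumes a 3D global grid distributed over a 2D process grid, with x-aligned pencils that keep the full x extent local and split y and z across process rows and columns.

```python
def process_grids(nranks):
    """Return all (p_rows, p_cols) pairs with p_rows * p_cols == nranks."""
    return [(r, nranks // r) for r in range(1, nranks + 1) if nranks % r == 0]


def split(n, parts, idx):
    """Size and offset of block idx when n points are split into parts blocks.

    Any remainder is spread over the lowest-indexed blocks, one point each.
    """
    base, rem = divmod(n, parts)
    size = base + (1 if idx < rem else 0)
    offset = idx * base + min(idx, rem)
    return size, offset


def x_pencil_shape(global_shape, grid, coords):
    """Local shape of an x-aligned pencil for the rank at coords in grid.

    The x extent stays whole; y is split over process rows and z over
    process columns (an assumed convention for this illustration).
    """
    nx, ny, nz = global_shape
    p_rows, p_cols = grid
    i, j = coords
    local_ny, _ = split(ny, p_rows, i)
    local_nz, _ = split(nz, p_cols, j)
    return (nx, local_ny, local_nz)


if __name__ == "__main__":
    # List every candidate process grid for 8 ranks and the pencil
    # owned by rank (0, 0) on a 256^3 global grid.
    for grid in process_grids(8):
        print(grid, x_pencil_shape((256, 256, 256), grid, (0, 0)))
```

Each candidate process grid yields different pencil shapes and therefore different communication patterns in the transposes, which is why choosing among them automatically can matter for performance.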