An Adaptive Pencil Decomposition Library for Modern GPU Systems
We have developed a new transpose library, cuDecomp, with both C and Fortran interfaces, that automatically determines the optimal domain decomposition and communication backend. In addition to transpose operations, the library supports simple halo exchanges. We have assessed its performance in two Navier-Stokes solvers that rely on fast Fourier transforms (FFTs): a tri-periodic pseudo-spectral solver for isotropic turbulence, and CaNS [3], a finite-difference solver for canonical turbulent flows whose Poisson solver uses FFTs together with a tridiagonal solver.
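
To give a sense of what selecting a pencil decomposition involves, the sketch below lists the candidate 2D process grids for a given rank count and computes the local shape of an x-aligned pencil. It is illustrative only and does not use the cuDecomp API; the function names (`candidate_grids`, `split`, `x_pencil_shape`) are hypothetical, and the even block distribution is an assumption.

```python
def candidate_grids(nranks):
    """Return every (p_rows, p_cols) pair with p_rows * p_cols == nranks."""
    return [(p, nranks // p) for p in range(1, nranks + 1) if nranks % p == 0]


def split(n, parts, idx):
    """Size of block `idx` when n points are divided as evenly as possible.

    Assumption: the first (n % parts) blocks receive one extra point.
    """
    base, rem = divmod(n, parts)
    return base + (1 if idx < rem else 0)


def x_pencil_shape(global_shape, grid, coords):
    """Local shape of an x-aligned pencil: x is kept whole, y and z are split."""
    nx, ny, nz = global_shape
    p_rows, p_cols = grid
    row, col = coords
    return (nx, split(ny, p_rows, row), split(nz, p_cols, col))


if __name__ == "__main__":
    # Print the pencil held by rank (0, 0) for each candidate grid on 8 ranks.
    for grid in candidate_grids(8):
        print(grid, x_pencil_shape((256, 256, 256), grid, (0, 0)))
```

Each candidate grid produces pencils with a different aspect ratio and therefore a different communication pattern during transposes, which is why the choice of grid can affect performance.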