SC is the International Conference for
 High Performance Computing, Networking, Storage and Analysis
Award Finalist/Winner

SC Conference - Activity Details



42 TFlops Hierarchical N-body Simulations on GPUs with Applications in both Astrophysics and Turbulence

Authors:
Tsuyoshi Hamada  (Nagasaki University)
Rio Yokota  (University of Bristol)
Keigo Nitadori  (RIKEN)
Tetsu Narumi  (University of Electro-Communications)
Kenji Yasuoka  (Keio University)
Makoto Taiji  (RIKEN)
Kiyoshi Oguri  (Nagasaki University)
ACM Gordon Bell Finalists Session
Wednesday,  02:30PM - 03:00PM
Room D135-136
Abstract:
We have performed a hierarchical N-body simulation on a cluster of 256 GPUs. Unlike many previous N-body simulations on GPUs that scale as O(N^2), the present method calculates the O(N log N) treecode and O(N) fast multipole method (FMM) on the GPUs with unprecedented efficiency. We demonstrate the performance of our method by choosing one standard application (a gravitational N-body simulation) and one non-standard application (the simulation of turbulence using vortex particles). The gravitational simulation using the treecode with 1.6 billion particles showed a sustained performance of 42 TFlops (28 TFlops when corrected). The vortex particle simulation of homogeneous isotropic turbulence using the FMM with 17 million particles showed a sustained performance of 20.2 TFlops. The overall cost of the hardware was 228,912 dollars, which results in a cost performance of 28,000,000/228,912 = 124 MFlops/$. The good scalability up to 256 GPUs demonstrates the good compatibility between our computational algorithm and the GPU architecture.
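The cost-performance figure quoted in the abstract is a unit conversion followed by a division, and can be sketched as below. The function name `mflops_per_dollar` is hypothetical and used only for illustration. With the rounded 28 TFlops value, the division gives roughly 122 MFlops/$, slightly below the quoted 124. The quoted value was presumably computed from an unrounded performance figure, but that is an assumption, since the abstract does not say.

```python
def mflops_per_dollar(sustained_tflops: float, hardware_cost_usd: float) -> float:
    """Cost performance in MFlops per dollar (hypothetical helper).

    1 TFlops = 1,000,000 MFlops, so the sustained rate is converted to
    MFlops and then divided by the hardware cost in dollars.
    """
    return sustained_tflops * 1_000_000 / hardware_cost_usd


HARDWARE_COST_USD = 228_912  # overall hardware cost stated in the abstract

# Corrected and uncorrected sustained rates of the gravitational treecode run,
# using the rounded values given in the abstract.
corrected = mflops_per_dollar(28.0, HARDWARE_COST_USD)
uncorrected = mflops_per_dollar(42.0, HARDWARE_COST_USD)

print(f"corrected:   {corrected:.1f} MFlops/$")
print(f"uncorrected: {uncorrected:.1f} MFlops/$")
```

The abstract bases its cost performance on the corrected 28 TFlops rather than the 42 TFlops headline number. The sketch follows that choice and shows the uncorrected value only for comparison.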
The full paper can be found in the ACM Digital Library and IEEE Computer Society
Sponsors: ACM, IEEE