Award Finalist/Winner
SC Conference - Activity Details
42 TFlops Hierarchical N-body Simulations on GPUs with Applications in both Astrophysics and Turbulence
Authors:
Tsuyoshi Hamada (Nagasaki University)
Rio Yokota (University of Bristol)
Keigo Nitadori (RIKEN)
Tetsu Narumi (University of Electro-Communications)
Kenji Yasuoka (Keio University)
Makoto Taiji (RIKEN)
Kiyoshi Oguri (Nagasaki University)
ACM Gordon Bell Finalists Session
Wednesday, 02:30PM - 03:00PM
Room D135-136
Abstract:
We have performed a hierarchical N-body simulation on a cluster of 256 GPUs. Unlike many previous N-body simulations on GPUs, which scale as O(N^2), the present method runs the O(N log N) treecode and the O(N) fast multipole method (FMM) on the GPUs with unprecedented efficiency. We demonstrate the performance of our method with one standard application, a gravitational N-body simulation, and one non-standard application, the simulation of turbulence using vortex particles. The gravitational simulation using the treecode with 1.6 billion particles showed a sustained performance of 42 TFlops (28 TFlops when corrected). The vortex particle simulation of homogeneous isotropic turbulence using the FMM with 17 million particles showed a sustained performance of 20.2 TFlops. The overall cost of the hardware was 228,912 dollars, which results in a cost performance of roughly 124 MFlops/$ (28,000,000 MFlops / 228,912 dollars, based on the corrected figure). The scalability up to 256 GPUs demonstrates that our computational algorithm is well matched to the GPU architecture.
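The cost-performance figure in the abstract can be checked with a short calculation. The sketch below is illustrative only: the function and constant names are not from the paper, and the inputs are the rounded values quoted in the abstract. With the rounded 28 TFlops the result comes out at about 122 MFlops/$, slightly below the quoted 124 MFlops/$; a plausible explanation (an assumption, not stated in the abstract) is that the authors used an unrounded corrected performance.

```python
# Worked check of the cost-performance arithmetic in the abstract.
# All names here are illustrative; inputs are the abstract's rounded values.

HARDWARE_COST_USD = 228_912      # total hardware cost quoted in the abstract
MFLOPS_PER_TFLOPS = 1_000_000    # 1 TFlops = 10^6 MFlops


def cost_performance(tflops: float, cost_usd: float = HARDWARE_COST_USD) -> float:
    """Return sustained performance per dollar, in MFlops/$."""
    return tflops * MFLOPS_PER_TFLOPS / cost_usd


def tflops_for(mflops_per_usd: float, cost_usd: float = HARDWARE_COST_USD) -> float:
    """Inverse: the sustained TFlops needed to reach a given MFlops/$."""
    return mflops_per_usd * cost_usd / MFLOPS_PER_TFLOPS


corrected = cost_performance(28.0)   # treecode, corrected operation count
raw = cost_performance(42.0)         # treecode, uncorrected
fmm = cost_performance(20.2)         # FMM vortex particle simulation
implied = tflops_for(124.0)          # TFlops implied by the quoted 124 MFlops/$

print(f"treecode (corrected):   {corrected:.1f} MFlops/$")
print(f"treecode (uncorrected): {raw:.1f} MFlops/$")
print(f"FMM:                    {fmm:.1f} MFlops/$")
print(f"124 MFlops/$ implies    {implied:.2f} TFlops")
```

The inverse function shows that 124 MFlops/$ corresponds to roughly 28.4 TFlops at this hardware cost, which is consistent with "28 TFlops" being a rounded value.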