In this extended abstract, we have presented the performance of astrophysical N-body simulations with individual timesteps and direct force calculation on the GRAPE-6 special-purpose computer. The achieved speed is 992 Gflops for a long calculation with 768k particles, and 1.11 Tflops for a short test calculation with 1 million particles.
The number of particles used is fairly large, so fast algorithms such as the Barnes-Hut tree algorithm might in principle outperform the direct summation algorithm we adopted here. However, as discussed in the introduction, there is no massively parallel implementation that combines the tree algorithm with individual timesteps and achieves accuracy high enough for this kind of problem. We therefore believe that the speed we achieved is the fastest attainable with currently available combinations of hardware and software.
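To make the cost argument concrete, the following is a minimal sketch of direct summation, not the GRAPE-6 implementation. The function names, the softening parameter `eps`, and the choice of units with `G = 1` are assumptions made for illustration. The sketch evaluates the force on every particle at once and omits the individual timestep scheme, under which only a subset of particles is updated at each step.

```python
import math


def direct_accelerations(pos, mass, eps=0.0, G=1.0):
    """Gravitational acceleration on every particle by direct summation.

    pos  : list of [x, y, z] positions
    mass : list of particle masses
    eps  : softening length (hypothetical parameter, 0 means unsoftened)
    G    : gravitational constant (assumed 1 in simulation units)
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a particle exerts no force on itself
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * dx[k] * inv_r3
    return acc


def pairwise_interactions(n):
    """Number of pairwise force evaluations when all n particles are updated."""
    return n * (n - 1)
```

The double loop shows why the cost grows as the square of the particle number: every particle accumulates a contribution from every other particle. A tree algorithm replaces the inner loop over distant particles with a smaller number of approximate group contributions, which is where its potential speed advantage, and its additional force error, come from.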
We thank Piet Hut for reading the manuscript. This work is supported by the Research for the Future Program of the Japan Society for the Promotion of Science (JSPS-RFTF97P01102).