Currently, a number of people are using GRAPE systems for SPH simulations (see, e.g., [Ste96]). In these simulations, GRAPE is used to calculate gravity and to construct the list of neighbor particles. The host handles the SPH interaction between neighbors. The calculation of the SPH interaction consumes a fairly large fraction of the total CPU time.
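To make this division of labor concrete, the following is a minimal sketch of the kind of host-side work involved, assuming the neighbor lists have already been returned by GRAPE. It uses the standard cubic spline kernel and a single fixed smoothing length; the function names, the data layout, and the restriction to the density sum are illustrative assumptions and are not taken from any of the codes cited here.

```python
import math


def cubic_spline_kernel(r, h):
    """Standard 3D cubic spline SPH kernel with support radius 2h."""
    q = r / h
    sigma = 1.0 / (math.pi * h**3)  # 3D normalization constant
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q**2 + 0.75 * q**3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0


def sph_density(positions, masses, neighbor_lists, h):
    """Host-side density sum over precomputed neighbor lists.

    neighbor_lists[i] is a list of particle indices near particle i
    (hypothetical layout; here it stands in for the list GRAPE returns).
    """
    densities = []
    for i, neighbors in enumerate(neighbor_lists):
        rho = 0.0
        for j in neighbors:
            r = math.dist(positions[i], positions[j])
            rho += masses[j] * cubic_spline_kernel(r, h)
        densities.append(rho)
    return densities
```

The loop touches only a few tens of neighbors per particle, but it runs on the host for every particle at every step, which is why it can account for a sizable share of the CPU time once gravity has been offloaded.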
If the SPH interaction can also be handled by specialized hardware like GRAPE, we can achieve a further speedup. The achievable speedup is not very large, typically a factor of 10 or less, because the calculation cost of the SPH interaction is not as large as that of gravity. On the other hand, this fact implies that we do not need very fast hardware.
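The bound on the speedup can be illustrated with Amdahl's law. The sketch below is only an illustration: the function name and the fraction 0.9 used in the usage note are hypothetical values chosen to match a limiting speedup of 10, not measurements from any simulation.

```python
def overall_speedup(sph_fraction, sph_speedup):
    """Amdahl's law: total speedup when only the SPH part is accelerated.

    sph_fraction: fraction of the current run time spent on SPH
                  interactions (hypothetical input, between 0 and 1).
    sph_speedup:  factor by which the accelerator speeds up that part.
    """
    remaining = (1.0 - sph_fraction) + sph_fraction / sph_speedup
    return 1.0 / remaining
```

For example, if the SPH interaction took 90 percent of the time, `overall_speedup(0.9, 100.0)` would give roughly 9.2, and no accelerator, however fast, could push the total beyond 10. An accelerator only 10 times faster than the host would already give `overall_speedup(0.9, 10.0)`, roughly 5.3, which is why very fast hardware is not needed.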
In the GRAPE-6 project, we will try a relatively new approach, so-called ``reconfigurable computing'' [BA96], to accelerate SPH and similar applications. An alternative is to develop hardware specialized for SPH [YOT96]. However, it is not clear whether the high initial cost of a custom LSI can be justified by a relatively modest speedup. The basic idea of ``reconfigurable computing'' is to use a programmable LSI chip (a field-programmable gate array, or FPGA) to implement (part of) an application. Currently, FPGAs with a nominal gate count of 100,000 are available. This number is about a factor of 100 smaller than that for a full custom LSI, but may be sufficient to implement a single pipeline for SPH calculation.
Readers interested in FPGAs and reconfigurable computing are referred to [BA96]. The bottom line is that we may be able to use FPGAs as pipeline processors that are more flexible than hardwired GRAPE pipelines and that at the same time achieve better price performance than programmable general-purpose computers. Of course, reconfigurable computing is neither as flexible as a general-purpose computer nor as efficient as GRAPE, so it cannot directly compete with either of them. However, for the part of the computation that is relatively time consuming, but much less so than the gravitational force calculation, reconfigurable computing would offer an ideal solution.
Thus, GRAPE-6 might become a heterogeneous computer with three, not two, components (figure 4). We may be able to use the reconfigurable part for various applications, such as the calculation of van der Waals force in molecular dynamics and evaluation and shifting of spherical harmonics in the fast multipole method.
Figure 4: Extended GRAPE architecture