A Detailed Performance Analysis of the Interpolation Supplemented Lattice Boltzmann Method on the Cray T3E and Cray X1
ETH Zürich, Institut für Energietechnik, ML L 13, Sonneggstrasse 3, CH-8092 Zürich, Switzerland; Chikatamarla{at}lav.mavt.ethz.ch
Materials Process Design and Control Laboratory, Sibley School of Mechanical and Aerospace Engineering, Cornell University, Ithaca, NY 14853; BG74{at}Cornell.edu
Dept. of Mechanical Engineering, Indian Institute of Technology Madras, Chennai, Madras 600-036, India; vbabu{at}Iitm.ac.in
Cray, Inc., 411 First Avenue S, Suite 600, Seattle, WA 98104-2860; stren{at}cray.com
A detailed study of the parallel performance of the interpolation supplemented lattice Boltzmann (ISLB) method using SHMEM and MPI on the Cray T3E-900 and Cray X1 architectures is presented. A noteworthy feature of the present implementation of the ISLB method is that it achieves a sustained speed of 4.2 Tflop/s while using 504 processors on a Cray X1. The code is shown to achieve super-linear speedups on the Cray T3E-900. Detailed profiling shows that both the computation and the communication scale well on the Cray X1, although the overall speedup is adversely affected by the cost of barrier synchronization.
Key Words: shared memory multiprocessors; parallel computing; SHMEM; MPI
International Journal of High Performance Computing Applications, Vol. 20, No. 4, 557-570 (2006)