Advanced Search

Journal Navigation

Journal Home

Subscriptions

Archive

Contact Us

Table of Contents

Click here to sign up for SAGE Journal Email Alerts today!

Sign In to gain access to subscriptions and/or personal tools.
International Journal of High Performance Computing Applications
This Article
Right arrow Abstract Freely available
Right arrow Free Full Text (Free PDF) Free
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Alert me to new issues of the journal
Right arrow Add to Saved Citations
Right arrow Download to citation manager
Right arrowRequest Permissions
Right arrow Request Reprints
Right arrow Add to My Marked Citations
Citing Articles
Right arrow Citing Articles via Google Scholar
Right arrow Citing Articles via Scopus
Google Scholar
Right arrow Articles by Sellappa, S.
Right arrow Articles by Chatterjee, S.
Right arrow Search for Related Content
Social Bookmarking
 Add to CiteULike   Add to Complore   Add to Connotea   Add to Del.icio.us   Add to Digg   Add to Reddit   Add to Technorati   Add to Twitter  
What's this?

Cache-Efficient Multigrid Algorithms

Sriram Sellappa

ANDIAMO SYSTEMS INC. SAN JOSE, CA 95134, USA

Siddhartha Chatterjee

IBM T. J. WATSON RESEARCH CENTER YORKTOWN HEIGHTS, NY 10598, USA

Multigrid is widely used as an efficient solver for sparse linear systems arising from the discretization of elliptic boundary value problems. Linear relaxation methods such as Gauss–Seidel and Red–Black Gauss–Seidel form the principal computational component of multigrid, and thus affect its efficiency. In the context of multigrid, these iterative solvers are executed for a small number of iterations (2–8). We exploit this property of the algorithm to develop a cache-efficient multigrid method, by focusing on improving the memory behavior of the linear relaxation methods. The efficiency in our cache-efficient linear relaxation algorithm comes from two sources: reducing the number of data cache and TLB misses, and reducing the number of memory references by keeping values registerresident. Our optimizations are applicable to multigrid applied to linear systems arising from constant coefficient elliptic PDEs on structured grids. Experiments on five modern computing platforms show a performance improvement of 1.15–2.7 times over a standard implementation of Full Multigrid V-Cycle.

Key Words: cache • TLB • locality • hierarchical computation

References

  • Bassetti, F., Davis, K., and Quinlan, D. December 1998. Optimizing transformations of stencil operations for parallel object-oriented scientific frameworks on cache-based architectures . In Proceedings of ISCOPE’98, Santa Fe, NM.
  • Bassetti, F., Davis, K., and Marathe, M. December 1999. Improving cache utilization of linear relaxation methods: theory and practice . In Proceedings of ISCOPE’99, San Francisco, CA.
  • Briggs, W. L. 1987. A Multigrid Tutorial, SIAM, Philadelphia, PA .
  • Bromley, M., Heller, S., McNerney, T., and Steele, Jr, G. L. June 1991. Fortran at ten gigaflops: the Connection Machine convolution compiler . In Proceedings of the ACM SIGPLAN’91 Conference on Programming Language Design and Implementation, Toronto, Canada, pp. 145–156 .
  • Chatterjee, S., Jain, V. V., Lebeck, A. R., Mundhra, S., and Thottethodi, M. June 1999. Nonlinear array layouts for hierarchical memory systems . In Proceedings of 1999 ACM International Conference on Supercomputing, Rhodes, Greece, pp. 444–453 .
  • Douglas, C., Hu, J., Kowarschik, M., Rüde, U., and Weiss, C. 2000. Cache optimization for structured and unstructured grid multigrid . Electronic Transactions on Numerical Analysis 10: 21–40 . University of Kentucky, Louisville, KY , ISSN 1068–9613.
  • Frumkin, M. A. and Van Der Wijngaart, R. F. March 2001. Interference lattice-based loop nest tilings for stencil computations . In Proceedings of the 10th SIAM Conference on Parallel Processing for Scientific Computing (CD-ROM), Portsmouth, VA, SIAM, Philadelphia, PA.
  • Hill, M. D. and Smith, A. J. 1989. Evaluating associativity in CPU caches . IEEE Transactions on Computing 38(12): 1612–1620 .[CrossRef]
  • Lam, M. S., Rothberg, E. E., and Wolf, M. E. April 1991. The cache performance and optimizations of blocked algorithms . In Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 63–74 .
  • Lebeck, A. R. and Wood, D. A. 1994. Cache profiling and the SPEC benchmarks: a case study . IEEE Computer 27(10): 15–26 .
  • Leiserson, C. E., Rao, S., and Toledo, S. 1997. Efficient outofcore algorithms for linear relaxation using blocking covers . Journal of Computer and System Science 54(2): 332–344 .[CrossRef]
  • Mitchell, N., Carter, L., Ferrante, J., and Högstedt, K. 1998. Quantifying the multi-level nature of tiling interactions. In Languages and Compilers for Parallel Computing: 10th Annual Workshop, LCPC’97, Lecture Notes in Computer Science Vol. 1366, Springer-Verlag, Berlin , pp. 1–15.
  • Povitsky, A. October 1999. Wavefront cache-friendly algorithm for compact numerical schemes, Technical Report 99-40, ICASE, Hampton, VA.
  • Stals, L. and Rüde, U. October 1997. Techniques for improving the data locality of iterative methods, Technical Report MRR 038-97, Institut für Mathematik, Universität Augsburg, Augsburg, Germany.
  • Wolf, M. E. and Lam, M. S. June 1991. A data locality optimizing algorithm . In Proceedings of the ACM SIGPLAN’91 Conference on Programming Language Design and Implementation, Toronto, Canada, pp. 30–44 .
  • Wolfe, M. November 1989. More iteration space tiling . In Proceedings of Supercomputing'89, Reno, NV, pp. 655–664 .
  • Zagha, M., Larson, B., Turner, S., and Itzkowitz, M. November 1996. Performance analysis using the MIPS R10000 performance counters . In Proceedings of SC96 (CD-ROM), Pittsburgh, PA. Available from http://www.supercomp.org/sc96.

International Journal of High Performance Computing Applications, Vol. 18, No. 1, 115-133 (2004)
DOI: 10.1177/1094342004041295


Add to CiteULike CiteULike   Add to Complore Complore   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us   Add to Digg Digg   Add to Reddit Reddit   Add to Technorati Technorati   Add to Twitter Twitter    What's this?



This Article
Right arrow Abstract Freely available
Right arrow Free Full Text (Free PDF) Free
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Alert me to new issues of the journal
Right arrow Add to Saved Citations
Right arrow Download to citation manager
Right arrowRequest Permissions
Right arrow Request Reprints
Right arrow Add to My Marked Citations
Citing Articles
Right arrow Citing Articles via Google Scholar
Right arrow Citing Articles via Scopus
Google Scholar
Right arrow Articles by Sellappa, S.
Right arrow Articles by Chatterjee, S.
Right arrow Search for Related Content
Social Bookmarking
 Add to CiteULike   Add to Complore   Add to Connotea   Add to Del.icio.us   Add to Digg   Add to Reddit   Add to Technorati   Add to Twitter  
What's this?