Advanced Search

Journal Navigation

Journal Home

Subscriptions

Archive

Contact Us

Table of Contents

Click here to sign up for SAGE Journal Email Alerts today!

Sign In to gain access to subscriptions and/or personal tools.
International Journal of High Performance Computing Applications
This Article
Right arrow Free Full Text (Free PDF) Free
Right arrow References
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Alert me to new issues of the journal
Right arrow Add to Saved Citations
Right arrow Download to citation manager
Right arrowRequest Permissions
Right arrow Request Reprints
Right arrow Add to My Marked Citations
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Right arrow Citing Articles via Scopus
Google Scholar
Right arrow Articles by Im, E.-J.
Right arrow Articles by Vuduc, R.
Right arrow Search for Related Content
Social Bookmarking
 Add to CiteULike   Add to Complore   Add to Connotea   Add to Del.icio.us   Add to Digg   Add to Reddit   Add to Technorati   Add to Twitter  
What's this?

Sparsity: Optimization Framework for Sparse Matrix Kernels

Eun-Jin Im

SCHOOL OF COMPUTER SCIENCE KOOKMIN UNIVERSITY, SEOUL, KOREA

Katherine Yelick

Richard Vuduc

COMPUTER SCIENCE DIVISION UNIVERSITY OF CALIFORNIA, BERKELEY, CA, USA

Sparse matrix–vector multiplication is an important computational kernel that performs poorly on most modern processors due to a low compute-to-memory ratio and irregular memory access patterns. Optimization is difficult because of the complexity of cache-based memory systems and because performance is highly dependent on the non-zero structure of the matrix. The SPARSITY system is designed to address these problems by allowing users to automatically build sparse matrix kernels that are tuned to their matrices and machines. SPARSITY combines traditional techniques such as loop transformations with data structure transformations and optimization heuristics that are specific to sparse matrices. It provides a novel framework for selecting optimization parameters, such as block size, using a combination of performance models and search.

In this paper we discuss the optimization of two operations: a sparse matrix times a dense vector and a sparse matrix times a set of dense vectors. Our experience indicates that register level optimizations are effective for matrices arising in certain scientific simulations, in particular finite-element problems. Cache level optimizations are important when the vector used in multiplication is larger than the cache size, especially for matrices in which the non-zero structure is random. For applications involving multiple vectors, reorganizing the computation to perform the entire set of multiplications as a single operation produces significant speedups. We describe the different optimizations and parameter selection techniques and evaluate them on several machines using over 40 matrices taken from a broad set of application domains. Our results demonstrate speedups of up to 4x for the single vector case and up to 10x for the multiple vector case.

Key Words: sparse matrix • performance tuning • memory hierarchy

International Journal of High Performance Computing Applications, Vol. 18, No. 1, 135-158 (2004)
DOI: 10.1177/1094342004041296


Add to CiteULike CiteULike   Add to Complore Complore   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us   Add to Digg Digg   Add to Reddit Reddit   Add to Technorati Technorati   Add to Twitter Twitter    What's this?


This article has been cited by other articles:


Home page
International Journal of High Performance Computing ApplicationsHome page
G. Tan, L. Xu, Z. Dai, S. Feng, and N. Sun
Regular Paper: A Study of Architectural Optimization Methods in Bioinformatics Applications
International Journal of High Performance Computing Applications, August 1, 2007; 21(3): 371 - 384.
[Abstract] [PDF]