Copyright Notice:

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Publications of SPCL

L. Gianinazzi, A. Nikolaos Ziogas, P. Luczynski, L. Huang, S. Ashkboos, F. Scheidl, A. Carigiet, C. Ge, N. Abubaker, M. Besta, T. Ben-Nun, T. Hoefler:

 Arrow Matrix Decomposition: A Novel Approach for Communication-Efficient Sparse Matrix Multiplication

(In Proceedings of the 29th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'24), presented in Edinburgh, United Kingdom, pages 404-416, Association for Computing Machinery, Mar. 2024)

Publisher Reference

Abstract

We propose a novel approach to iterated sparse matrix dense matrix multiplication, a fundamental computational kernel in scientific computing and graph neural network training. In cases where matrix sizes exceed the memory of a single compute node, data transfer becomes a bottleneck. An approach based on dense matrix multiplication algorithms leads to suboptimal scalability and fails to exploit the sparsity in the problem. To address these challenges, we propose decomposing the sparse matrix into a small number of highly structured matrices called arrow matrices, which are connected by permutations. Our approach enables communication-avoiding multiplications, achieving a polynomial reduction in communication volume per iteration for matrices corresponding to planar graphs and other minor-excluded families of graphs. Our evaluation demonstrates that our approach outperforms a state-of-the-art method for sparse matrix multiplication on matrices with hundreds of millions of rows, offering near-linear strong and weak scaling.

Documents

download article:
access preprint on arxiv:
 

BibTeX

@inproceedings{gianinazzi2024arrow,
  author={Lukas Gianinazzi and Alexandros Nikolaos Ziogas and Piotr Luczynski and Langwen Huang and Saleh Ashkboos and Florian Scheidl and Armon Carigiet and Chio Ge and Nabil Abubaker and Maciej Besta and Tal Ben-Nun and Torsten Hoefler},
  title={{Arrow Matrix Decomposition: A Novel Approach for Communication-Efficient Sparse Matrix Multiplication}},
  year={2024},
  month={03},
  pages={404-416},
  booktitle={Proceedings of the 29th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'24)},
  location={Edinburgh, United Kingdom},
  publisher={Association for Computing Machinery},
  doi={10.1145/3627535.3638496},
}