Sparse Matrices


Dense Matrices

A \(n \times n\) matrix is called dense if it has \(O(n^2)\) non-zero entries. For example:

\[\mathbf{A} = \begin{bmatrix} 1.0 & 2.0 & 3.0 \\ 4.0 & 5.0 & 6.0 \\ 7.0 & 8.0 & 9.0 \end{bmatrix}.\]

To store the matrix, all components are saved in row-major order. For \(\mathbf{A}\) given above, we would store:

\[AA = \begin{bmatrix} 1.0 & 2.0 & 3.0 & 4.0 & 5.0 & 6.0 & 7.0 & 8.0 & 9.0 \end{bmatrix}.\]

The dimensions of the matrix are stored separately.

Sparse Matrices

A \(n \times n\) matrix is called sparse if it has \(O(n)\) non-zero entries. For example:

\[A = \begin{bmatrix} 1.0 & 0 & 0 & 2.0 & 0 \\ 3.0 & 4.0 & 0 & 5.0 & 0 \\ 6.0 & 0 & 7.0 & 8.0 & 9.0 \\ 0 & 0 & 10.0 & 11.0 & 0 \\ 0 & 0 & 0 & 0 & 12.0 \end{bmatrix}.\]

COO (Coordinate Format) stores arrays of row indices, column indices and the corresponding non-zero data values in any order. This format provides fast methods to construct sparse matrices and convert to different sparse formats. For \({\bf A}\) the COO format is:

\[\textrm{data} = \begin{bmatrix} 12.0 & 9.0 & 7.0 & 5.0 & 1.0 & 2.0 & 11.0 & 3.0 & 6.0 & 4.0 & 8.0 & 10.0\end{bmatrix}\] \[\textrm{row} = \begin{bmatrix} 4 & 2 & 2 & 1 & 0 & 0 & 3 & 1 & 2 & 1 & 2 & 3 \end{bmatrix}, \\ \textrm{col} = \begin{bmatrix} 4 & 4 & 2 & 3 & 0 & 3 & 3 & 0 & 0 & 1 & 3 & 2 \end{bmatrix}\]

CSR (Compressed Sparse Row) encodes rows offsets, column indices and the corresponding non-zero data values. This format provides fast arithmetic operations between sparse matrices, and fast matrix vector product. The row offsets are defined by the followign recursive relationship (starting with \(\textrm{rowptr}[0] = 0\)):

\[ \textrm{rowptr}[j] = \textrm{rowptr}[j-1] + \mathrm{nnz}(\textrm{row}_{j-1}), \\ \]

where \(\mathrm{nnz}(\textrm{row}_k)\) is the number of non-zero elements in the \(k^{th}\) row. Note that the length of \(\textrm{rowptr}\) is \(n_{rows} + 1\), where the last element in \(\textrm{rowptr}\) is the number of nonzeros in \(A\). For \({\bf A}\) the CSR format is:

\[\textrm{data} = \begin{bmatrix} 1.0 & 2.0 & 3.0 & 4.0 & 5.0 & 6.0 & 7.0 & 8.0 & 9.0 & 10.0 & 11.0 & 12.0 \end{bmatrix}\] \[\textrm{col} = \begin{bmatrix} 0 & 3 & 0 & 1 & 3 & 0 & 2 & 3 & 4 & 2 & 3 & 4\end{bmatrix}\] \[\textrm{rowptr} = \begin{bmatrix} 0 & 2 & 5 & 9 & 11 & 12 \end{bmatrix}\]

CSR Matrix Vector Product Algorithm

The following code snippet performs CSR matrix vector product for square matrices:

import numpy as np
def csr_mat_vec(A, x):
  Ax = np.zeros_like(x)
  for i in range(x.shape[0]):
    for k in range(A.rowptr[i], A.rowptr[i+1]):
      Ax[i] += A.data[k]*x[A.col[k]]
  return Ax

Review Questions

  1. What does it mean for a matrix to be sparse?
  2. What factors might you consider when deciding how to store a sparse matrix? (Why would you store a matrix in one format over another?)
  3. Given a sparse matrix, put the matrix in CSR format.
  4. Given a sparse matrix, put the matrix in COO format.
  5. For a given matrix, how many bytes total are required to store it in CSR format?
  6. For a given matrix, how many bytes total are required to store it in COO format?

ChangeLog