20190624_Introduction to Sparse Matrix Storage and Computation

Papers
[1] Bell N, Garland M. Efficient sparse matrix-vector multiplication on CUDA. NVIDIA Technical Report NVR-2008-004, NVIDIA Corporation, 2008.
[2] Bell N, Garland M. Implementing sparse matrix-vector multiplication on throughput-oriented processors. Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis. ACM, 2009.
[3] Saad Y. Iterative Methods for Sparse Linear Systems. SIAM, 2003.
Related link: https://www.bu.edu/pasi/materials/, which includes slides on iterative methods for sparse linear systems on GPUs.
Institution: NVIDIA

[TOC]

1. Sparse Matrix

A sparse matrix is one in which most elements are zero.


(Figure: example of a sparse matrix)

2. Sparse Representations (Static Storage Formats)

2.1. Coordinate (COO)

Three equal-length arrays (row indices, column indices, values), one entry per nonzero; simple to represent and easy to manipulate. A small layout sketch follows.
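
As a concrete illustration (the 4x4 matrix below is a made-up example, not taken from any particular library), the three COO arrays could look like this:

```cuda
/* COO sketch for a small 4x4 example matrix:
 *
 *     [1 7 0 0]
 *     [0 2 8 0]
 * A = [5 0 3 9]
 *     [0 6 0 4]
 *
 * One entry per nonzero; all three arrays have the same length (= nnz).
 */
int   coo_row[9] = {0, 0, 1, 1, 2, 2, 2, 3, 3};  /* row index of each nonzero    */
int   coo_col[9] = {0, 1, 1, 2, 0, 2, 3, 1, 3};  /* column index of each nonzero */
float coo_val[9] = {1, 7, 2, 8, 5, 3, 9, 6, 4};  /* value of each nonzero        */
```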

2.2. Compressed Sparse Row (CSR)

  • COO's row indices array contains many repeated values, so compress it:
    -- For each row, record the running total of nonzeros up to the end of that row (the "row offsets" array, with #rows + 1 entries)
    -- E.g., to read the third row (0-based row 2), the row offsets give 4 and 7, so its entries are col_indices[4..7) and values[4..7); see the sketch below
    -- This is the most widely used sparse format
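
A minimal sketch of the same 4x4 example matrix in CSR; the helper print_row is hypothetical and only shows how the half-open range [row_offsets[r], row_offsets[r+1]) addresses one row:

```cuda
#include <stdio.h>

/* CSR sketch for the same 4x4 example matrix as in the COO sketch above. */
int   row_offsets[5] = {0, 2, 4, 7, 9};              /* cumulative nonzeros per row */
int   col_indices[9] = {0, 1, 1, 2, 0, 2, 3, 1, 3};
float values[9]      = {1, 7, 2, 8, 5, 3, 9, 6, 4};

/* Hypothetical helper: print the nonzeros of one row.
 * Row r occupies positions row_offsets[r] .. row_offsets[r+1]-1. */
void print_row(int r)
{
    for (int j = row_offsets[r]; j < row_offsets[r + 1]; ++j)
        printf("A[%d][%d] = %g\n", r, col_indices[j], values[j]);
}

int main(void)
{
    print_row(2);   /* third row: offsets give 4 and 7, i.e. entries 5, 3, 9 */
    return 0;
}
```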

2.3. ELLPACK (ELL)

  • Assumption: the number of nonzeros per row is bounded, with a maximum of N
    -- A #rows x N array stores the values, each row filled in order and padded to length N
    -- A #rows x N array stores the column indices, in one-to-one correspondence with the values array
    -- Suited to matrices in which every row has a small number of nonzeros; a layout sketch follows below
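
A layout sketch for the same 4x4 example with N = 3 (the widest row has 3 nonzeros); padding with value 0 and a repeated column index is one common convention, not the only one:

```cuda
/* ELL sketch for the same 4x4 example, N = 3 (max nonzeros per row).
 * Both arrays are #rows x N; shorter rows are padded (value 0, repeated
 * column index). Row-major layout is used here for readability; GPU
 * implementations typically store these arrays column-major so that
 * consecutive threads read consecutive elements. */
int   ell_col[4][3] = { {0, 1, 1},    /* row 0: cols 0,1   + padding */
                        {1, 2, 2},    /* row 1: cols 1,2   + padding */
                        {0, 2, 3},    /* row 2: cols 0,2,3           */
                        {1, 3, 3} };  /* row 3: cols 1,3   + padding */
float ell_val[4][3] = { {1, 7, 0},
                        {2, 8, 0},
                        {5, 3, 9},
                        {6, 4, 0} };
```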

2.4. Diagonal (DIA)

  • Assumption: the nonzeros are concentrated on a small number of diagonals
    -- Values are stored diagonal by diagonal
    -- Each stored diagonal is identified by its offset from the main diagonal; see the sketch below
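
A host-side reference sketch of a matrix-vector product in DIA format; the array layout (diagonal-major) and the function name are assumptions made for illustration:

```cuda
/* DIA SpMV sketch: data holds num_diags diagonals, each padded to length
 * num_rows; offsets[d] is that diagonal's distance from the main diagonal
 * (negative = below it). Entries whose column falls outside the matrix are
 * simply skipped. */
void dia_spmv(int num_rows, int num_cols, int num_diags,
              const int *offsets, const float *data,
              const float *x, float *y)
{
    for (int row = 0; row < num_rows; ++row) {
        float sum = 0.0f;
        for (int d = 0; d < num_diags; ++d) {
            int col = row + offsets[d];              /* column hit by this diagonal */
            if (col >= 0 && col < num_cols)
                sum += data[d * num_rows + row] * x[col];
        }
        y[row] = sum;
    }
}
```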

2.5. Hybrid (HYB) ELL + COO

  • Assumption: a few rows have many nonzeros while most rows have few
    -- Store the bulk of the entries in ELL and the leftover entries in COO; a split sketch follows below
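
A sketch of how the split might be chosen; the helper name and the average-based heuristic are assumptions for illustration, not necessarily the rule used in [1]:

```cuda
/* HYB split sketch: pick a width K for the ELL part from the per-row
 * nonzero counts; the first min(count, K) entries of each row go to ELL,
 * and any remaining entries of longer rows spill into a COO list. */
int choose_ell_width(const int *row_nnz, int num_rows)
{
    /* Simple illustrative heuristic: the average nonzeros per row, rounded up. */
    long total = 0;
    for (int i = 0; i < num_rows; ++i)
        total += row_nnz[i];
    return (int)((total + num_rows - 1) / num_rows);
}
```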

2.6. Other Formats

2.7. Format Comparison

Which format wins depends on the matrix's sparsity pattern (shape) and on the algorithm that consumes it.

3. Sparse Solvers

3.1. Direct Solvers

  • Solution steps
    -- Solve: \boldsymbol{\text{A}}\boldsymbol{x}=\boldsymbol{b}
    -- LU factorization: \boldsymbol{\text{L}}\boldsymbol{\text{U}}\boldsymbol{x}=\boldsymbol{b}
    -- Forward substitution: \boldsymbol{\text{L}}\boldsymbol{y}=\boldsymbol{b}
    -- Back substitution: \boldsymbol{\text{U}}\boldsymbol{x}=\boldsymbol{y} (a substitution sketch follows after this list)
  • Sparse matrix factorization
    -- Try to reduce fill-in of factors
    -- Use sophisticated reordering schemes (finding an optimal ordering is NP-hard)
    -- LU factorization
    -- Cholesky factorization (for symmetric positive definite matrices)
  • Popular codes
    -- PARDISO, UMFPACK, SuperLU
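
As a concrete illustration of the two substitution steps, here is a dense, host-side sketch (function names and the row-major dense layout are assumptions; L is taken to have a unit diagonal, and a sparse factorization would iterate over stored nonzeros instead):

```cuda
/* Forward substitution: solve L*y = b, with L unit lower triangular. */
void forward_subst(int n, const float *L, const float *b, float *y)
{
    for (int i = 0; i < n; ++i) {
        float s = b[i];
        for (int j = 0; j < i; ++j)
            s -= L[i * n + j] * y[j];
        y[i] = s;                      /* diagonal of L assumed to be 1 */
    }
}

/* Back substitution: solve U*x = y, with U upper triangular. */
void back_subst(int n, const float *U, const float *y, float *x)
{
    for (int i = n - 1; i >= 0; --i) {
        float s = y[i];
        for (int j = i + 1; j < n; ++j)
            s -= U[i * n + j] * x[j];
        x[i] = s / U[i * n + i];
    }
}
```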

3.2. Iterative Solvers

  • Sequence of approximate solutions
    -- Converge to the exact solution (the product Ax is computed many times, i.e., repeated SpMV)
  • Measure error through residual
    -- \text{residual} = \boldsymbol{b} - \boldsymbol{\text{A}}\boldsymbol{x}
    -- \text{residual} = \boldsymbol{\text{A}}(\boldsymbol{x}^{*}-\boldsymbol{x})=\boldsymbol{\text{A}}\cdot\text{error}, where \boldsymbol{x}^{*} is the exact solution
    -- Stop when \lVert \boldsymbol{b} - \boldsymbol{\text{A}}\boldsymbol{x} \rVert < \text{tolerance}
  • Wide variety of methods
    -- Jacobi
    -- Richardson
    -- Gauss-Seidel
    -- Conjugate-Gradient (CG)
    -- Generalized Minimum-Residual (GMRES)

3.2.1. Jacobi Iterative Method

Jacobi iteration: split \boldsymbol{\text{A}} = \boldsymbol{\text{D}} + \boldsymbol{\text{R}} into its diagonal part \boldsymbol{\text{D}} and off-diagonal remainder \boldsymbol{\text{R}}, then repeat \boldsymbol{x}^{(k+1)} = \boldsymbol{\text{D}}^{-1}(\boldsymbol{b} - \boldsymbol{\text{R}}\boldsymbol{x}^{(k)}) until the residual is small enough. A sketch follows below.
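
A minimal host-side sketch of the Jacobi sweep on a dense matrix; the names, the dense layout, and the update-size stopping test (used instead of the residual norm above, just to keep the sketch short) are simplifications for illustration. With a sparse A, the inner sum becomes the SpMV of Section 4:

```cuda
#include <math.h>

/* Jacobi sweep: x_new[i] = (b[i] - sum_{j != i} A[i][j]*x[j]) / A[i][i]. */
void jacobi(int n, const float *A, const float *b, float *x, float *x_new,
            int max_iters, float tol)
{
    for (int it = 0; it < max_iters; ++it) {
        float diff = 0.0f;
        for (int i = 0; i < n; ++i) {
            float s = b[i];
            for (int j = 0; j < n; ++j)
                if (j != i)
                    s -= A[i * n + j] * x[j];
            x_new[i] = s / A[i * n + i];
            diff = fmaxf(diff, fabsf(x_new[i] - x[i]));
        }
        for (int i = 0; i < n; ++i)     /* x <- x_new for the next sweep */
            x[i] = x_new[i];
        if (diff < tol)                 /* stop once the update is small */
            break;
    }
}
```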

3.2.2. Krylov Iterative Methods

Not yet fully understood; see [3] and the slides linked above (CG and GMRES listed earlier belong to this family).

3.3. Solvers Comparison

| Direct Solvers | Iterative Solvers |
| --- | --- |
| Robust | Breakdown issues |
| Black-box operation: just use it | Lots of parameters to tune |
| Difficult to parallelize | Amenable to parallelism |
| Memory consumption (factorization fill-in) | Low memory footprint |
| Limited scalability | Scalable |

4. Sparse Matrix-Vector Multiplication (SpMV)

In practice this is a set of GPU kernels, one per storage format; see [1] and [2] for the details. A sketch of the simplest variant follows below.
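
To give the flavour of these kernels, here is a sketch along the lines of the "scalar" CSR kernel described in [1], with one thread per row; the kernel and variable names are illustrative rather than the paper's exact code:

```cuda
/* Scalar CSR SpMV sketch: one thread per row, y = A*x.
 * The faster variants in [1] (vector CSR, ELL, HYB kernels) change how
 * work is mapped to threads but keep the same per-row dot product. */
__global__ void spmv_csr_scalar(int num_rows,
                                const int *row_offsets,
                                const int *col_indices,
                                const float *values,
                                const float *x, float *y)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < num_rows) {
        float dot = 0.0f;
        for (int j = row_offsets[row]; j < row_offsets[row + 1]; ++j)
            dot += values[j] * x[col_indices[j]];   /* gather from x */
        y[row] = dot;
    }
}

/* Launch sketch (the block size of 256 is an arbitrary choice):
 *   spmv_csr_scalar<<<(num_rows + 255) / 256, 256>>>(num_rows,
 *       d_row_offsets, d_col_indices, d_values, d_x, d_y);
 */
```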
