Triangular operators

KrylovPreconditioners.TriangularOperatorFunction
TriangularOperator(A, uplo::Char, diag::Char; nrhs::Int=1, transa::Char='N')

Create a triangular operator for efficient solution of sparse triangular systems on GPU architectures. Supports sparse matrices stored on NVIDIA, AMD, and Intel GPUs.

Input arguments

  • A: A sparse matrix on the GPU representing the triangular system to be solved;
  • uplo: Specifies whether the triangular matrix A is upper triangular ('U') or lower triangular ('L');
  • diag: Indicates whether the diagonal is unit ('U') or non-unit ('N');
  • nrhs: Specifies the number of columns for the right-hand side(s). Defaults to 1, corresponding to solving triangular systems with a single vector as the right-hand side;
  • transa: Determines how the matrix A is applied during the triangle solves; 'N' for no transposition, 'T' for transpose, and 'C' for conjugate transpose.

Output argument

  • op: An instance of AbstractTriangularOperator representing the triangular operator for the specified sparse matrix and parameters.
source
KrylovPreconditioners.update!Method
update!(op::AbstractTriangularOperator, A)

Update the sparse matrix A associated with the given AbstractTriangularOperator without the need to reallocate buffers or repeat the structural analysis phase for detecting parallelism for sparse triangular solves. A and the operator op must have the same sparsity pattern, enabling efficient reuse of existing resources.

Input arguments

  • op: The triangular operator to update;
  • A: The new sparse matrix to associate with the operator.
source

Nvidia GPUs

Sparse matrices have a specific storage on Nvidia GPUs (CuSparseMatrixCSC, CuSparseMatrixCSR or CuSparseMatrixCOO):

using CUDA, CUDA.CUSPARSE
using SparseArrays
using KrylovPreconditioners

if CUDA.functional()
  # CPU Arrays
  A_cpu = sprand(100, 100, 0.3)

  # GPU Arrays
  A_csc_gpu = CuSparseMatrixCSC(A_cpu)
  A_csr_gpu = CuSparseMatrixCSR(A_cpu)
  A_coo_gpu = CuSparseMatrixCOO(A_cpu)

  # Triangular operators
  op_csc = TriangularOperator(A_csc_gpu; uplo='L', diag='U', nrhs=1, transa='N')
  op_csr = TriangularOperator(A_csr_gpu; uplo='U', diag='N', nrhs=1, transa='T')
  op_coo = TriangularOperator(A_coo_gpu; uplo='L', diag='N', nrhs=5, transa='N')
end

AMD GPUs

Sparse matrices have a specific storage on AMD GPUs (ROCSparseMatrixCSC, ROCSparseMatrixCSR or ROCSparseMatrixCOO):

using AMDGPU, AMDGPU.rocSPARSE
using SparseArrays
using KrylovPreconditioners

if AMDGPU.functional()
  # CPU Arrays
  A_cpu = sprand(200, 100, 0.3)

  # GPU Arrays
  A_csc_gpu = ROCSparseMatrixCSC(A_cpu)
  A_csr_gpu = ROCSparseMatrixCSR(A_cpu)
  A_coo_gpu = ROCSparseMatrixCOO(A_cpu)

  # Triangular operators
  op_csc = TriangularOperator(A_csc_gpu; uplo='L', diag='U', nrhs=1, transa='N')
  op_csr = TriangularOperator(A_csr_gpu; uplo='L', diag='U', nrhs=1, transa='T')
  op_coo = TriangularOperator(A_coo_gpu; uplo='L', diag='U', nrhs=5, transa='N')
end

Intel GPUs

Sparse matrices have a specific storage on Intel GPUs (oneSparseMatrixCSR):

using oneAPI, oneAPI.oneMKL
using SparseArrays
using KrylovPreconditioners

if oneAPI.functional()
  # CPU Arrays
  A_cpu = sprand(T, 20, 10, 0.3)

  # GPU Arrays
  A_csr_gpu = oneSparseMatrixCSR(A_cpu)

  # Triangular operator
  op_csr = TriangularOperator(A_csr_gpu; uplo='L', diag='U', nrhs=1, transa='N')
end