Triangular operators
KrylovPreconditioners.TriangularOperator
— FunctionTriangularOperator(A, uplo::Char, diag::Char; nrhs::Int=1, transa::Char='N')
Create a triangular operator for efficient solution of sparse triangular systems on GPU architectures. Supports sparse matrices stored on NVIDIA, AMD, and Intel GPUs.
Input arguments
A
: A sparse matrix on the GPU representing the triangular system to be solved;uplo
: Specifies whether the triangular matrixA
is upper triangular ('U'
) or lower triangular ('L'
);diag
: Indicates whether the diagonal is unit ('U'
) or non-unit ('N'
);nrhs
: Specifies the number of columns for the right-hand side(s). Defaults to 1, corresponding to solving triangular systems with a single vector as the right-hand side;transa
: Determines how the matrixA
is applied during the triangle solves;'N'
for no transposition,'T'
for transpose, and'C'
for conjugate transpose.
Output argument
op
: An instance ofAbstractTriangularOperator
representing the triangular operator for the specified sparse matrix and parameters.
KrylovPreconditioners.update!
— Methodupdate!(op::AbstractTriangularOperator, A)
Update the sparse matrix A
associated with the given AbstractTriangularOperator
without the need to reallocate buffers or repeat the structural analysis phase for detecting parallelism for sparse triangular solves. A
and the operator op
must have the same sparsity pattern, enabling efficient reuse of existing resources.
Input arguments
op
: The triangular operator to update;A
: The new sparse matrix to associate with the operator.
Nvidia GPUs
Sparse matrices have a specific storage on Nvidia GPUs (CuSparseMatrixCSC
, CuSparseMatrixCSR
or CuSparseMatrixCOO
):
using CUDA, CUDA.CUSPARSE
using SparseArrays
using KrylovPreconditioners
if CUDA.functional()
# CPU Arrays
A_cpu = sprand(100, 100, 0.3)
# GPU Arrays
A_csc_gpu = CuSparseMatrixCSC(A_cpu)
A_csr_gpu = CuSparseMatrixCSR(A_cpu)
A_coo_gpu = CuSparseMatrixCOO(A_cpu)
# Triangular operators
op_csc = TriangularOperator(A_csc_gpu; uplo='L', diag='U', nrhs=1, transa='N')
op_csr = TriangularOperator(A_csr_gpu; uplo='U', diag='N', nrhs=1, transa='T')
op_coo = TriangularOperator(A_coo_gpu; uplo='L', diag='N', nrhs=5, transa='N')
end
AMD GPUs
Sparse matrices have a specific storage on AMD GPUs (ROCSparseMatrixCSC
, ROCSparseMatrixCSR
or ROCSparseMatrixCOO
):
using AMDGPU, AMDGPU.rocSPARSE
using SparseArrays
using KrylovPreconditioners
if AMDGPU.functional()
# CPU Arrays
A_cpu = sprand(200, 100, 0.3)
# GPU Arrays
A_csc_gpu = ROCSparseMatrixCSC(A_cpu)
A_csr_gpu = ROCSparseMatrixCSR(A_cpu)
A_coo_gpu = ROCSparseMatrixCOO(A_cpu)
# Triangular operators
op_csc = TriangularOperator(A_csc_gpu; uplo='L', diag='U', nrhs=1, transa='N')
op_csr = TriangularOperator(A_csr_gpu; uplo='L', diag='U', nrhs=1, transa='T')
op_coo = TriangularOperator(A_coo_gpu; uplo='L', diag='U', nrhs=5, transa='N')
end
Intel GPUs
Sparse matrices have a specific storage on Intel GPUs (oneSparseMatrixCSR
):
using oneAPI, oneAPI.oneMKL
using SparseArrays
using KrylovPreconditioners
if oneAPI.functional()
# CPU Arrays
A_cpu = sprand(T, 20, 10, 0.3)
# GPU Arrays
A_csr_gpu = oneSparseMatrixCSR(A_cpu)
# Triangular operator
op_csr = TriangularOperator(A_csr_gpu; uplo='L', diag='U', nrhs=1, transa='N')
end