From 0465730b19bb8bd45761868098d21368681f2fa8 Mon Sep 17 00:00:00 2001 From: "Documenter.jl" Date: Mon, 25 Sep 2023 07:58:22 -0500 Subject: [PATCH] build based on dc102f7 --- dev/.documenter-siteinfo.json | 2 +- dev/artifact/index.html | 2 +- dev/index.html | 2 +- dev/lib/autodiff/index.html | 10 +- dev/lib/formulations/index.html | 30 +++--- dev/lib/linearsolver/index.html | 12 +-- dev/lib/powersystem/index.html | 6 +- dev/man/autodiff/index.html | 2 +- dev/man/benchmark/index.html | 2 +- dev/man/formulations/index.html | 2 +- dev/man/linearsolver/index.html | 2 +- dev/man/powersystem/index.html | 2 +- dev/quickstart/index.html | 22 ++--- dev/tutorials/batch_evaluation/index.html | 112 +++++++++++----------- dev/tutorials/direct_solver/index.html | 22 ++--- 15 files changed, 115 insertions(+), 115 deletions(-) diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index c1c1e4e4..899778c0 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.9.3","generation_timestamp":"2023-09-23T19:49:49","documenter_version":"1.0.1"}} \ No newline at end of file +{"documenter":{"julia_version":"1.9.3","generation_timestamp":"2023-09-25T07:58:17","documenter_version":"1.0.1"}} \ No newline at end of file diff --git a/dev/artifact/index.html b/dev/artifact/index.html index 6c6bd931..3df30f4e 100644 --- a/dev/artifact/index.html +++ b/dev/artifact/index.html @@ -1,2 +1,2 @@ -ExaData Artifact · ExaPF.jl

ExaData Artifact

The ExaData artifact contains test cases relevant to the Exascale Computing Project. It is built from the git repository available at ExaData. Apart from the standard MATPOWER files, it additionally contains demand scenarios and contingencies used in multi-period security-constrained optimal power flow settings.

+ExaData Artifact · ExaPF.jl

ExaData Artifact

The ExaData artifact contains test cases relevant to the Exascale Computing Project. It is built from the git repository available at ExaData. Apart from the standard MATPOWER files, it additionally contains demand scenarios and contingencies used in multi-period security-constrained optimal power flow settings.
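The cases shipped in the artifact can be loaded directly through the loaders documented in the library reference. A minimal sketch (the case name case9 is just an example of a file present in the artifact):

using ExaPF
const PS = ExaPF.PowerSystem

network = PS.load_case("case9")       # PowerNetwork built from the ExaData artifact
polar   = ExaPF.load_polar("case9.m") # polar model built from the same artifact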

diff --git a/dev/index.html b/dev/index.html index bc6a6c17..f3f1cf6a 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,2 +1,2 @@ -Home · ExaPF.jl

ExaPF

ExaPF.jl is a package to solve the power flow problem on upcoming exascale architectures. The code has been designed to be:

  1. Portable: Targeting exascale architectures implies a focus on graphics processing units (GPUs), as these systems draw most of their computational performance from GPUs rather than from classical CPUs.
  2. Differentiable: All the expressions implemented in ExaPF are fully compatible with ForwardDiff.jl, and routines are provided to extract first- and second-order derivatives to solve power flow and optimal power flow problems efficiently.

ExaPF implements a vectorized modeler for power systems, which allows one to manipulate basic expressions. All expressions are fully differentiable: their first- and second-order derivatives can be extracted efficiently using automatic differentiation. In addition, we leverage the packages CUDA.jl and KernelAbstractions.jl to make ExaPF portable across GPU architectures.

Table of contents

Manual

Library

Artifact

Funding

This research was supported by the Exascale Computing Project (17-SC-20-SC), a joint project of the U.S. Department of Energy’s Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation’s exascale computing imperative.

+Home · ExaPF.jl

ExaPF

ExaPF.jl is a package to solve the power flow problem on upcoming exascale architectures. The code has been designed to be:

  1. Portable: Targeting exascale architectures implies a focus on graphics processing units (GPUs), as these systems draw most of their computational performance from GPUs rather than from classical CPUs.
  2. Differentiable: All the expressions implemented in ExaPF are fully compatible with ForwardDiff.jl, and routines are provided to extract first- and second-order derivatives to solve power flow and optimal power flow problems efficiently.

ExaPF implements a vectorized modeler for power systems, which allows one to manipulate basic expressions. All expressions are fully differentiable: their first- and second-order derivatives can be extracted efficiently using automatic differentiation. In addition, we leverage the packages CUDA.jl and KernelAbstractions.jl to make ExaPF portable across GPU architectures.
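As a quick illustration of the modeler, the sketch below builds a polar model, evaluates the power flow balance, and extracts its Jacobian with automatic differentiation. It is a minimal sketch relying only on functions documented in the library reference; the case name case9 is an arbitrary example.

using ExaPF

polar = ExaPF.load_polar("case9")                 # polar model on the CPU
stack = ExaPF.NetworkStack(polar)                 # variables: vmag, vang, pgen
basis = ExaPF.PolarBasis(polar)
powerflow = ExaPF.PowerFlowBalance(polar) ∘ basis # differentiable expression

residual = zeros(length(powerflow))
powerflow(residual, stack)                        # forward evaluation

jx = ExaPF.Jacobian(polar, powerflow, State())    # first-order derivatives w.r.t. the state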

Table of contents

Manual

Library

Artifact

Funding

This research was supported by the Exascale Computing Project (17-SC-20-SC), a joint project of the U.S. Department of Energy’s Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation’s exascale computing imperative.

diff --git a/dev/lib/autodiff/index.html b/dev/lib/autodiff/index.html index 18cbaf36..b820d742 100644 --- a/dev/lib/autodiff/index.html +++ b/dev/lib/autodiff/index.html @@ -1,13 +1,13 @@ -AutoDiff · ExaPF.jl

AutoDiff

Variables

Expressions

ExaPF.AutoDiff.AbstractExpressionType
AbstractExpression

Abstract type for differentiable function $f(x)$. Any AbstractExpression implements two functions: a forward mode to evaluate $f(x)$, and an adjoint to evaluate $∂f(x)$.

Forward mode

The direct evaluation of the function $f$ is implemented as

(expr::AbstractExpression)(output::VT, stack::AbstractStack{VT}) where VT<:AbstractArray
+AutoDiff · ExaPF.jl

AutoDiff

Variables

Expressions

ExaPF.AutoDiff.AbstractExpressionType
AbstractExpression

Abstract type for differentiable function $f(x)$. Any AbstractExpression implements two functions: a forward mode to evaluate $f(x)$, and an adjoint to evaluate $∂f(x)$.

Forward mode

The direct evaluation of the function $f$ is implemented as

(expr::AbstractExpression)(output::VT, stack::AbstractStack{VT}) where VT<:AbstractArray
 

the input being specified in stack, the results being stored in the array output.

Reverse mode

The adjoint of the function is specified by the function adjoint!, with the signature:

adjoint!(expr::AbstractExpression, ∂stack::AbstractStack{VT}, stack::AbstractStack{VT}, ̄v::VT) where VT<:AbstractArray
-

The variable stack stores the result of the direct evaluation, and is not modified in adjoint!. The results are stored inside the adjoint stack ∂stack.

source
ExaPF.AutoDiff.adjoint!Function
adjoint!(expr::AbstractExpression, ∂stack::AbstractStack{VT}, stack::AbstractStack{VT}, ̄v::VT) where VT<:AbstractArray

Compute the adjoint of the AbstractExpression expr with respect to the adjoint vector ̄v. The results are stored in the adjoint stack ∂stack. The variable stack stores the result of a previous direct evaluation and is not modified in adjoint!.

source

First and second-order derivatives

Utils

ExaPF.AutoDiff.seed!Function
seed!(
+

The variable stack stores the result of the direct evaluation, and is not modified in adjoint!. The results are stored inside the adjoint stack ∂stack.

source
ExaPF.AutoDiff.adjoint!Function
adjoint!(expr::AbstractExpression, ∂stack::AbstractStack{VT}, stack::AbstractStack{VT}, ̄v::VT) where VT<:AbstractArray

Compute the adjoint of the AbstractExpression expr with respect to the adjoint vector ̄v. The results are stored in the adjoint stack ∂stack. The variable stack stores the result of a previous direct evaluation and is not modified in adjoint!.

source
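As an illustration, the sketch below evaluates an expression in forward mode and then accumulates its adjoint. It is a hedged sketch: we assume here that the adjoint stack can be allocated as a plain NetworkStack, mirroring the layout of the primal stack, and that adjoint! accepts the composed expression.

using ExaPF
const AD = ExaPF.AutoDiff

polar = ExaPF.load_polar("case9")
stack = ExaPF.NetworkStack(polar)
expr  = ExaPF.CostFunction(polar) ∘ ExaPF.PolarBasis(polar)

output = zeros(length(expr))
expr(output, stack)                     # forward mode: f(x)

∂stack = ExaPF.NetworkStack(polar)      # assumption: adjoint stack shares the NetworkStack layout
vbar = ones(length(expr))
AD.adjoint!(expr, ∂stack, stack, vbar)  # reverse mode: accumulate the adjoint in ∂stack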

First and second-order derivatives

Utils

ExaPF.AutoDiff.seed!Function
seed!(
     H::AbstractHessianProd,
     v::AbstractVector{T},
-) where {T}

Seed the duals with v to compute the Hessian vector product $λ^⊤ H v$.

source
ExaPF.AutoDiff.seed_coloring!Function
seed_coloring!(
     M::Union{AbstractJacobian, AbstractFullHessian}
     coloring::AbstractVector,
-)

Seed the duals with the coloring based seeds to compute the Jacobian or Hessian $M$.

source
ExaPF.AutoDiff.partials!Function
partials!(jac::AbstractJacobian)

Extract partials from the Jacobian jac into jac.J.

source
partials!(hess::AbstractFullHessian)

Extract partials from Hessian hess into hess.H.

source
ExaPF.AutoDiff.partials!Function
partials!(jac::AbstractJacobian)

Extract partials from the Jacobian jac into jac.J.

source
partials!(hess::AbstractFullHessian)

Extract partials from Hessian hess into hess.H.

source
ExaPF.AutoDiff.set_value!Function
set_value!(
     jac,
     primals::AbstractVector{T}
-) where {T}

Set values of ForwardDiff.Dual numbers in jac to primals.

source
+) where {T}

Set values of ForwardDiff.Dual numbers in jac to primals.

source
diff --git a/dev/lib/formulations/index.html b/dev/lib/formulations/index.html index 7b8c6bd7..01cfd89d 100644 --- a/dev/lib/formulations/index.html +++ b/dev/lib/formulations/index.html @@ -1,5 +1,5 @@ -Polar formulation · ExaPF.jl

Polar formulation

Generic templates

ExaPF.StateType
State <: AbstractVariable

All variables $x$ that depend on the control variables $u$ (see Control) through the nonlinear equation $g(x, u) = 0$.

source
ExaPF.ControlType
Control <: AbstractVariable

Independent variables $u$ used in the reduced-space formulation.

source

Structure and variables

ExaPF.PolarFormType
PolarForm{T, IT, VT, MT} <: AbstractPolarFormulation

Implement the polar formulation associated with the network's equations.

Wrap a PS.PowerNetwork network to load the data on the target device (CPU() and CUDABackend() are currently supported).

Example

julia> const PS = ExaPF.PowerSystem;
+Polar formulation · ExaPF.jl

Polar formulation

Generic templates

ExaPF.StateType
State <: AbstractVariable

All variables $x$ that depend on the control variables $u$ (see Control) through the nonlinear equation $g(x, u) = 0$.

source
ExaPF.ControlType
Control <: AbstractVariable

Independent variables $u$ used in the reduced-space formulation.

source

Structure and variables

ExaPF.PolarFormType
PolarForm{T, IT, VT, MT} <: AbstractPolarFormulation

Implement the polar formulation associated with the network's equations.

Wrap a PS.PowerNetwork network to load the data on the target device (CPU() and CUDABackend() are currently supported).

Example

julia> const PS = ExaPF.PowerSystem;
 
 julia> network_data = PS.load_case("case9.m");
 
@@ -12,7 +12,7 @@
 giving a mathematical formulation with:
     #controls:   5
     #states  :   14
-
source
ExaPF.BlockPolarFormType
BlockPolarForm{T, IT, VT, MT} <: AbstractFormulation

Block polar formulation: duplicates k different polar models to evaluate them in parallel.

source
ExaPF.load_polarFunction
load_polar(case, device=CPU(); dir=PS.EXADATA)

Load a PolarForm instance from the specified benchmark library dir on the target device (default is CPU). ExaPF uses two different benchmark libraries: MATPOWER (dir=EXADATA) and PGLIB-OPF (dir=PGLIB).

Examples

julia> polar = ExaPF.load_polar("case9")
+
source
ExaPF.BlockPolarFormType
BlockPolarForm{T, IT, VT, MT} <: AbstractFormulation

Block polar formulation: duplicates k different polar models to evaluate them in parallel.

source
ExaPF.load_polarFunction
load_polar(case, device=CPU(); dir=PS.EXADATA)

Load a PolarForm instance from the specified benchmark library dir on the target device (default is CPU). ExaPF uses two different benchmark libraries: MATPOWER (dir=EXADATA) and PGLIB-OPF (dir=PGLIB).

Examples

julia> polar = ExaPF.load_polar("case9")
 Polar formulation (instantiated on device CPU(false))
 Network characteristics:
     #buses:      9  (#slack: 1  #PV: 2  #PQ: 6)
@@ -21,7 +21,7 @@
 giving a mathematical formulation with:
     #controls:   5
     #states  :   14
-
source
ExaPF.NetworkStackType
NetworkStack{VT,VD,MT} <: AbstractNetworkStack{VT}
 NetworkStack(polar::PolarForm)
 NetworkStack(nbus::Int, ngen::Int, nlines::Int, VT::Type)

Store the variables associated to the polar formulation. The variables are stored in the field input, ordered as follows

    input = [vmag ; vang ; pgen]

The object also stores intermediate variables needed in the expression tree, such as the LKMR basis ψ.

Notes

The NetworkStack can be instantiated on the host or on the target device.

Examples

julia> polar = ExaPF.load_polar("case9");
 
@@ -39,7 +39,7 @@
  1.0
  1.0
  1.0
-
source

The state and the control are defined as mapping:

ExaPF.mappingFunction
mapping(polar::PolarForm, ::Control)

Return the mapping associated to the Control() in NetworkStack according to the polar formulation PolarForm.

Examples

julia> polar = ExaPF.load_polar("case9");
+
source

The state and the control are defined as mapping:

ExaPF.mappingFunction
mapping(polar::PolarForm, ::Control)

Return the mapping associated to the Control() in NetworkStack according to the polar formulation PolarForm.

Examples

julia> polar = ExaPF.load_polar("case9");
 
 julia> mapu = ExaPF.mapping(polar, Control())
 5-element Vector{Int64}:
@@ -48,7 +48,7 @@
   3
  20
  21
-
source
mapping(polar::PolarForm, ::State)

Return the mapping associated to the State() in NetworkStack according to the polar formulation PolarForm.

Examples

julia> polar = ExaPF.load_polar("case9");
+
source
mapping(polar::PolarForm, ::State)

Return the mapping associated to the State() in NetworkStack according to the polar formulation PolarForm.

Examples

julia> polar = ExaPF.load_polar("case9");
 
 julia> mapu = ExaPF.mapping(polar, State())
 14-element Vector{Int64}:
@@ -66,7 +66,7 @@
   7
   8
   9
-
source

Powerflow solver

Powerflow solver

ExaPF.run_pfFunction
run_pf(
     polar::PolarForm, stack::NetworkStack;
     rtol=1e-8, max_iter=20, verbose=0,
 )

Solve the power flow equations $g(x, u) = 0$ w.r.t. the stack $x$, using the NewtonRaphson algorithm. The initial state $x$ is specified implicitly inside stack, with the mapping associated to the polar formulation (see mapping). The object stack is modified in place by the function.

The algorithm stops when the tolerance rtol or the maximum number of iterations max_iter is reached.

Arguments

  • polar::AbstractFormulation: formulation of the power flow equation
  • stack::NetworkStack: initial values in the network

Examples

julia> polar = ExaPF.load_polar("case9");
@@ -79,7 +79,7 @@
 true
 
 julia> conv.n_iterations
-4
source
ExaPF.nlsolve!Function
nlsolve!(
     algo::NewtonRaphson,
     jac::Jacobian,
     stack::NetworkStack;
@@ -100,7 +100,7 @@
 true
 
 julia> conv.n_iterations
-4
source
ExaPF.NewtonRaphsonType
NewtonRaphson <: AbstractNonLinearSolver

Newton-Raphson algorithm.

Attributes

  • maxiter::Int (default 20): maximum number of iterations
  • tol::Float64 (default 1e-8): tolerance of the algorithm
  • verbose::Int (default 0): verbosity level
source

Constraints

The different parts of the polar formulation are implemented in the following AbstractExpression:

ExaPF.NewtonRaphsonType
NewtonRaphson <: AbstractNonLinearSolver

Newton-Raphson algorithm.

Attributes

  • maxiter::Int (default 20): maximum number of iterations
  • tol::Float64 (default 1e-8): tolerance of the algorithm
  • verbose::Int (default 0): verbosity level
source

Constraints

The different parts of the polar formulation are implemented in the following AbstractExpression:

ExaPF.PolarBasisType
PolarBasis{VI, MT} <: AbstractExpression
 PolarBasis(polar::AbstractPolarFormulation)

Implement the LKMR nonlinear basis. Takes as input the voltage magnitudes vmag and the voltage angles vang and returns

\[ \begin{aligned} & \psi_\ell^C(v, \theta) = v^f v^t \cos(\theta_f - \theta_t) \quad \forall \ell = 1, \cdots, n_\ell \\ & \psi_\ell^S(v, \theta) = v^f v^t \sin(\theta_f - \theta_t) \quad \forall \ell = 1, \cdots, n_\ell \\ & \psi_k(v, \theta) = v_k^2 \quad \forall k = 1, \cdots, n_b \end{aligned} \]
@@ -134,7 +134,7 @@ 1.0 1.0 1.0 -

source
ExaPF.PowerFlowBalanceType
PowerFlowBalance{VT, MT}
 PowerFlowBalance(polar)

Implement a subset of the power injection corresponding to $(p_{inj}^{pv}, p_{inj}^{pq}, q_{inj}^{pq})$. The function encodes the active balance equations at PV and PQ nodes, and the reactive balance equations at PQ nodes:

\[\begin{aligned} p_i &= v_i \sum_{j}^{n} v_j (g_{ij}\cos{(\theta_i - \theta_j)} + b_{ij}\sin{(\theta_i - \theta_j)}) \,, & ∀ i ∈ \{PV, PQ\} \\ q_i &= v_i \sum_{j}^{n} v_j (g_{ij}\sin{(\theta_i - \theta_j)} - b_{ij}\cos{(\theta_i - \theta_j)}) \,, & ∀ i ∈ \{PQ\} \end{aligned}\]
@@ -167,7 +167,7 @@ julia> isapprox(powerflow(stack), zeros(14); atol=1e-8) true -

source
ExaPF.VoltageMagnitudeBoundsType
VoltageMagnitudeBounds

Implement the bounds on voltage magnitudes not taken into account in the bound constraints. In the reduced space, this is associated with the voltage magnitudes at PQ nodes:

\[v_{pq}^♭ ≤ v_{pq} ≤ v_{pq}^♯ .\]

Dimension: n_pq

Complexity

1 copyto

Note

In the reduced space, the constraints on the voltage magnitudes at PV nodes $v_{pv}$ are taken into account when bounding the control $u$.

Examples

julia> polar = ExaPF.load_polar("case9");
+
source
ExaPF.VoltageMagnitudeBoundsType
VoltageMagnitudeBounds

Implement the bounds on voltage magnitudes not taken into account in the bound constraints. In the reduced space, this is associated with the voltage magnitudes at PQ nodes:

\[v_{pq}^♭ ≤ v_{pq} ≤ v_{pq}^♯ .\]

Dimension: n_pq

Complexity

1 copyto

Note

In the reduced space, the constraints on the voltage magnitudes at PV nodes $v_{pv}$ are taken into account when bounding the control $u$.

Examples

julia> polar = ExaPF.load_polar("case9");
 
 julia> stack = ExaPF.NetworkStack(polar);
 
@@ -181,7 +181,7 @@
  1.0
  1.0
  1.0
-
source
ExaPF.PowerGenerationBoundsType
PowerGenerationBounds{VT, MT}
 PowerGenerationBounds(polar)

Constraints on the active power productions and on the reactive power productions that are not already taken into account in the bound constraints. In the reduced space, that amounts to

\[p_{g,ref}^♭ ≤ p_{g,ref} ≤ p_{g,ref}^♯ ; C_g q_g^♭ ≤ C_g q_g ≤ C_g q_g^♯ .\]

Require composition with PolarBasis.

Dimension: n_pv + 2 n_ref

Complexity

1 copyto, 1 SpMV

Examples

julia> polar = ExaPF.load_polar("case9");
 
@@ -197,7 +197,7 @@
   0.24069
   0.144601
  -0.03649
-
source
ExaPF.LineFlowsType
LineFlows{VT, MT}
 LineFlows(polar)

Implement thermal limit constraints on the lines of the network.

Require composition with PolarBasis.

Dimension: 2 * n_lines

Complexity

4 SpMV, 4 * n_lines quadratic, 2 * n_lines add

Examples

julia> polar = ExaPF.load_polar("case9");
 
 julia> stack = ExaPF.NetworkStack(polar);
@@ -226,7 +226,7 @@
  2.67781
  0.726668
  0.215497
-
source

Objective

The production cost is given by the AbstractExpression CostFunction:

Objective

The production cost is given by the AbstractExpression CostFunction:

ExaPF.CostFunctionType
CostFunction{VT, MT} <: AutoDiff.AbstractExpression
 CostFunction(polar)

Implement the quadratic cost function for OPF

\[ ∑_{g=1}^{n_g} c_{2,g} p_g^2 + c_{1,g} p_g + c_{0,g}\]

Require composition with PolarBasis to evaluate the cost of the reference generator.

Dimension: 1

Complexity

1 SpMV, 1 sum

Examples

julia> polar = ExaPF.load_polar("case9");
 
 julia> stack = ExaPF.NetworkStack(polar);
@@ -236,5 +236,5 @@
 julia> cost(stack)
 1-element Vector{Float64}:
  4509.0275
-
source

Composition of expressions

The different expressions can be combined together in several different ways.

ExaPF.MultiExpressionsType
MultiExpressions <: AbstractExpression

Implement expression concatenation. Takes as input a vector of expressions [expr1,...,exprN] and concatenates them into a single expression mexpr, such that

    mexpr(x) = [expr1(x) ; expr2(x) ; ... ; exprN(x)]
-
source
ExaPF.ComposedExpressionsType
ComposedExpressions{Expr1<:PolarBasis, Expr2} <: AbstractExpression

Implement expression composition. Takes as input two expressions expr1 and expr2 and returns a composed expression cexpr such that cexpr(x) = expr2 ∘ expr1(x).

Notes

Currently, only PolarBasis is supported for expr1.

source
+source

Composition of expressions

The different expressions can be combined together in several different ways.

ExaPF.MultiExpressionsType
MultiExpressions <: AbstractExpression

Implement expression concatenation. Takes as input a vector of expressions [expr1,...,exprN] and concatenates them into a single expression mexpr, such that

    mexpr(x) = [expr1(x) ; expr2(x) ; ... ; exprN(x)]
+
source
ExaPF.ComposedExpressionsType
ComposedExpressions{Expr1<:PolarBasis, Expr2} <: AbstractExpression

Implement expression composition. Takes as input two expressions expr1 and expr2 and returns a composed expression cexpr such that cexpr(x) = expr2 ∘ expr1(x).

Notes

Currently, only PolarBasis is supported for expr1.

source
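To make the composition concrete, here is a hedged sketch combining both mechanisms; it assumes that a MultiExpressions object supports the same ∘ composition with PolarBasis as the individual expressions do.

using ExaPF

polar = ExaPF.load_polar("case9")
stack = ExaPF.NetworkStack(polar)
basis = ExaPF.PolarBasis(polar)

constraints = ExaPF.MultiExpressions([
    ExaPF.PowerFlowBalance(polar),
    ExaPF.VoltageMagnitudeBounds(polar),
]) ∘ basis                                  # concatenate, then compose with the basis

output = zeros(length(constraints))
constraints(output, stack)                  # output = [balance(x) ; vmag_pq(x)]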
diff --git a/dev/lib/linearsolver/index.html b/dev/lib/linearsolver/index.html index be2c42b7..4b14d231 100644 --- a/dev/lib/linearsolver/index.html +++ b/dev/lib/linearsolver/index.html @@ -1,11 +1,11 @@ Linear Solvers · ExaPF.jl

Linear solvers

Description

ExaPF allows one to solve linear systems with either direct or iterative linear algebra, both on the CPU and on the GPU. To solve a linear system $Ax = b$, ExaPF uses the function ldiv!.

LinearAlgebra.ldiv!Function
ldiv!(solver, x, A, y)
-ldiv!(solver, x, y)
  • solver::AbstractLinearSolver: linear solver to solve the system
  • x::AbstractVector: Solution
  • A::AbstractMatrix: Input matrix
  • y::AbstractVector: RHS

Solve the linear system $A x = y$ using the algorithm specified in solver. If A is not specified, the function directly uses the factorization stored inside solver.

source

Direct solvers

ExaPF wraps UMFPACK (shipped with SuiteSparse.jl) on the CPU, and CUSPARSE on CUDA device.

ExaPF.LinearSolvers.DirectSolverType
DirectSolver <: AbstractLinearSolver

Solve linear system $A x = y$ with direct linear algebra.

  • On the CPU, DirectSolver uses UMFPACK to solve the linear system
  • On CUDA GPU, DirectSolver redirects the resolution to the method CUSOLVER.csrlsvqr
source

Iterative solvers

ExaPF.LinearSolvers.KrylovBICGSTABType
KrylovBICGSTAB <: AbstractIterativeLinearSolver
-KrylovBICGSTAB(precond; verbose=0, rtol=1e-10, atol=1e-10)

Wrap Krylov.jl's BICGSTAB algorithm to solve iteratively the linear system $A x = y$.

source
ExaPF.LinearSolvers.DQGMRESType
DQGMRES <: AbstractIterativeLinearSolver
-DQGMRES(precond; verbose=false, memory=4)

Wrap Krylov.jl's DQGMRES algorithm to solve iteratively the linear system $A x = y$.

source
ExaPF.LinearSolvers.BICGSTABType
BICGSTAB <: AbstractIterativeLinearSolver
-BICGSTAB(precond; maxiter=2_000, tol=1e-8, verbose=false)

Custom BICGSTAB implementation to solve iteratively the linear system $A x = y$.

source
ExaPF.LinearSolvers.EigenBICGSTABType
EigenBICGSTAB <: AbstractIterativeLinearSolver
-EigenBICGSTAB(precond; maxiter=2_000, tol=1e-8, verbose=false)

Julia's port of Eigen's BICGSTAB to solve iteratively the linear system $A x = y$.

source

ExaPF.jl ships with a custom BICGSTAB implementation. However, we highly recommend using KrylovBICGSTAB instead, which has proven to be more robust.

ExaPF.LinearSolvers.bicgstabFunction
bicgstab(A, b, P, xi;
+ldiv!(solver, x, y)
  • solver::AbstractLinearSolver: linear solver to solve the system
  • x::AbstractVector: Solution
  • A::AbstractMatrix: Input matrix
  • y::AbstractVector: RHS

Solve the linear system $A x = y$ using the algorithm specified in solver. If A is not specified, the function directly uses the factorization stored inside solver.

source
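A short usage sketch of ldiv! with the direct solver on the CPU follows; the DirectSolver constructor taking a sparse matrix is assumed here, and the Jacobian is built as in the tutorials.

using ExaPF
import LinearAlgebra: ldiv!
const LS = ExaPF.LinearSolvers

polar = ExaPF.load_polar("case9")
stack = ExaPF.NetworkStack(polar)
powerflow = ExaPF.PowerFlowBalance(polar) ∘ ExaPF.PolarBasis(polar)
jx = ExaPF.Jacobian(polar, powerflow, State())   # AD Jacobian
J = jx.J                                         # sparse matrix

b = zeros(size(J, 1)); powerflow(b, stack)       # right-hand side: current residual
x = zeros(size(J, 2))

solver = LS.DirectSolver(J)                      # assumed constructor: factorizes J
ldiv!(solver, x, J, b)                           # solve J x = b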

Direct solvers

ExaPF wraps UMFPACK (shipped with SuiteSparse.jl) on the CPU, and CUSPARSE on CUDA device.

ExaPF.LinearSolvers.DirectSolverType
DirectSolver <: AbstractLinearSolver

Solve linear system $A x = y$ with direct linear algebra.

  • On the CPU, DirectSolver uses UMFPACK to solve the linear system
  • On CUDA GPU, DirectSolver redirects the resolution to the method CUSOLVER.csrlsvqr
source

Iterative solvers

ExaPF.LinearSolvers.KrylovBICGSTABType
KrylovBICGSTAB <: AbstractIterativeLinearSolver
+KrylovBICGSTAB(precond; verbose=0, rtol=1e-10, atol=1e-10)

Wrap Krylov.jl's BICGSTAB algorithm to solve iteratively the linear system $A x = y$.

source
ExaPF.LinearSolvers.DQGMRESType
DQGMRES <: AbstractIterativeLinearSolver
+DQGMRES(precond; verbose=false, memory=4)

Wrap Krylov.jl's DQGMRES algorithm to solve iteratively the linear system $A x = y$.

source
ExaPF.LinearSolvers.BICGSTABType
BICGSTAB <: AbstractIterativeLinearSolver
+BICGSTAB(precond; maxiter=2_000, tol=1e-8, verbose=false)

Custom BICGSTAB implementation to solve iteratively the linear system $A x = y$.

source
ExaPF.LinearSolvers.EigenBICGSTABType
EigenBICGSTAB <: AbstractIterativeLinearSolver
+EigenBICGSTAB(precond; maxiter=2_000, tol=1e-8, verbose=false)

Julia's port of Eigen's BICGSTAB to solve iteratively the linear system $A x = y$.

source

ExaPF.jl ships with a custom BICGSTAB implementation. However, we highly recommend using KrylovBICGSTAB instead, which has proven to be more robust.

ExaPF.LinearSolvers.bicgstabFunction
bicgstab(A, b, P, xi;
          tol=1e-8,
          maxiter=size(A, 1),
          verbose=false,
-         maxtol=1e20)

BiCGSTAB implementation according to

Van der Vorst, Henk A. "Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems." SIAM Journal on Scientific and Statistical Computing 13, no. 2 (1992): 631-644.

source

Available linear solvers can be queried with

Preconditioning

To solve linear systems with iterative methods, ExaPF provides an implementation of a block-Jacobi preconditioner, portable on GPU.

Block-Jacobi preconditioner

ExaPF.LinearSolvers.BlockJacobiPreconditionerType
BlockJacobiPreconditioner

Overlapping-Schwarz preconditioner.

Attributes

  • nblocks::Int64: Number of partitions or blocks.
  • blocksize::Int64: Size of each block.
  • partitions::Vector{Vector{Int64}}: npart partitions stored as lists.
  • cupartitions: partitions transferred to the GPU.
  • lpartitions::Vector{Int64}: Length of each partition.
  • culpartitions::Vector{Int64}: Length of each partition, on the GPU.
  • blocks: Dense blocks of the block-Jacobi preconditioner.
  • cublocks: blocks transferred to the GPU.
  • map: The partitions as a mapping to construct views.
  • cumap: map transferred to the GPU.
  • part: Partitioning as output by Metis
  • cupart: part transferred to the GPU
source
ExaPF.LinearSolvers.updateFunction
function update(J::CuSparseMatrixCSR, p)

Update the preconditioner p from the sparse Jacobian J in CSR format for the GPU

  1. The dense blocks cuJs are filled from the sparse Jacobian J
  2. Do a batch inversion of the dense blocks using CUBLAS
  3. Extract the preconditioner matrix p.P from the dense blocks cuJs
source
function update(J::SparseMatrixCSC, p)

Update the preconditioner p from the sparse Jacobian J in CSC format for the CPU

Note that this implements the same algorithm as for the GPU and becomes very slow on the CPU as the number of blocks grows.

source
+ maxtol=1e20)

BiCGSTAB implementation according to

Van der Vorst, Henk A. "Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems." SIAM Journal on Scientific and Statistical Computing 13, no. 2 (1992): 631-644.

source

Available linear solvers can be queried with

ExaPF.LinearSolvers.list_solversFunction
list_solvers(::KernelAbstractions.Device)

List linear solvers available on current device.

source
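For example, on the host one may query (a minimal sketch; the exact list depends on the packages loaded):

using ExaPF
using KernelAbstractions: CPU

ExaPF.LinearSolvers.list_solvers(CPU())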

Preconditioning

To solve linear systems with iterative methods, ExaPF provides an implementation of a block-Jacobi preconditioner, portable on GPU.

ExaPF.LinearSolvers.AbstractPreconditionerType
AbstractPreconditioner

Preconditioners for the iterative solvers mostly focused on GPUs

source

Block-Jacobi preconditioner

ExaPF.LinearSolvers.BlockJacobiPreconditionerType
BlockJacobiPreconditioner

Overlapping-Schwarz preconditioner.

Attributes

  • nblocks::Int64: Number of partitions or blocks.
  • blocksize::Int64: Size of each block.
  • partitions::Vector{Vector{Int64}}: npart partitions stored as lists.
  • cupartitions: partitions transferred to the GPU.
  • lpartitions::Vector{Int64}: Length of each partition.
  • culpartitions::Vector{Int64}: Length of each partition, on the GPU.
  • blocks: Dense blocks of the block-Jacobi preconditioner.
  • cublocks: blocks transferred to the GPU.
  • map: The partitions as a mapping to construct views.
  • cumap: map transferred to the GPU.
  • part: Partitioning as output by Metis
  • cupart: part transferred to the GPU
source
ExaPF.LinearSolvers.updateFunction
function update(J::CuSparseMatrixCSR, p)

Update the preconditioner p from the sparse Jacobian J in CSR format for the GPU

  1. The dense blocks cuJs are filled from the sparse Jacobian J
  2. Do a batch inversion of the dense blocks using CUBLAS
  3. Extract the preconditioner matrix p.P from the dense blocks cuJs
source
function update(J::SparseMatrixCSC, p)

Update the preconditioner p from the sparse Jacobian J in CSC format for the CPU

Note that this implements the same algorithm as for the GPU and becomes very slow on the CPU as the number of blocks grows.

source
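Putting the pieces together, the sketch below pairs the block-Jacobi preconditioner with KrylovBICGSTAB inside the Newton-Raphson solver on the CPU. It is a hedged sketch: the number of blocks (2) is an arbitrary choice for this small case, and on the GPU one would pass CUDABackend() and the GPU Jacobian instead.

using ExaPF
using KernelAbstractions: CPU
const LS = ExaPF.LinearSolvers

polar = ExaPF.load_polar("case9")
stack = ExaPF.NetworkStack(polar)
powerflow = ExaPF.PowerFlowBalance(polar) ∘ ExaPF.PolarBasis(polar)
jx = ExaPF.Jacobian(polar, powerflow, State())

precond = LS.BlockJacobiPreconditioner(jx.J, 2, CPU())   # block-Jacobi built from the Jacobian sparsity
linear_solver = ExaPF.KrylovBICGSTAB(jx.J; P=precond)    # attach the preconditioner

ExaPF.nlsolve!(NewtonRaphson(tol=1e-7), jx, stack; linear_solver=linear_solver)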
diff --git a/dev/lib/powersystem/index.html b/dev/lib/powersystem/index.html index 81720c57..3dfe2690 100644 --- a/dev/lib/powersystem/index.html +++ b/dev/lib/powersystem/index.html @@ -1,5 +1,5 @@ -PowerSystem · ExaPF.jl

PowerSystem

Description

ExaPF.PowerSystem.AbstractPowerSystemType
AbstractPowerSystem

First layer of the package. Stores the topology of a given transmission network, including:

  • the power injection at each bus ;
  • the admittance matrix ;
  • the default voltage at each bus.

Data are imported either from a MATPOWER file or a PSS/E file.

source
ExaPF.PowerSystem.PowerNetworkType
PowerNetwork <: AbstractPowerSystem

This structure contains constant parameters that define the topology and physics of the power network.

The object PowerNetwork uses its own contiguous indexing for the buses. The indexing is independent from the one specified in the MATPOWER or PSS/E input file. However, a correspondence between the two indexings (input indexing to PowerNetwork indexing) is stored inside the attribute bus_to_indexes.

Note

The object PowerNetwork is created in the host memory. Use a AbstractFormulation object to move data to the target device.

source
ExaPF.PowerSystem.load_caseFunction
load_case(case_name, lib::PowerNetworkLibrary=EXADATA)

Convenience function to load a PowerNetwork instance from one of the benchmark libraries (dir=EXADATA for MATPOWER, dir=PGLIB for PGLIB-OPF). The default library is lib=EXADATA.

Examples

julia> net = PS.load_case("case118") # default is MATPOWER
+PowerSystem · ExaPF.jl

PowerSystem

Description

ExaPF.PowerSystem.AbstractPowerSystemType
AbstractPowerSystem

First layer of the package. Stores the topology of a given transmission network, including:

  • the power injection at each bus ;
  • the admittance matrix ;
  • the default voltage at each bus.

Data are imported either from a MATPOWER file or a PSS/E file.

source
ExaPF.PowerSystem.PowerNetworkType
PowerNetwork <: AbstractPowerSystem

This structure contains constant parameters that define the topology and physics of the power network.

The object PowerNetwork uses its own contiguous indexing for the buses. The indexing is independent from the one specified in the MATPOWER or PSS/E input file. However, a correspondence between the two indexings (input indexing to PowerNetwork indexing) is stored inside the attribute bus_to_indexes.

Note

The object PowerNetwork is created in the host memory. Use a AbstractFormulation object to move data to the target device.

source
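A minimal sketch of building a PowerNetwork from a MATPOWER file on disk (the path below is a placeholder):

using ExaPF
const PS = ExaPF.PowerSystem

datafile = "case9.m"                      # placeholder path to a MATPOWER file
pf = PS.PowerNetwork(datafile)

nbus = PS.get(pf, PS.NumberOfBuses())     # query the network through PS.get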
ExaPF.PowerSystem.load_caseFunction
load_case(case_name, lib::PowerNetworkLibrary=EXADATA)

Convenience function to load a PowerNetwork instance from one of the benchmark libraries (dir=EXADATA for MATPOWER, dir=PGLIB for PGLIB-OPF). The default library is lib=EXADATA.

Examples

julia> net = PS.load_case("case118") # default is MATPOWER
 PowerNetwork object with:
     Buses: 118 (Slack: 1. PV: 53. PQ: 64)
     Generators: 54.
@@ -7,6 +7,6 @@
 julia> net = PS.load_case("case1354_pegase", PS.PGLIB)
 PowerNetwork object with:
     Buses: 1354 (Slack: 1. PV: 259. PQ: 1094)
-    Generators: 260.
source

API Reference

Network elements

ExaPF.PowerSystem.AbstractNetworkElementType
AbstractNetworkElement

Abstraction for all physical elements that are part of an AbstractPowerSystem. Elements are divided into

  • transmission lines (Lines)
  • buses (Buses)
  • generators (Generators)
source

List of elements:

Network attributes

List of attributes:

Query the indexing of the different elements in a given network:

Network values

List of values:

Function to get the range of a given value:

ExaPF.PowerSystem.boundsFunction
bounds(pf::AbstractPowerSystem, attr::AbstractNetworkAttribute, val::AbstractNetworkValues)

Return lower and upper bounds corresponding to the admissible values of the AbstractNetworkAttribute attr.

Examples

p_min, p_max = bounds(pf, Generator(), ActivePower())
+    Generators: 260.
source

API Reference

Network elements

ExaPF.PowerSystem.AbstractNetworkElementType
AbstractNetworkElement

Abstraction for all physical elements that are part of an AbstractPowerSystem. Elements are divided into

  • transmission lines (Lines)
  • buses (Buses)
  • generators (Generators)
source

List of elements:

Network attributes

List of attributes:

Query the indexing of the different elements in a given network:

Network values

List of values:

Function to get the range of a given value:

ExaPF.PowerSystem.boundsFunction
bounds(pf::AbstractPowerSystem, attr::AbstractNetworkAttribute, val::AbstractNetworkValues)

Return lower and upper bounds corresponding to the admissible values of the AbstractNetworkAttribute attr.

Examples

p_min, p_max = bounds(pf, Generator(), ActivePower())
 v_min, v_max = bounds(pf, Buses(), VoltageMagnitude())
-
source
+
source
diff --git a/dev/man/autodiff/index.html b/dev/man/autodiff/index.html index a9d2f198..6428ac30 100644 --- a/dev/man/autodiff/index.html +++ b/dev/man/autodiff/index.html @@ -35,4 +35,4 @@ F[npq + i] += coef_cos * sin_val - coef_sin * cos_val end end -end

These two abstractions are a powerful tool that allows us to implement the forward mode in vectorized form, where the number of directions or tangent components of a tangent variable equals the number of Jacobian colors. We illustrate this in the figure below with a point-wise vector product x .* y.

Figure: SIMD AD for point-wise vector product

This natural way of computing the compressed Jacobian yields very high-performing code that is portable to any vector architecture, provided a package similar to CUDA.jl exists for it. We note that similar packages for the Intel Compute Engine and AMD ROCm, called oneAPI.jl and AMDGPU.jl respectively, are under development. We expect our package to be portable to AMD and Intel GPUs in the future.

+end

These two abstractions are a powerful tool that allows us to implement the forward mode in vectorized form, where the number of directions or tangent components of a tangent variable equals the number of Jacobian colors. We illustrate this in the figure below with a point-wise vector product x .* y.

Figure: SIMD AD for point-wise vector product
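To make the figure concrete, here is a small, self-contained sketch (independent of ExaPF's internal types) that propagates two tangent directions at once with ForwardDiff dual numbers, which is exactly the mechanism depicted above:

using ForwardDiff

# Two Jacobian colors = two partials carried by every dual number.
x = [ForwardDiff.Dual(1.0, 1.0, 0.0), ForwardDiff.Dual(2.0, 1.0, 0.0)]  # seeded in direction 1
y = [ForwardDiff.Dual(3.0, 0.0, 1.0), ForwardDiff.Dual(4.0, 0.0, 1.0)]  # seeded in direction 2

z = x .* y                    # one vectorized sweep propagates both directions

ForwardDiff.value.(z)         # primal result: [3.0, 8.0]
ForwardDiff.partials.(z)      # compressed derivatives: entry i holds (y[i], x[i])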

This natural way of computing the compressed Jacobian yields very high-performing code that is portable to any vector architecture, provided a package similar to CUDA.jl exists for it. We note that similar packages for the Intel Compute Engine and AMD ROCm, called oneAPI.jl and AMDGPU.jl respectively, are under development. We expect our package to be portable to AMD and Intel GPUs in the future.

diff --git a/dev/man/benchmark/index.html b/dev/man/benchmark/index.html index 4cf8579c..4f1734bd 100644 --- a/dev/man/benchmark/index.html +++ b/dev/man/benchmark/index.html @@ -2,4 +2,4 @@ Benchmark · ExaPF.jl

Benchmark

For the purpose of performance regression testing, ExaPF provides a lightweight benchmark script. It allows testing the various configurations of the linear solvers used in the Newton-Raphson algorithm and running them on specific hardware. The main Julia script benchmark/benchmarks.jl takes all its options from the command line. The benchmark script takes as input a linear solver (e.g. KrylovBICGSTAB), a target architecture as a KernelAbstractions object (CPU or CUDABackend), and a case filename which is included in the ExaData artifact. An exhaustive list of all available linear solvers can be obtained via ExaPF.LinearSolvers.list_solvers.

Running

julia --project benchmark/benchmarks.jl KrylovBICGSTAB CUDABackend case300.m

yields

KrylovBICGSTAB, CUDABackend, case300.m,  69.0,  3.57,  43.7, true

The first three fields are the settings of the benchmark run. They are followed by three timings in milliseconds:

  1. the time taken by the Newton-Raphson algorithm to solve the power flow,
  2. the timings for the Jacobian accumulation using AutoDiff,
  3. and the time for the linear solver, including the preconditioner.

To acquire these timings the code is run three times to avoid any precompilation effects. The last field confirms the Newton-Raphson convergence. In case more verbose output is desired, one has to manually set the verbosity in benchmark/benchmarks.jl by changing

powerflow_solver = NewtonRaphson(tol=ntol)

to one of the following options:

powerflow_solver = NewtonRaphson(tol=ntol, verbose=VERBOSE_LEVEL_NONE)
 powerflow_solver = NewtonRaphson(tol=ntol, verbose=VERBOSE_LEVEL_LOW)
 powerflow_solver = NewtonRaphson(tol=ntol, verbose=VERBOSE_LEVEL_MEDIUM)
-powerflow_solver = NewtonRaphson(tol=ntol, verbose=VERBOSE_LEVEL_HIGH)

A shell script benchmark/benchmarks.sh is provided to gather timings with various canonical configurations and store them in the files cpu_REV.log and gpu_REV.log, where REV is the SHA-1 hash of the currently checked-out ExaPF version.

+powerflow_solver = NewtonRaphson(tol=ntol, verbose=VERBOSE_LEVEL_HIGH)

A shell script benchmark/benchmarks.sh is provided to gather timings with various canonical configurations and store them in the files cpu_REV.log and gpu_REV.log, where REV is the SHA-1 hash of the currently checked-out ExaPF version.

diff --git a/dev/man/formulations/index.html b/dev/man/formulations/index.html index a79b79af..24311ddb 100644 --- a/dev/man/formulations/index.html +++ b/dev/man/formulations/index.html @@ -173,4 +173,4 @@ 0.0 0.02340899999999987 0.007743999999999858 - + diff --git a/dev/man/linearsolver/index.html b/dev/man/linearsolver/index.html index 9def6693..1501a6d1 100644 --- a/dev/man/linearsolver/index.html +++ b/dev/man/linearsolver/index.html @@ -1,3 +1,3 @@ Linear Solvers · ExaPF.jl

Linear Solver

Overview

As mentioned before, a linear solver is required to compute the Newton step in

dx .= jacobian(x)\f(x)

Our package supports the following linear solvers:

  • cuSOLVER with csrlsvqr (GPU),
  • Krylov.jl with dqgmres and bicgstab (CPU/GPU),
  • UMFPACK through the default Julia \ operator (CPU),
  • generic BiCGSTAB implementation [Vorst1992] (CPU/GPU),
  • or any linear solver wrapped in LinearAlgebra.
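As a point of reference, the default path requires no configuration: without a linear_solver argument, nlsolve! falls back to the direct solver, so each Newton step dx = jacobian(x)\f(x) is solved with UMFPACK on the CPU. A minimal sketch, reusing objects introduced on the other manual pages:

using ExaPF

polar = ExaPF.load_polar("case9")
stack = ExaPF.NetworkStack(polar)
powerflow = ExaPF.PowerFlowBalance(polar) ∘ ExaPF.PolarBasis(polar)
jx = ExaPF.Jacobian(polar, powerflow, State())

ExaPF.nlsolve!(NewtonRaphson(), jx, stack)   # direct solver by default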

Preconditioning

Using only an iterative solver leads to divergence and bad performance due to ill-conditioning of the Jacobian. This is a known phenomenon in power systems. That's why this package comes with a block Jacobi preconditioner that is tailored towards GPUs and is proven to work well with power flow problems.

The Jacobian is partitioned into a dense block diagonal structure using Metis.jl, where each block is inverted to build our preconditioner P.

Figure: Dense block-Jacobi preconditioner

Compared to incomplete Cholesky and incomplete LU, this preconditioner is easily portable to the GPU if the number of blocks is high enough. ExaPF.jl uses the batched BLAS calls from cuBLAS to invert the individual blocks.

CUDA.@sync pivot, info = CUDA.CUBLAS.getrf_batched!(blocks, true)
-CUDA.@sync pivot, info, p.cuJs = CUDA.CUBLAS.getri_batched(blocks, pivot)

Assuming that other vendors will provide such batched BLAS APIs, this code is portable to other GPU architectures.

  • [Vorst1992] Van der Vorst, H. A. 1992. “Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems.” SIAM Journal on Scientific and Statistical Computing 13 (2): 631–44.
+CUDA.@sync pivot, info, p.cuJs = CUDA.CUBLAS.getri_batched(blocks, pivot)

Assuming that other vendors will provide such batched BLAS APIs, this code is portable to other GPU architectures.

diff --git a/dev/man/powersystem/index.html b/dev/man/powersystem/index.html index 17eaa734..9dd6bdab 100644 --- a/dev/man/powersystem/index.html +++ b/dev/man/powersystem/index.html @@ -16,4 +16,4 @@ 2 julia> PS.get(ps, PS.NumberOfSlackBuses()) -1 +1 diff --git a/dev/quickstart/index.html b/dev/quickstart/index.html index d0071635..ca68cf36 100644 --- a/dev/quickstart/index.html +++ b/dev/quickstart/index.html @@ -18,9 +18,9 @@ #it 5: 1.52178e-11 Power flow has converged: true * #iterations: 5 - * Time Jacobian (s) ........: 0.4158 - * Time linear solver (s) ...: 0.0410 - * Time total (s) ...........: 0.8467

Implicitly, ExaPF has just carried out the following operations:

This compact syntax allows one to quickly solve the powerflow equations in a few lines of code. However, in most cases, the user may want finer-grained control over the different objects being manipulated.

Detailed version

In what follows, we detail step by step the procedure to solve the powerflow equations.

How to load a MATPOWER instance as a PowerNetwork object?

We start by importing a MATPOWER instance into an ExaPF.PowerSystem.PowerNetwork object:

julia> pf = PS.PowerNetwork(datafile)PowerNetwork object with:
+  * Time Jacobian (s) ........: 0.4177
+  * Time linear solver (s) ...: 0.0173
+  * Time total (s) ...........: 0.8200

Implicitly, ExaPF has just carried out the following operations:

This compact syntax allows one to quickly solve the powerflow equations in a few lines of code. However, in most cases, the user may want finer-grained control over the different objects being manipulated.

Detailed version

In what follows, we detail step by step the procedure to solve the powerflow equations.

How to load a MATPOWER instance as a PowerNetwork object?

We start by importing a MATPOWER instance into an ExaPF.PowerSystem.PowerNetwork object:

julia> pf = PS.PowerNetwork(datafile)PowerNetwork object with:
     Buses: 1354 (Slack: 1. PV: 259. PQ: 1094)
     Generators: 260.

The different fields of the object pf specify the characteristics of the network. For instance, we can retrieve the number of buses or get the indexing of the PV buses with

julia> nbus = PS.get(pf, PS.NumberOfBuses())1354
julia> pv_indexes = pf.pv;

However, an ExaPF.PowerSystem.PowerNetwork object stores only the physical attributes of the network. To choose a given mathematical formulation, we need to pass the object pf to an ExaPF.AbstractFormulation layer. Currently, only the polar formulation is provided, with the ExaPF.PolarForm structure. In the future, other formulations (e.g. RectangularForm) may be implemented as well.
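For instance, instantiating the polar formulation on the CPU is a one-liner (a minimal sketch; the GPU variant with CUDABackend() is shown further below):

julia> polar = ExaPF.PolarForm(pf, CPU());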

How to solve the powerflow equations?

To solve the powerflow equations, we need to choose a given mathematical formulation for the equations of the network. To each formulation corresponds a given state $x$ and control $u$. Using the polar representation of the voltage vector, $\bm{v} = |v|e^{j \theta}$, each bus $i=1, \cdots, N_B$ must satisfy the power balance equations:

\[\begin{aligned} p_i &= v_i \sum_{j}^{n} v_j (g_{ij}\cos{(\theta_i - \theta_j)} + b_{ij}\sin{(\theta_i - \theta_j)}) \,, \\ q_i &= v_i \sum_{j}^{n} v_j (g_{ij}\sin{(\theta_i - \theta_j)} - b_{ij}\cos{(\theta_i - \theta_j)}) \,. \end{aligned}\]
@@ -61,9 +61,9 @@ #it 5: 1.52178e-11 Power flow has converged: true * #iterations: 5 - * Time Jacobian (s) ........: 0.0048 - * Time linear solver (s) ...: 0.0398 - * Time total (s) ...........: 0.0449

Here, the algorithm solves the powerflow equations in 5 iterations. The algorithm modifies the values of stack in place, to avoid any unnecessary memory allocations.

How to deport the computation on the GPU?

Now, how can we deport the resolution on the GPU? The procedure looks exactly the same. It suffices to instantiate a new ExaPF.PolarForm object, but on the GPU:

julia> polar_gpu = ExaPF.PolarForm(pf, CUDABackend())Polar formulation (instantiated on device CUDA.CUDAKernels.CUDABackend(false, false))
+  * Time Jacobian (s) ........: 0.0045
+  * Time linear solver (s) ...: 0.0315
+  * Time total (s) ...........: 0.0363

Here, the algorithm solves the powerflow equations in 5 iterations. The algorithm modifies the values of stack in place, to avoid any unnecessary memory allocations.

How to deport the computation on the GPU?

Now, how can we deport the resolution on the GPU? The procedure looks exactly the same. It suffices to instantiate a new ExaPF.PolarForm object, but on the GPU:

julia> polar_gpu = ExaPF.PolarForm(pf, CUDABackend())Polar formulation (instantiated on device CUDA.CUDAKernels.CUDABackend(false, false))
 Network characteristics:
     #buses:      1354  (#slack: 1  #PV: 259  #PQ: 1094)
     #generators: 260
@@ -79,9 +79,9 @@
 #it 5: 9.96622e-12
 Power flow has converged: true
   * #iterations: 5
-  * Time Jacobian (s) ........: 2.7291
-  * Time linear solver (s) ...: 0.3145
-  * Time total (s) ...........: 3.9130

Note that we get exactly the same iterations as when we solve the power flow equations on the CPU.

How to solve the linear system with BICGSTAB?

By default, the algorithm runs with a direct solver, which might be inefficient for large problems. To overcome this issue, ExaPF implements a wrapper for different iterative algorithms (GMRES, BICGSTAB).

The performance of iterative solvers is usually improved if we use a preconditioner. ExaPF.jl implements an overlapping Schwarz preconditioner, tailored for GPU usage. To build an instance with 8 blocks, just write

julia> npartitions = 8;
julia> jac_gpu = jx_gpu.J;
julia> precond = LS.BlockJacobiPreconditioner(jac_gpu, npartitions, CUDABackend());

You can attach the preconditioner to a BICGSTAB algorithm simply as

julia> linear_solver = ExaPF.KrylovBICGSTAB(jac_gpu; P=precond);

(this will use the BICGSTAB algorithm implemented in Krylov.jl).

We need to update the tolerance of the Newton-Raphson algorithm accordingly (the iterative solver is less accurate than the direct solver):

julia> pf_algo = NewtonRaphson(; verbose=1, tol=1e-7)NewtonRaphson(20, 1.0e-7, 1)

We reset the variables to their initial values:

julia> ExaPF.init!(polar_gpu, stack_gpu)

Then, solving the power flow with the iterative solvers directly translates to one call to nlsolve!:

julia> convergence = ExaPF.nlsolve!(pf_algo, jx_gpu, stack_gpu; linear_solver=linear_solver)#it 0: 1.15103e+02
+  * Time Jacobian (s) ........: 2.6903
+  * Time linear solver (s) ...: 0.3161
+  * Time total (s) ...........: 3.8605

Note that we get exactly the same iterations as when we solve the power flow equations on the CPU.

How to solve the linear system with BICGSTAB?

By default, the algorithm runs with a direct solver, which might be inefficient for large problems. To overcome this issue, ExaPF implements a wrapper for different iterative algorithms (GMRES, BICGSTAB).

The performance of iterative solvers is usually improved if we use a preconditioner. ExaPF.jl implements an overlapping Schwarz preconditioner, tailored for GPU usage. To build an instance with 8 blocks, just write

julia> npartitions = 8;
julia> jac_gpu = jx_gpu.J;
julia> precond = LS.BlockJacobiPreconditioner(jac_gpu, npartitions, CUDABackend());

You can attach the preconditioner to a BICGSTAB algorithm simply as

julia> linear_solver = ExaPF.KrylovBICGSTAB(jac_gpu; P=precond);

(this will use the BICGSTAB algorithm implemented in Krylov.jl).

We need to update the tolerance of the Newton-Raphson algorithm accordingly (the iterative solver is less accurate than the direct solver):

julia> pf_algo = NewtonRaphson(; verbose=1, tol=1e-7)NewtonRaphson(20, 1.0e-7, 1)

We reset the variables to their initial values:

julia> ExaPF.init!(polar_gpu, stack_gpu)

Then, solving the power flow with the iterative solvers directly translates to one call to nlsolve!:

julia> convergence = ExaPF.nlsolve!(pf_algo, jx_gpu, stack_gpu; linear_solver=linear_solver)#it 0: 1.15103e+02
 #it 1: 1.50328e+01
 #it 2: 5.88242e-01
 #it 3: 4.88493e-03
@@ -90,5 +90,5 @@
 Power flow has converged: true
   * #iterations: 5
   * Time Jacobian (s) ........: 0.0031
-  * Time linear solver (s) ...: 6.3513
-  * Time total (s) ...........: 6.3557
+ * Time linear solver (s) ...: 6.4427 + * Time total (s) ...........: 6.4472 diff --git a/dev/tutorials/batch_evaluation/index.html b/dev/tutorials/batch_evaluation/index.html index 7e92a575..f9cd33ab 100644 --- a/dev/tutorials/batch_evaluation/index.html +++ b/dev/tutorials/batch_evaluation/index.html @@ -21,26 +21,26 @@ #states : 14

Then, ExaPF can also instantiate a NetworkStack object, with the memory required to store the variables of the different scenarios:

blk_stack = ExaPF.NetworkStack(blk_polar)
210-elements NetworkStack{Vector{Float64}}

We can pass the scenarios manually using the function set_params!:

ploads = rand(nbus, nscen);
 qloads = rand(nbus, nscen);
 ExaPF.set_params!(blk_stack, ploads, qloads)
90-element Vector{Float64}:
- 0.8559617137259788
- 0.3523470433277963
- 0.6542128581116299
- 0.3935661696262659
- 0.9582549503601299
- 0.887634035398972
- 0.03380000409131956
- 0.5064801070009395
- 0.3930399440732132
- 0.9987776865280941
+ 0.20990923136693918
+ 0.10583995693791581
+ 0.8197256681673252
+ 0.39345359869856256
+ 0.7704591957163551
+ 0.1342760955270077
+ 0.5288658483743877
+ 0.5591417095755342
+ 0.7877044076153956
+ 0.7967985903854334
  ⋮
- 0.8592911762557289
- 0.7207032819592781
- 0.41789405160400883
- 0.7682046111725708
- 0.36016705254828374
- 0.9546987186525899
- 0.9421158379310158
- 0.7397453193432354
- 0.33591601602255794

The structure blk_stack stores $N$ different realizations for the variables stored in the field input (vmag, vang and pgen). By default, the initial values are set according to the values specified in blk_polar (usually defined when importing the data from the instance file):

reshape(blk_stack.vmag, nbus, nscen)
9×10 Matrix{Float64}:
+ 0.3549516985254003
+ 0.6077524315924532
+ 0.8020293655118549
+ 0.9567309317507097
+ 0.08454514905428334
+ 0.06256432108558496
+ 0.8276076647559975
+ 0.46542890652009605
+ 0.041861977376091764

The structure blk_stack stores $N$ different realizations for the variables stored in the field input (vmag, vang and pgen). By default, the initial values are set according to the values specified in blk_polar (usually defined when importing the data from the instance file):

reshape(blk_stack.vmag, nbus, nscen)
9×10 Matrix{Float64}:
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
@@ -50,32 +50,32 @@
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0

Only the parameters vary according to the scenarios we passed as input to the constructor:

reshape(blk_stack.pload, nbus, nscen)
9×10 Matrix{Float64}:
- 0.155886  0.126517   0.81435    0.89088    …  0.731609  0.0511128  0.260856
- 0.324172  0.237615   0.0301523  0.634825      0.96086   0.264304   0.544441
- 0.101134  0.81371    0.71187    0.618062      0.503013  0.438363   0.605646
- 0.921298  0.686628   0.0307813  0.275111      0.113424  0.386383   0.244243
- 0.896607  0.391815   0.521849   0.0917468     0.105306  0.220718   0.994517
- 0.817869  0.215029   0.946439   0.877059   …  0.107703  0.113814   0.886688
- 0.774943  0.368762   0.520698   0.123852      0.472091  0.179356   0.338556
- 0.121014  0.0335929  0.709168   0.437472      0.314495  0.0391468  0.405176
- 0.816785  0.157054   0.738976   0.804088      0.634106  0.977827   0.251487

Evaluate expressions in block

ExaPF takes advantage of the block structure when using a BlockPolarForm.

As an example, suppose we want to evaluate the power flow balances in block form with a PowerFlowBalance expression:

powerflow = ExaPF.PowerFlowBalance(blk_polar) ∘ ExaPF.PolarBasis(blk_polar);
ExaPF.ComposedExpressions{ExaPF.PolarBasis{Vector{Int64}, SparseArrays.SparseMatrixCSC{Float64, Int64}}, ExaPF.PowerFlowBalance{Vector{Float64}, SparseArrays.SparseMatrixCSC{Float64, Int64}}}(PolarBasis (AbstractExpression), PowerFlowBalance (AbstractExpression))

A block evaluation takes as input the NetworkStack blk_stack structure:

m = div(length(powerflow), nscen);
+ 0.939174  0.106741  0.787668   0.636977    …  0.4948     0.12445    0.528394
+ 0.316864  0.51871   0.615111   0.156119       0.893431   0.801993   0.586456
+ 0.31354   0.778947  0.65235    0.402863       0.171962   0.0105805  0.161368
+ 0.933676  0.276877  0.947966   0.156013       0.214261   0.923071   0.743839
+ 0.94346   0.211063  0.335039   0.872171       0.731585   0.585668   0.679079
+ 0.950353  0.431065  0.0688582  0.618369    …  0.0854865  0.917537   0.308666
+ 0.752458  0.934262  0.352584   0.43277        0.38271    0.155886   0.920511
+ 0.745407  0.522532  0.327495   0.405472       0.395415   0.332575   0.03708
+ 0.799449  0.966297  0.0152368  0.00462364     0.886784   0.382194   0.87722

Evaluate expressions in block

ExaPF takes advantage of the block structure when using a BlockPolarForm.

As an example, suppose we want to evaluate the power flow balances in block form with a PowerFlowBalance expression:

powerflow = ExaPF.PowerFlowBalance(blk_polar) ∘ ExaPF.PolarBasis(blk_polar);
ExaPF.ComposedExpressions{ExaPF.PolarBasis{Vector{Int64}, SparseArrays.SparseMatrixCSC{Float64, Int64}}, ExaPF.PowerFlowBalance{Vector{Float64}, SparseArrays.SparseMatrixCSC{Float64, Int64}}}(PolarBasis (AbstractExpression), PowerFlowBalance (AbstractExpression))

A block evaluation takes as input the NetworkStack blk_stack structure:

m = div(length(powerflow), nscen);
 blk_output = zeros(m * nscen);
 powerflow(blk_output, blk_stack); # inplace evaluation
 reshape(blk_output, m, nscen)
14×10 Matrix{Float64}:
- -1.30583   -1.39239    -1.59985    …  -0.66914    -1.3657     -1.08556
- -0.748866  -0.0362896  -0.13813       -0.346987   -0.411637   -0.244354
-  0.921298   0.686628    0.0307813      0.113424    0.386383    0.244243
-  0.896607   0.391815    0.521849       0.105306    0.220718    0.994517
-  0.817869   0.215029    0.946439       0.107703    0.113814    0.886688
-  0.774943   0.368762    0.520698   …   0.472091    0.179356    0.338556
-  0.121014   0.0335929   0.709168       0.314495    0.0391468   0.405176
-  0.816785   0.157054    0.738976       0.634106    0.977827    0.251487
-  0.226566  -0.0438189   0.661271       0.396779    0.150221    0.601205
-  0.700255   0.023036   -0.241488       0.555347    0.291852    0.102167
-  0.604134   0.211697    0.671528   …  -0.219698    0.386255    0.671199
- -0.1452     0.513068    0.150179      -0.0637118   0.0719483   0.763116
-  0.27898   -0.187753    0.457293       0.470065    0.273789    0.512245
-  0.15204    0.399831   -0.212529       0.501052    0.0226803   0.094916

We get $N$ different results for the power flow balance equations, depending on which scenario we are on.

Solve power flow in block on the CPU

Once the different structures used for block evaluation are instantiated, one can solve the power flow in block form on the CPU using the same function nlsolve!. The block Jacobian is evaluated with automatic differentiation using an ArrowheadJacobian structure:

blk_jx = ExaPF.ArrowheadJacobian(blk_polar, powerflow, State());
+ -1.31314   -1.11129    -1.01489    …  -0.736569   -0.828007  -1.04354
+ -0.53646   -0.0710531  -0.19765       -0.678038   -0.83942   -0.688632
+  0.933676   0.276877    0.947966       0.214261    0.923071   0.743839
+  0.94346    0.211063    0.335039       0.731585    0.585668   0.679079
+  0.950353   0.431065    0.0688582      0.0854865   0.917537   0.308666
+  0.752458   0.934262    0.352584   …   0.38271     0.155886   0.920511
+  0.745407   0.522532    0.327495       0.395415    0.332575   0.03708
+  0.799449   0.966297    0.0152368      0.886784    0.382194   0.87722
+  0.226454   0.491846    0.429811       0.674049    0.350689   0.789731
+  0.512459   0.0566922  -0.121866       0.28733    -0.233151  -0.173455
+ -0.149224  -0.0816726   0.560234   …   0.0297031   0.117805  -0.220936
+  0.349866   0.702196    0.696073       0.817893    0.136542   0.648608
+  0.331642  -0.0815144   0.0624883      0.140147    0.38801    0.237929
+  0.546704   0.609623   -0.0845229      0.218835    0.501485  -0.199138

We get $N$ different results for the power flow balance equations, depending on which scenario we are on.

Solve power flow in block on the CPU

Once the different structures used for block evaluation are instantiated, one can solve the power flow in block form on the CPU using the same function nlsolve!. The block Jacobian is evaluated with automatic differentiation using an ArrowheadJacobian structure:

blk_jx = ExaPF.ArrowheadJacobian(blk_polar, powerflow, State());
 blk_jx.J
140×140 SparseArrays.SparseMatrixCSC{Float64, Int64} with 820 stored entries:
 ⎡⡱⣮⡲⣞⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
 ⎢⣸⢮⣻⣾⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
@@ -122,20 +122,20 @@
     blk_jx,
     blk_stack;
 )
Power flow has converged: true
-  * #iterations: 4
+  * #iterations: 5
   * Time Jacobian (s) ........: 0.0002
-  * Time linear solver (s) ...: 0.0004
-  * Time total (s) ...........: 0.0008
+  * Time linear solver (s) ...: 0.0005
+  * Time total (s) ...........: 0.0009
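
For reference, the block solve summarized above amounts to passing the ArrowheadJacobian and the block stack to nlsolve!. A minimal sketch, assuming a Newton-Raphson solver instance named pf_block_solver (a hypothetical name; the tutorial may use a different name and non-default options):

pf_block_solver = ExaPF.NewtonRaphson()   # assumed: Newton-Raphson with default options
ExaPF.nlsolve!(
    pf_block_solver,
    blk_jx,
    blk_stack;
)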
 

At the solution, we get different values for the voltage magnitudes at the PQ nodes:

reshape(blk_stack.vmag, nbus, nscen)
9×10 Matrix{Float64}:
  1.0       1.0       1.0       1.0       …  1.0       1.0       1.0
  1.0       1.0       1.0       1.0          1.0       1.0       1.0
  1.0       1.0       1.0       1.0          1.0       1.0       1.0
- 0.938158  0.974327  0.966341  0.955018     0.939675  0.962022  0.936623
- 0.874072  0.964998  0.964821  0.906585     0.914236  0.937921  0.903355
- 0.942826  0.970611  0.948198  0.946349  …  0.982096  0.967596  0.918776
- 0.955353  0.948228  0.94191   0.952082     0.968001  0.965162  0.880006
- 0.963892  0.978364  0.958151  0.975502     0.958567  0.971792  0.924435
- 0.927653  0.944859  0.96838   0.934585     0.906917  0.946672  0.922845

+ 0.919031  0.93642   0.970286  0.955937     0.933059  0.962416  0.960954
+ 0.87273   0.933342  0.966321  0.951431     0.908999  0.973919  0.969266
+ 0.953467  0.959399  0.953967  0.988636  …  0.957233  0.978984  0.983207
+ 0.921191  0.913655  0.926085  0.966916     0.90954   0.961553  0.93641
+ 0.93667   0.949071  0.965978  0.983309     0.946525  0.962172  0.964383
+ 0.878165  0.888979  0.971024  0.959956     0.91289   0.926465  0.963843
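
As a quick illustration, the spread of the solution across scenarios can be summarized directly from the block stack, e.g. the lowest and highest voltage magnitude reached in each scenario. A small sketch reusing nbus and nscen from above:

vmag_blk = reshape(blk_stack.vmag, nbus, nscen)
vmag_min = vec(minimum(vmag_blk; dims=1))   # lowest magnitude in each scenario
vmag_max = vec(maximum(vmag_blk; dims=1))   # highest magnitude in each scenario
[vmag_min vmag_max]                         # nscen × 2 summary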

Solve power flow in batch on the GPU

When the BlockPolarForm model is instantiated on the GPU, the expressions are evaluated in batch. The syntax to solve the power flow equations is exactly the same as on the CPU; here, cusolverRF is used to solve the different linear systems:

using CUDA, CUSOLVERRF
 polar_gpu = ExaPF.load_polar("case9.m", CUDABackend());
 blk_polar_gpu = ExaPF.BlockPolarForm(polar_gpu, nscen); # load model on GPU
 blk_stack_gpu = ExaPF.NetworkStack(blk_polar_gpu);
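
The setup elided by the diff hunk below mirrors the CPU case: build the block expression and the ArrowheadJacobian on the GPU, then wrap a cusolverRF factorization in a direct solver. A rough sketch only; the names powerflow_gpu, blk_jx_gpu and pf_block_solver are assumptions, and the RFLU options may differ from what the tutorial actually uses:

powerflow_gpu = ExaPF.PowerFlowBalance(blk_polar_gpu) ∘ ExaPF.PolarBasis(blk_polar_gpu);
blk_jx_gpu = ExaPF.ArrowheadJacobian(blk_polar_gpu, powerflow_gpu, State());
rf_fac = CUSOLVERRF.RFLU(blk_jx_gpu.J)                 # LU refactorization living on the GPU
rf_solver = ExaPF.LinearSolvers.DirectSolver(rf_fac)   # direct solver backed by cusolverRF
ExaPF.nlsolve!(
    pf_block_solver,                                   # hypothetical NewtonRaphson instance
    blk_jx_gpu,
    blk_stack_gpu;
    linear_solver=rf_solver,
)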
@@ -152,8 +152,8 @@
     blk_stack_gpu;
     linear_solver=rf_solver,
 )
Power flow has converged: true
-  * #iterations: 4
-  * Time Jacobian (s) ........: 0.0013
-  * Time linear solver (s) ...: 0.0579
-  * Time total (s) ...........: 0.0603
-
+  * #iterations: 5
+  * Time Jacobian (s) ........: 0.0015
+  * Time linear solver (s) ...: 0.0606
+  * Time total (s) ...........: 0.0634
+
diff --git a/dev/tutorials/direct_solver/index.html b/dev/tutorials/direct_solver/index.html
index fc908ebc..a33a8667 100644
--- a/dev/tutorials/direct_solver/index.html
+++ b/dev/tutorials/direct_solver/index.html
@@ -15,16 +15,16 @@
 jx = ExaPF.Jacobian(polar, func, State()) # init AD
 ExaPF.nlsolve!(pf_solver, jx, stack)
Power flow has converged: true
   * #iterations: 7
-  * Time Jacobian (s) ........: 1.1632
-  * Time linear solver (s) ...: 0.3274
-  * Time total (s) ...........: 1.4957
+  * Time Jacobian (s) ........: 1.1969
+  * Time linear solver (s) ...: 0.3460
+  * Time total (s) ...........: 1.5480
 

KLU (CPU)

KLU is an efficient sparse linear solver, initially designed for circuit simulation problems. It is often considered one of the state-of-the-art linear solvers for power flow problems. Conveniently, KLU is wrapped in Julia by the package KLU.jl, which provides a proper interface to KLU. We just have to provide a method of LinearAlgebra.lu! that forwards the refactorization to KLU:

LinearAlgebra.lu!(K::KLU.KLUFactorization, J) = KLU.klu!(K, J)

Then, we are ready to solve a power flow with KLU using our current abstraction. We just have to create a new instance of LS.DirectSolver:

klu_factorization = KLU.klu(jx.J)
-klu_solver = LS.DirectSolver(klu_factorization)
ExaPF.LinearSolvers.DirectSolver{KLU.KLUFactorization{Float64, Int64}}(KLU.KLUFactorization{Float64, Int64}(KLU.LibKLU.klu_l_common_struct(0.001, 1.2, 1.2, 10.0, 0.0, 1, 0, 2, Ptr{Nothing} @0x0000000000000000, Ptr{Nothing} @0x0000000000000000, 1, 0, 0, 17036, 17036, 17036, 0, -1.0, -1.0, -1.0, -1.0, 0.0, 0x00000000005a0030, 0x0000000000760150), Ptr{Nothing} @0x000000004bb4b390, Ptr{Nothing} @0x000000004ca83420, 17036, [0, 4, 12, 19, 24, 27, 37, 40, 44, 54  …  129350, 129359, 129363, 129369, 129380, 129384, 129390, 129402, 129408, 129412], [0, 383, 5548, 13344, 1, 458, 4027, 5306, 6117, 11823  …  6009, 6175, 9238, 13805, 13971, 17034, 8879, 9239, 16675, 17035], [1050.4551624894714, -629.5373899630948, -421.251258755022, 88.40062041224645, 995.2128749049234, -479.4136240982223, -138.29482949968113, -145.959631463846, -230.919755492703, 22.972728768573532  …  -480.4784394426574, -39.711370520688675, 520.164296001656, -3360.900457400469, -208.11090119999253, 3568.9822586698524, -39.40403545216682, 39.40522422897138, -242.80215386396614, 242.80196093598838]))

+klu_solver = LS.DirectSolver(klu_factorization)
ExaPF.LinearSolvers.DirectSolver{KLU.KLUFactorization{Float64, Int64}}(KLU.KLUFactorization{Float64, Int64}(KLU.LibKLU.klu_l_common_struct(0.001, 1.2, 1.2, 10.0, 0.0, 1, 0, 2, Ptr{Nothing} @0x0000000000000000, Ptr{Nothing} @0x0000000000000000, 1, 0, 0, 17036, 17036, 17036, 0, -1.0, -1.0, -1.0, -1.0, 0.0, 0x00000000005a0030, 0x0000000000760150), Ptr{Nothing} @0x000000004aecdac0, Ptr{Nothing} @0x000000004f529980, 17036, [0, 4, 12, 19, 24, 27, 37, 40, 44, 54  …  129350, 129359, 129363, 129369, 129380, 129384, 129390, 129402, 129408, 129412], [0, 383, 5548, 13344, 1, 458, 4027, 5306, 6117, 11823  …  6009, 6175, 9238, 13805, 13971, 17034, 8879, 9239, 16675, 17035], [1050.4551624894714, -629.5373899630948, -421.251258755022, 88.40062041224645, 995.2128749049234, -479.4136240982223, -138.29482949968113, -145.959631463846, -230.919755492703, 22.972728768573532  …  -480.4784394426574, -39.711370520688675, 520.164296001656, -3360.900457400469, -208.11090119999253, 3568.9822586698524, -39.40403545216682, 39.40522422897138, -242.80215386396614, 242.80196093598838]))

and pass it to nlsolve!:

ExaPF.init!(polar, stack) # reinit stack
 ExaPF.nlsolve!(pf_solver, jx, stack; linear_solver=klu_solver)
Power flow has converged: true
   * #iterations: 7
-  * Time Jacobian (s) ........: 0.2816
-  * Time linear solver (s) ...: 0.0336
-  * Time total (s) ...........: 0.3202
+  * Time Jacobian (s) ........: 0.2822
+  * Time linear solver (s) ...: 0.0337
+  * Time total (s) ...........: 0.3209
 

We observe that KLU considerably reduces the time spent in the linear solver.

cusolverRF (CUDA)

cusolverRF is an efficient LU refactorization routine implemented in CUDA. It is wrapped in Julia inside the package CUSOLVERRF.jl:

using CUSOLVERRF

The principle is the following: the initial symbolic factorization is computed on the CPU with the routine chosen by the user. Then, each time we have to refactorize a matrix with the same sparsity pattern, the numerical factorization is recomputed entirely on the GPU. In practice, this solver is efficient at refactorizing a given matrix as long as the matrix is sufficiently sparse.

This is of direct relevance for us, as (i) the sparsity pattern of the power flow Jacobian doesn't change along the Newton iterations and (ii) the Jacobian is very sparse. In ExaPF, it is the linear solver of choice for solving the power flow entirely on the GPU.
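
Concretely, the refactorization workflow looks roughly as follows. This is a sketch only: J_gpu is a hypothetical sparse Jacobian already stored on the GPU, and the lu!/ldiv! calls rely on the LinearAlgebra-style interface mentioned in the next paragraph:

using CUDA, CUSOLVERRF, LinearAlgebra
# J_gpu: hypothetical GPU sparse matrix whose sparsity pattern stays fixed
rf = CUSOLVERRF.RFLU(J_gpu)        # symbolic analysis on the CPU + first factorization
# ... update the nonzero values of J_gpu in-place (same sparsity pattern) ...
lu!(rf, J_gpu)                     # numerical refactorization, entirely on the GPU
b = CUDA.rand(Float64, size(J_gpu, 1))
x = similar(b)
ldiv!(x, rf, b)                    # triangular solves on the GPU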

CUSOLVERRF.jl follows LinearAlgebra's interface, so we can use it directly in ExaPF. We first have to instantiate everything on the GPU:

using CUDA
 polar_gpu = ExaPF.load_polar("case9241pegase.m", CUDABackend())
 stack_gpu = ExaPF.NetworkStack(polar_gpu)
@@ -34,7 +34,7 @@
 rf_solver = LS.DirectSolver(rf_fac)
ExaPF.LinearSolvers.DirectSolver{CUSOLVERRF.RFLU{Float64, CUSOLVERRF.CuSparseSVDeprecated, CUSOLVERRF.CuSparseSMDeprecated}}(CUSOLVERRF.RFLU{Float64, CUSOLVERRF.CuSparseSVDeprecated, CUSOLVERRF.CuSparseSMDeprecated}
 )

Then, we are able to solve the power flow entirely on the GPU, simply as

ExaPF.nlsolve!(pf_solver, jx_gpu, stack_gpu; linear_solver=rf_solver)
Power flow has converged: true
   * #iterations: 7
-  * Time Jacobian (s) ........: 1.4186
-  * Time linear solver (s) ...: 0.0851
-  * Time total (s) ...........: 1.5054
-
+  * Time Jacobian (s) ........: 1.4249
+  * Time linear solver (s) ...: 0.0827
+  * Time total (s) ...........: 1.5092
+