diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index a19c0704..85498797 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.9.4","generation_timestamp":"2023-12-05T12:53:48","documenter_version":"1.2.1"}} \ No newline at end of file +{"documenter":{"julia_version":"1.9.4","generation_timestamp":"2023-12-07T01:28:33","documenter_version":"1.2.1"}} \ No newline at end of file diff --git a/dev/artifact/index.html b/dev/artifact/index.html index d0ec891b..b18944d9 100644 --- a/dev/artifact/index.html +++ b/dev/artifact/index.html @@ -1,2 +1,2 @@ -ExaData Artifact · ExaPF.jl

ExaData Artifact

The ExaData artifact contains test cases relevant to the Exascale Computing Project. It is built from the git repository available at ExaData. Apart from the standard MATPOWER files it additionally contains demand scenarios and contingencies used in multiperiod security constrained optimal power flow settings.

+ExaData Artifact · ExaPF.jl

ExaData Artifact

The ExaData artifact contains test cases relevant to the Exascale Computing Project. It is built from the git repository available at ExaData. Apart from the standard MATPOWER files it additionally contains demand scenarios and contingencies used in multiperiod security constrained optimal power flow settings.

diff --git a/dev/index.html b/dev/index.html index 76fcb0f8..61b7783b 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,2 +1,2 @@ -Home · ExaPF.jl

ExaPF

ExaPF.jl is a package to solve the power flow problem on upcoming exascale architectures. The code has been designed to be:

  1. Portable: Targeting exascale architectures implies a focus on graphics processing units (GPUs) as these systems lack substantial computational performance through classical CPUs.
  2. Differentiable: All the expressions implemented in ExaPF are fully compatible with ForwardDiff.jl, and routines are provided to extract first- and second-order derivatives to solve efficiently power flow and optimal power flow problems.

ExaPF implements a vectorized modeler for power systems, which allows to manipulate basic expressions. All expressions are fully differentiable : their first and second-order derivatives can be extracted efficiently using automatic differentiation. In addition, we provide extensions that leverage the packages CUDA.jl, [AMDGPU.jl]((https://github.com/JuliaGPU/AMDGPU.jl), and KernelAbstractions.jl to make ExaPF portable across GPU architectures.

Table of contents

Manual

Library

Artifact

Funding

This research was supported by the Exascale Computing Project (17-SC-20-SC), a joint project of the U.S. Department of Energy’s Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation’s exascale computing imperative.

+Home · ExaPF.jl

ExaPF

ExaPF.jl is a package to solve the power flow problem on upcoming exascale architectures. The code has been designed to be:

  1. Portable: Targeting exascale architectures implies a focus on graphics processing units (GPUs) as these systems lack substantial computational performance through classical CPUs.
  2. Differentiable: All the expressions implemented in ExaPF are fully compatible with ForwardDiff.jl, and routines are provided to extract first- and second-order derivatives to solve efficiently power flow and optimal power flow problems.

ExaPF implements a vectorized modeler for power systems, which allows to manipulate basic expressions. All expressions are fully differentiable : their first and second-order derivatives can be extracted efficiently using automatic differentiation. In addition, we provide extensions that leverage the packages CUDA.jl, [AMDGPU.jl]((https://github.com/JuliaGPU/AMDGPU.jl), and KernelAbstractions.jl to make ExaPF portable across GPU architectures.

Table of contents

Manual

Library

Artifact

Funding

This research was supported by the Exascale Computing Project (17-SC-20-SC), a joint project of the U.S. Department of Energy’s Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation’s exascale computing imperative.

diff --git a/dev/lib/autodiff/index.html b/dev/lib/autodiff/index.html index 4c54777f..6b762e4e 100644 --- a/dev/lib/autodiff/index.html +++ b/dev/lib/autodiff/index.html @@ -1,13 +1,13 @@ -AutoDiff · ExaPF.jl

AutoDiff

Variables

Expressions

ExaPF.AutoDiff.AbstractExpressionType
AbstractExpression

Abstract type for differentiable function $f(x)$. Any AbstractExpression implements two functions: a forward mode to evaluate $f(x)$, and an adjoint to evaluate $∂f(x)$.

Forward mode

The direct evaluation of the function $f$ is implemented as

(expr::AbstractExpression)(output::VT, stack::AbstractStack{VT}) where VT<:AbstractArray
+AutoDiff · ExaPF.jl

AutoDiff

Variables

Expressions

ExaPF.AutoDiff.AbstractExpressionType
AbstractExpression

Abstract type for differentiable function $f(x)$. Any AbstractExpression implements two functions: a forward mode to evaluate $f(x)$, and an adjoint to evaluate $∂f(x)$.

Forward mode

The direct evaluation of the function $f$ is implemented as

(expr::AbstractExpression)(output::VT, stack::AbstractStack{VT}) where VT<:AbstractArray
 

the input being specified in stack, the results being stored in the array output.

Reverse mode

The adjoint of the function is specified by the function adjoint!, with the signature:

adjoint!(expr::AbstractExpression, ∂stack::AbstractStack{VT}, stack::AbstractStack{VT}, ̄v::VT) where VT<:AbstractArray
-

The variable stack stores the result of the direct evaluation, and is not modified in adjoint!. The results are stored inside the adjoint stack ∂stack.

source
ExaPF.AutoDiff.adjoint!Function
adjoint!(expr::AbstractExpression, ∂stack::AbstractStack{VT}, stack::AbstractStack{VT}, ̄v::VT) where VT<:AbstractArray

Compute the adjoint of the AbstractExpression expr with relation to the adjoint vector ̄v. The results are stored in the adjoint stack ∂stack. The variable stack stores the result of a previous direct evaluation, and is not modified in adjoint!.

source

First and second-order derivatives

Utils

ExaPF.AutoDiff.seed!Function
seed!(
+

The variable stack stores the result of the direct evaluation, and is not modified in adjoint!. The results are stored inside the adjoint stack ∂stack.

source
ExaPF.AutoDiff.adjoint!Function
adjoint!(expr::AbstractExpression, ∂stack::AbstractStack{VT}, stack::AbstractStack{VT}, ̄v::VT) where VT<:AbstractArray

Compute the adjoint of the AbstractExpression expr with relation to the adjoint vector ̄v. The results are stored in the adjoint stack ∂stack. The variable stack stores the result of a previous direct evaluation, and is not modified in adjoint!.

source

First and second-order derivatives

Utils

ExaPF.AutoDiff.seed!Function
seed!(
     H::AbstractHessianProd,
     v::AbstractVector{T},
-) where {T}

Seed the duals with v to compute the Hessian vector product $λ^⊤ H v$.

source
ExaPF.AutoDiff.seed_coloring!Function
seed_coloring!(
     M::Union{AbstractJacobian, AbstractFullHessian}
     coloring::AbstractVector,
-)

Seed the duals with the coloring based seeds to compute the Jacobian or Hessian $M$.

source
ExaPF.AutoDiff.partials!Function
partials!(jac::AbstractJacobian)

Extract partials from Jacobian jac in jac.J.

source
partials!(hess::AbstractFullHessian)

Extract partials from Hessian hess into hess.H.

source
ExaPF.AutoDiff.partials!Function
partials!(jac::AbstractJacobian)

Extract partials from Jacobian jac in jac.J.

source
partials!(hess::AbstractFullHessian)

Extract partials from Hessian hess into hess.H.

source
ExaPF.AutoDiff.set_value!Function
set_value!(
     jac,
     primals::AbstractVector{T}
-) where {T}

Set values of ForwardDiff.Dual numbers in jac to primals.

source
+) where {T}

Set values of ForwardDiff.Dual numbers in jac to primals.

source
diff --git a/dev/lib/formulations/index.html b/dev/lib/formulations/index.html index f8061aae..e5519df4 100644 --- a/dev/lib/formulations/index.html +++ b/dev/lib/formulations/index.html @@ -1,5 +1,5 @@ -Polar formulation · ExaPF.jl

Polar formulation

Generic templates

ExaPF.StateType
State <: AbstractVariable

All variables $x$ depending on the variables Control $u$ through the non-linear equation $g(x, u) = 0$.

source
ExaPF.ControlType
Control <: AbstractVariable

Independent variables $u$ used in the reduced-space formulation.

source

Structure and variables

ExaPF.PolarFormType
PolarForm{T, IT, VT, MT} <: AbstractPolarFormulation

Implement the polar formulation associated to the network's equations.

Wrap a PS.PowerNetwork network to load the data on the target device (CPU() and CUDABackend() are currently supported).

Example

julia> const PS = ExaPF.PowerSystem;
+Polar formulation · ExaPF.jl

Polar formulation

Generic templates

ExaPF.StateType
State <: AbstractVariable

All variables $x$ depending on the variables Control $u$ through the non-linear equation $g(x, u) = 0$.

source
ExaPF.ControlType
Control <: AbstractVariable

Independent variables $u$ used in the reduced-space formulation.

source

Structure and variables

ExaPF.PolarFormType
PolarForm{T, IT, VT, MT} <: AbstractPolarFormulation

Implement the polar formulation associated to the network's equations.

Wrap a PS.PowerNetwork network to load the data on the target device (CPU() and CUDABackend() are currently supported).

Example

julia> const PS = ExaPF.PowerSystem;
 
 julia> network_data = PS.load_case("case9.m");
 
@@ -12,7 +12,7 @@
 giving a mathematical formulation with:
     #controls:   5
     #states  :   14
-
source
ExaPF.BlockPolarFormType
BlockPolarForm{T, IT, VT, MT} <: AbstractFormulation

Block polar formulation: duplicates k different polar models to evaluate them in parallel.

source
ExaPF.load_polarFunction
load_polar(case, device=CPU(); dir=PS.EXADATA)

Load a PolarForm instance from the specified benchmark library dir on the target device (default is CPU). ExaPF uses two different benchmark libraries: MATPOWER (dir=EXADATA) and PGLIB-OPF (dir=PGLIB).

Examples

julia> polar = ExaPF.load_polar("case9")
+
source
ExaPF.BlockPolarFormType
BlockPolarForm{T, IT, VT, MT} <: AbstractFormulation

Block polar formulation: duplicates k different polar models to evaluate them in parallel.

source
ExaPF.load_polarFunction
load_polar(case, device=CPU(); dir=PS.EXADATA)

Load a PolarForm instance from the specified benchmark library dir on the target device (default is CPU). ExaPF uses two different benchmark libraries: MATPOWER (dir=EXADATA) and PGLIB-OPF (dir=PGLIB).

Examples

julia> polar = ExaPF.load_polar("case9")
 Polar formulation (instantiated on device CPU(false))
 Network characteristics:
     #buses:      9  (#slack: 1  #PV: 2  #PQ: 6)
@@ -21,7 +21,7 @@
 giving a mathematical formulation with:
     #controls:   5
     #states  :   14
-
source
ExaPF.NetworkStackType
NetworkStack{VT,VD,MT} <: AbstractNetworkStack{VT}
 NetworkStack(polar::PolarForm)
 NetworkStack(nbus::Int, ngen::Int, nlines::Int, VT::Type)

Store the variables associated to the polar formulation. The variables are stored in the field input, ordered as follows

    input = [vmag ; vang ; pgen]

The object stores also intermediate variables needed in the expression tree, such as the LKMR basis ψ.

Notes

The NetworkStack can be instantiated on the host or on the target device.

Examples

julia> polar = ExaPF.load_polar("case9");
 
@@ -39,7 +39,7 @@
  1.0
  1.0
  1.0
-
source

The state and the control are defined as mapping:

ExaPF.mappingFunction
mapping(polar::PolarForm, ::Control)

Return the mapping associated to the Control() in NetworkStack according to the polar formulation PolarForm.

Examples

julia> polar = ExaPF.load_polar("case9");
+
source

The state and the control are defined as mapping:

ExaPF.mappingFunction
mapping(polar::PolarForm, ::Control)

Return the mapping associated to the Control() in NetworkStack according to the polar formulation PolarForm.

Examples

julia> polar = ExaPF.load_polar("case9");
 
 julia> mapu = ExaPF.mapping(polar, Control())
 5-element Vector{Int64}:
@@ -48,7 +48,7 @@
   3
  20
  21
-
source
mapping(polar::PolarForm, ::State)

Return the mapping associated to the State() in NetworkStack according to the polar formulation PolarForm.

Examples

julia> polar = ExaPF.load_polar("case9");
+
source
mapping(polar::PolarForm, ::State)

Return the mapping associated to the State() in NetworkStack according to the polar formulation PolarForm.

Examples

julia> polar = ExaPF.load_polar("case9");
 
 julia> mapu = ExaPF.mapping(polar, State())
 14-element Vector{Int64}:
@@ -66,7 +66,7 @@
   7
   8
   9
-
source

Powerflow solver

Powerflow solver

ExaPF.run_pfFunction
run_pf(
     polar::PolarForm, stack::NetworkStack;
     rtol=1e-8, max_iter=20, verbose=0,
 )

Solve the power flow equations $g(x, u) = 0$ w.r.t. the stack $x$, using the (NewtonRaphson algorithm. The initial state $x$ is specified implicitly inside stack, with the mapping mapping associated to the polar formulation. The object stack is modified inplace in the function.

The algorithm stops when a tolerance rtol or a maximum number of iterations maxiter is reached.

Arguments

  • polar::AbstractFormulation: formulation of the power flow equation
  • stack::NetworkStack: initial values in the network

Examples

julia> polar = ExaPF.load_polar("case9");
@@ -79,7 +79,7 @@
 true
 
 julia> conv.n_iterations
-4
source
ExaPF.nlsolve!Function
nlsolve!(
     algo::NewtonRaphson,
     jac::Jacobian,
     stack::NetworkStack;
@@ -100,7 +100,7 @@
 true
 
 julia> conv.n_iterations
-4
source
ExaPF.NewtonRaphsonType
NewtonRaphson <: AbstractNonLinearSolver

Newton-Raphson algorithm.

Attributes

  • maxiter::Int (default 20): maximum number of iterations
  • tol::Float64 (default 1e-8): tolerance of the algorithm
  • verbose::Int (default 0): verbosity level
source

Constraints

The different parts of the polar formulation are implemented in the following AbstractExpression:

ExaPF.NewtonRaphsonType
NewtonRaphson <: AbstractNonLinearSolver

Newton-Raphson algorithm.

Attributes

  • maxiter::Int (default 20): maximum number of iterations
  • tol::Float64 (default 1e-8): tolerance of the algorithm
  • verbose::Int (default 0): verbosity level
source

Constraints

The different parts of the polar formulation are implemented in the following AbstractExpression:

ExaPF.PolarBasisType
PolarBasis{VI, MT} <: AbstractExpression
 PolarBasis(polar::AbstractPolarFormulation)

Implement the LKMR nonlinear basis. Takes as input the voltage magnitudes vmag and the voltage angles vang and returns

\[ \begin{aligned} & \psi_\ell^C(v, \theta) = v^f v^t \cos(\theta_f - \theta_t) \quad \forall \ell = 1, \cdots, n_\ell \\ & \psi_\ell^S(v, \theta) = v^f v^t \sin(\theta_f - \theta_t) \quad \forall \ell = 1, \cdots, n_\ell \\ @@ -134,7 +134,7 @@ 1.0 1.0 1.0 -

source
ExaPF.PowerFlowBalanceType
PowerFlowBalance{VT, MT}
 PowerFlowBalance(polar)

Implement a subset of the power injection corresponding to $(p_{inj}^{pv}, p_{inj}^{pq}, q_{inj}^{pq})$. The function encodes the active balance equations at PV and PQ nodes, and the reactive balance equations at PQ nodes:

\[\begin{aligned} p_i &= v_i \sum_{j}^{n} v_j (g_{ij}\cos{(\theta_i - \theta_j)} + b_{ij}\sin{(\theta_i - \theta_j})) \,, & ∀ i ∈ \{PV, PQ\} \\ @@ -167,7 +167,7 @@ julia> isapprox(powerflow(stack), zeros(14); atol=1e-8) true -

source
ExaPF.VoltageMagnitudeBoundsType
VoltageMagnitudeBounds

Implement the bounds on voltage magnitudes not taken into account in the bound constraints. In the reduced space, this is associated to the the voltage magnitudes at PQ nodes:

\[v_{pq}^♭ ≤ v_{pq} ≤ v_{pq}^♯ .\]

Dimension: n_pq

Complexity

1 copyto

Note

In the reduced space, the constraints on the voltage magnitudes at PV nodes $v_{pv}$ are taken into account when bounding the control $u$.

Examples

julia> polar = ExaPF.load_polar("case9");
+
source
ExaPF.VoltageMagnitudeBoundsType
VoltageMagnitudeBounds

Implement the bounds on voltage magnitudes not taken into account in the bound constraints. In the reduced space, this is associated to the the voltage magnitudes at PQ nodes:

\[v_{pq}^♭ ≤ v_{pq} ≤ v_{pq}^♯ .\]

Dimension: n_pq

Complexity

1 copyto

Note

In the reduced space, the constraints on the voltage magnitudes at PV nodes $v_{pv}$ are taken into account when bounding the control $u$.

Examples

julia> polar = ExaPF.load_polar("case9");
 
 julia> stack = ExaPF.NetworkStack(polar);
 
@@ -181,7 +181,7 @@
  1.0
  1.0
  1.0
-
source
ExaPF.PowerGenerationBoundsType
PowerGenerationBounds{VT, MT}
 PowerGenerationBounds(polar)

Constraints on the active power productions and on the reactive power productions that are not already taken into account in the bound constraints. In the reduced space, that amounts to

\[p_{g,ref}^♭ ≤ p_{g,ref} ≤ p_{g,ref}^♯ ; C_g q_g^♭ ≤ C_g q_g ≤ C_g q_g^♯ .\]

Require composition with PolarBasis.

Dimension: n_pv + 2 n_ref

Complexity

1 copyto, 1 SpMV

Examples

julia> polar = ExaPF.load_polar("case9");
 
@@ -197,7 +197,7 @@
   0.24069
   0.144601
  -0.03649
-
source
ExaPF.LineFlowsType
LineFlows{VT, MT}
 LineFlows(polar)

Implement thermal limit constraints on the lines of the network.

Require composition with PolarBasis.

Dimension: 2 * n_lines

Complexity

4 SpMV, 4 * n_lines quadratic, 2 * n_lines add

Examples

julia> polar = ExaPF.load_polar("case9");
 
 julia> stack = ExaPF.NetworkStack(polar);
@@ -226,7 +226,7 @@
  2.67781
  0.726668
  0.215497
-
source

Objective

The production costs is given in the AbstractExpression CostFunction:

Objective

The production costs is given in the AbstractExpression CostFunction:

ExaPF.CostFunctionType
CostFunction{VT, MT} <: AutoDiff.AbstractExpression
 CostFunction(polar)

Implement the quadratic cost function for OPF

\[ ∑_{g=1}^{n_g} c_{2,g} p_g^2 + c_{1,g} p_g + c_{0,g}\]

Require composition with PolarBasis to evaluate the cost of the reference generator.

Dimension: 1

Complexity

1 SpMV, 1 sum

Examples

julia> polar = ExaPF.load_polar("case9");
 
 julia> stack = ExaPF.NetworkStack(polar);
@@ -236,5 +236,5 @@
 julia> cost(stack)
 1-element Vector{Float64}:
  4509.0275
-
source

Composition of expressions

The different expressions can be combined together in several different ways.

ExaPF.MultiExpressionsType
MultiExpressions <: AbstractExpression

Implement expressions concatenation. Takes as input a vector of expressions [expr1,...,exprN] and concatenate them in a single expression mexpr, such that

    mexpr(x) = [expr1(x) ; expr2(x) ; ... ; exprN(x)]
-
source
ExaPF.ComposedExpressionsType
ComposedExpressions{Expr1<:PolarBasis, Expr2} <: AbstractExpression

Implement expression composition. Takes as input two expressions expr1 and expr2 and returns a composed expression cexpr such that ``` cexpr(x) = expr2 ∘ expr1(x)

Notes

Currently, only PolarBasis is supported for expr1.

source
+source

Composition of expressions

The different expressions can be combined together in several different ways.

ExaPF.MultiExpressionsType
MultiExpressions <: AbstractExpression

Implement expressions concatenation. Takes as input a vector of expressions [expr1,...,exprN] and concatenate them in a single expression mexpr, such that

    mexpr(x) = [expr1(x) ; expr2(x) ; ... ; exprN(x)]
+
source
ExaPF.ComposedExpressionsType
ComposedExpressions{Expr1<:PolarBasis, Expr2} <: AbstractExpression

Implement expression composition. Takes as input two expressions expr1 and expr2 and returns a composed expression cexpr such that ``` cexpr(x) = expr2 ∘ expr1(x)

Notes

Currently, only PolarBasis is supported for expr1.

source
diff --git a/dev/lib/linearsolver/index.html b/dev/lib/linearsolver/index.html index 456d6b6d..2a304a37 100644 --- a/dev/lib/linearsolver/index.html +++ b/dev/lib/linearsolver/index.html @@ -1,10 +1,10 @@ Linear Solvers · ExaPF.jl

Linear solvers

Description

ExaPF allows to solve linear systems with either direct and indirect linear algebra, both on CPU and on GPU. To solve a linear system $Ax = b$, ExaPF uses the function ldiv!.

ExaPF.LinearSolvers.ldiv!Function
ldiv!(solver, x, A, y)
-ldiv!(solver, x, y)
  • solver::AbstractLinearSolver: linear solver to solve the system
  • x::AbstractVector: Solution
  • A::AbstractMatrix: Input matrix
  • y::AbstractVector: RHS

Solve the linear system $A x = y$ using the algorithm specified in solver. If A is not specified, the function will used directly the factorization stored inside solver.

source

Direct solvers

ExaPF wraps UMFPACK (shipped with SuiteSparse.jl) on the CPU, and CUSPARSE on CUDA device.

ExaPF.LinearSolvers.DirectSolverType
DirectSolver <: AbstractLinearSolver

Solve linear system $A x = y$ with direct linear algebra.

  • On the CPU, DirectSolver uses UMFPACK to solve the linear system
  • On CUDA GPU, DirectSolver redirects the resolution to the method CUSOLVER.csrlsvqr
source

Iterative solvers

ExaPF.LinearSolvers.BicgstabType
Bicgstab <: AbstractIterativeLinearSolver
-Bicgstab(precond; verbose=0, rtol=1e-10, atol=1e-10)

Wrap Krylov.jl's BICGSTAB algorithm to solve iteratively the linear system $A x = y$.

source
ExaPF.LinearSolvers.DqgmresType
Dqgmres <: AbstractIterativeLinearSolver
-Dqgmres(precond; verbose=false, memory=4)

Wrap Krylov.jl's Dqgmres algorithm to solve iteratively the linear system $A x = y$.

source

Available linear solvers can be queried with

A default solver is provided for each vendor backend.

Block-Krylov solvers

ExaPF.jl provides an implementation of block-GMRES to solve linear systems with multiple righ-hand sides.

ExaPF.LinearSolvers.BlockGmresSolverType

Type for storing the vectors required by the in-place version of BLOCK-GMRES.

The outer constructors

solver = BlockGmresSolver(m, n, p, memory, SV, SM)
-solver = BlockGmresSolver(A, B; memory=5)

may be used in order to create these vectors. memory is set to div(n,p) if the value given is larger than div(n,p).

source
ExaPF.LinearSolvers.block_gmresFunction
(X, stats) = block_gmres(A, B; X0::AbstractMatrix{FC}; memory::Int=20, M=I, N=I,
+ldiv!(solver, x, y)
  • solver::AbstractLinearSolver: linear solver to solve the system
  • x::AbstractVector: Solution
  • A::AbstractMatrix: Input matrix
  • y::AbstractVector: RHS

Solve the linear system $A x = y$ using the algorithm specified in solver. If A is not specified, the function will used directly the factorization stored inside solver.

source

Direct solvers

ExaPF wraps UMFPACK (shipped with SuiteSparse.jl) on the CPU, and CUSPARSE on CUDA device.

ExaPF.LinearSolvers.DirectSolverType
DirectSolver <: AbstractLinearSolver

Solve linear system $A x = y$ with direct linear algebra.

  • On the CPU, DirectSolver uses UMFPACK to solve the linear system
  • On CUDA GPU, DirectSolver redirects the resolution to the method CUSOLVER.csrlsvqr
source

Iterative solvers

ExaPF.LinearSolvers.BicgstabType
Bicgstab <: AbstractIterativeLinearSolver
+Bicgstab(precond; verbose=0, rtol=1e-10, atol=1e-10)

Wrap Krylov.jl's BICGSTAB algorithm to solve iteratively the linear system $A x = y$.

source
ExaPF.LinearSolvers.DqgmresType
Dqgmres <: AbstractIterativeLinearSolver
+Dqgmres(precond; verbose=false, memory=4)

Wrap Krylov.jl's Dqgmres algorithm to solve iteratively the linear system $A x = y$.

source

Available linear solvers can be queried with

A default solver is provided for each vendor backend.

Block-Krylov solvers

ExaPF.jl provides an implementation of block-GMRES to solve linear systems with multiple righ-hand sides.

ExaPF.LinearSolvers.BlockGmresSolverType

Type for storing the vectors required by the in-place version of BLOCK-GMRES.

The outer constructors

solver = BlockGmresSolver(m, n, p, memory, SV, SM)
+solver = BlockGmresSolver(A, B; memory=5)

may be used in order to create these vectors. memory is set to div(n,p) if the value given is larger than div(n,p).

source
ExaPF.LinearSolvers.block_gmresFunction
(X, stats) = block_gmres(A, B; X0::AbstractMatrix{FC}; memory::Int=20, M=I, N=I,
                          ldiv::Bool=false, restart::Bool=false, reorthogonalization::Bool=false,
                          atol::T = √eps(T), rtol::T=√eps(T), itmax::Int=0,
-                         timemax::Float64=Inf, verbose::Int=0, history::Bool=false)

T is an AbstractFloat such as Float32, Float64 or BigFloat. FC is T or Complex{T}.

(X, stats) = block_gmres(A, B, X0::AbstractVector; kwargs...)

GMRES can be warm-started from an initial guess X0 where kwargs are the same keyword arguments as above.

Solve the linear system AX = B of size n with p right-hand sides using block-GMRES.

Input arguments

  • A: a linear operator that models a matrix of dimension n;
  • B: a matrix of size n × p.

Optional argument

  • X0: a matrix of size n × p that represents an initial guess of the solution X.

Keyword arguments

  • memory: if restart = true, the restarted version block-GMRES(k) is used with k = memory. If restart = false, the parameter memory should be used as a hint of the number of iterations to limit dynamic memory allocations. Additional storage will be allocated if the number of iterations exceeds memory;
  • M: linear operator that models a nonsingular matrix of size n used for left preconditioning;
  • N: linear operator that models a nonsingular matrix of size n used for right preconditioning;
  • ldiv: define whether the preconditioners use ldiv! or mul!;
  • restart: restart the method after memory iterations;
  • reorthogonalization: reorthogonalize the new matrices of the block-Krylov basis against all previous matrix;
  • atol: absolute stopping tolerance based on the residual norm;
  • rtol: relative stopping tolerance based on the residual norm;
  • itmax: the maximum number of iterations. If itmax=0, the default number of iterations is set to 2 * div(n,p);
  • timemax: the time limit in seconds;
  • verbose: additional details can be displayed if verbose mode is enabled (verbose > 0). Information will be displayed every verbose iterations;
  • history: collect additional statistics on the run such as residual norms.

Output arguments

  • x: a dense matrix of size n × p;
  • stats: statistics collected on the run in a BlockGmresStats.
source
+ timemax::Float64=Inf, verbose::Int=0, history::Bool=false)

T is an AbstractFloat such as Float32, Float64 or BigFloat. FC is T or Complex{T}.

(X, stats) = block_gmres(A, B, X0::AbstractVector; kwargs...)

GMRES can be warm-started from an initial guess X0 where kwargs are the same keyword arguments as above.

Solve the linear system AX = B of size n with p right-hand sides using block-GMRES.

Input arguments

Optional argument

Keyword arguments

Output arguments

source
ExaPF.LinearSolvers.block_gmres!Function
solver = block_gmres!(solver::BlockGmresSolver, B; kwargs...)
+solver = block_gmres!(solver::BlockGmresSolver, B, X0; kwargs...)

where kwargs are keyword arguments of block_gmres.

See BlockGmresSolver for more details about the solver.

source
diff --git a/dev/lib/powersystem/index.html b/dev/lib/powersystem/index.html index 6fceead3..20376948 100644 --- a/dev/lib/powersystem/index.html +++ b/dev/lib/powersystem/index.html @@ -1,5 +1,5 @@ -PowerSystem · ExaPF.jl

PowerSystem

Description

ExaPF.PowerSystem.AbstractPowerSystemType
AbstractPowerSystem

First layer of the package. Store the topology of a given transmission network, including:

  • the power injection at each bus ;
  • the admittance matrix ;
  • the default voltage at each bus.

Data are imported either from a matpower file, or a PSSE file.

source
ExaPF.PowerSystem.PowerNetworkType
PowerNetwork <: AbstractPowerSystem

This structure contains constant parameters that define the topology and physics of the power network.

The object PowerNetwork uses its own contiguous indexing for the buses. The indexing is independent from those specified in the Matpower or the PSSE input file. However, a correspondence between the two indexing (Input indexing to PowerNetwork indexing) is stored inside the attribute bus_to_indexes.

Note

The object PowerNetwork is created in the host memory. Use a AbstractFormulation object to move data to the target device.

source
ExaPF.PowerSystem.load_caseFunction
load_case(case_name, lib::PowerNetworkLibrary=EXADATA)

Convenient function to load a PowerNetwork instance from one of the benchmark library (dir=EXADATA for MATPOWER, dir=PGLIB for PGLIB-OPF). Default library is lib=EXADATA.

Examples

julia> net = PS.load_case("case118") # default is MATPOWER
+PowerSystem · ExaPF.jl

PowerSystem

Description

ExaPF.PowerSystem.AbstractPowerSystemType
AbstractPowerSystem

First layer of the package. Store the topology of a given transmission network, including:

  • the power injection at each bus ;
  • the admittance matrix ;
  • the default voltage at each bus.

Data are imported either from a matpower file, or a PSSE file.

source
ExaPF.PowerSystem.PowerNetworkType
PowerNetwork <: AbstractPowerSystem

This structure contains constant parameters that define the topology and physics of the power network.

The object PowerNetwork uses its own contiguous indexing for the buses. The indexing is independent from those specified in the Matpower or the PSSE input file. However, a correspondence between the two indexing (Input indexing to PowerNetwork indexing) is stored inside the attribute bus_to_indexes.

Note

The object PowerNetwork is created in the host memory. Use a AbstractFormulation object to move data to the target device.

source
ExaPF.PowerSystem.load_caseFunction
load_case(case_name, lib::PowerNetworkLibrary=EXADATA)

Convenient function to load a PowerNetwork instance from one of the benchmark library (dir=EXADATA for MATPOWER, dir=PGLIB for PGLIB-OPF). Default library is lib=EXADATA.

Examples

julia> net = PS.load_case("case118") # default is MATPOWER
 PowerNetwork object with:
     Buses: 118 (Slack: 1. PV: 53. PQ: 64)
     Generators: 54.
@@ -7,6 +7,6 @@
 julia> net = PS.load_case("case1354_pegase", PS.PGLIB)
 PowerNetwork object with:
     Buses: 1354 (Slack: 1. PV: 259. PQ: 1094)
-    Generators: 260.
source

API Reference

Network elements

ExaPF.PowerSystem.AbstractNetworkElementType
AbstractNetworkElement

Abstraction for all physical elements being parts of a AbstractPowerSystem. Elements are divided in

  • transmission lines (Lines)
  • buses (Buses)
  • generators (Generators)
source

List of elements:

Network attributes

List of attributes:

Query the indexing of the different elements in a given network:

Network values

List of values:

Function to get the range of a given value:

ExaPF.PowerSystem.boundsFunction
bounds(pf::AbstractPowerSystem, attr::AbstractNetworkAttribute, val::AbstractNetworkValues)

Return lower and upper bounds corresponding to the admissible values of the AbstractNetworkAttribute attr.

Examples

p_min, p_max = bounds(pf, Generator(), ActivePower())
+    Generators: 260.
source

API Reference

Network elements

ExaPF.PowerSystem.AbstractNetworkElementType
AbstractNetworkElement

Abstraction for all physical elements being parts of a AbstractPowerSystem. Elements are divided in

  • transmission lines (Lines)
  • buses (Buses)
  • generators (Generators)
source

List of elements:

Network attributes

List of attributes:

Query the indexing of the different elements in a given network:

Network values

List of values:

Function to get the range of a given value:

ExaPF.PowerSystem.boundsFunction
bounds(pf::AbstractPowerSystem, attr::AbstractNetworkAttribute, val::AbstractNetworkValues)

Return lower and upper bounds corresponding to the admissible values of the AbstractNetworkAttribute attr.

Examples

p_min, p_max = bounds(pf, Generator(), ActivePower())
 v_min, v_max = bounds(pf, Buses(), VoltageMagnitude())
-
source
+
source
diff --git a/dev/man/autodiff/index.html b/dev/man/autodiff/index.html index 015f3a40..2422ed6f 100644 --- a/dev/man/autodiff/index.html +++ b/dev/man/autodiff/index.html @@ -35,4 +35,4 @@ F[npq + i] += coef_cos * sin_val - coef_sin * cos_val end end -end

These two abstractions are a powerful tool that allow us to implement the forward mode in vectorized form where the number of directions or tangent components of a tangent variable are the number of Jacobian colors. We illustrate this in the figure below with a point-wise vector product x .* y

SIMD AD for point-wise vector product \\label{fig:simd}

This natural way of computing the compressed Jacobian yields a very high performing code that is portable to any vector architecture, given that a similar package like CUDA.jl exists. We note that similar packages for the Intel Compute Engine and AMD ROCm are in development called oneAPI.jl and AMDGPU.jl, respectively. We expect our package to be portable to AMD and Intel GPUs in the future.

+end

These two abstractions are a powerful tool that allow us to implement the forward mode in vectorized form where the number of directions or tangent components of a tangent variable are the number of Jacobian colors. We illustrate this in the figure below with a point-wise vector product x .* y

SIMD AD for point-wise vector product \\label{fig:simd}

This natural way of computing the compressed Jacobian yields a very high performing code that is portable to any vector architecture, given that a similar package like CUDA.jl exists. We note that similar packages for the Intel Compute Engine and AMD ROCm are in development called oneAPI.jl and AMDGPU.jl, respectively. We expect our package to be portable to AMD and Intel GPUs in the future.

diff --git a/dev/man/benchmark/index.html b/dev/man/benchmark/index.html index 14f62556..8f9d0a4f 100644 --- a/dev/man/benchmark/index.html +++ b/dev/man/benchmark/index.html @@ -2,4 +2,4 @@ Benchmark · ExaPF.jl

Benchmark

For the purpose of performance regression testing, ExaPF provides a lightweight benchmark script. It allows to test the various configurations for the linear solvers used in the Newton-Raphson algorithm, and run them on a specific hardware. The main julia script benchmark/benchmarks.jl takes all its options from the command line. The benchmark script takes as input a linear solver (e.g. Bicgstab), a target architecture as a KernelAbstractions object (CPU or CUDABackend), and a case filename which is included in the ExaData artifact. An exhaustive list of all available linear solvers can be obtained via ExaPF.LinearSolvers.list_solvers.

Running

julia --project benchmark/benchmarks.jl Bicgstab CUDABackend case300.m

yields

Bicgstab, CUDABackend, case300.m,  69.0,  3.57,  43.7, true

The first three fields are the settings of the benchmark run. They are followed by three timings in milliseconds:

  1. the time taken by the Newton-Raphson algorithm to solve the power flow,
  2. the timings for the Jacobian accumulation using AutoDiff,
  3. and the time for the linear solver, including the preconditioner.

To acquire these timings the code is run three times to avoid any precompilation effects. The last field confirms the Newton-Raphson convergence. In case more verbose output is desired, one has to manually set the verbosity in benchmark/benchmarks.jl by changing

powerflow_solver = NewtonRaphson(tol=ntol)

to one of the following options:

powerflow_solver = NewtonRaphson(tol=ntol, verbose=VERBOSE_LEVEL_NONE)
 powerflow_solver = NewtonRaphson(tol=ntol, verbose=VERBOSE_LEVEL_LOW)
 powerflow_solver = NewtonRaphson(tol=ntol, verbose=VERBOSE_LEVEL_MEDIUM)
-powerflow_solver = NewtonRaphson(tol=ntol, verbose=VERBOSE_LEVEL_HIGH)

A shell script benchmark/benchmarks.sh is provided to gather timings with various canonical configurations and storing them in a file cpu_REV.log and gpu_REF.log, where REV is the sha1 hash of the current checked out ExaPF version.

+powerflow_solver = NewtonRaphson(tol=ntol, verbose=VERBOSE_LEVEL_HIGH)

A shell script benchmark/benchmarks.sh is provided to gather timings with various canonical configurations and storing them in a file cpu_REV.log and gpu_REF.log, where REV is the sha1 hash of the current checked out ExaPF version.

diff --git a/dev/man/formulations/index.html b/dev/man/formulations/index.html index 50262fc4..06925128 100644 --- a/dev/man/formulations/index.html +++ b/dev/man/formulations/index.html @@ -173,4 +173,4 @@ 0.0 0.02340899999999987 0.007743999999999858 - + diff --git a/dev/man/linearsolver/index.html b/dev/man/linearsolver/index.html index 3c61a932..f950e1ee 100644 --- a/dev/man/linearsolver/index.html +++ b/dev/man/linearsolver/index.html @@ -1,3 +1,3 @@ Linear Solvers · ExaPF.jl

Linear Solver

Overview

As mentioned before, a linear solver is required to compute the Newton step in

dx .= jacobian(x)\f(x)

Our package supports the following linear solvers:

  • cuSOLVER with csrlsvqr (GPU),
  • Krylov.jl with dqgmres and bicgstab (CPU/GPU),
  • UMFPACK through the default Julia \ operator (CPU),
  • generic BiCGSTAB implementation [Vorst1992] (CPU/GPU),
  • or any linear solver wrapped in LinearAlgebra.

Preconditioning

Using only an iterative solver leads to divergence and bad performance due to ill-conditioning of the Jacobian. This is a known phenomenon in power systems. That's why this package comes with a block Jacobi preconditioner that is tailored towards GPUs and is proven to work well with power flow problems.

The block-Jacobi preconditioner used in ExaPF has been added to KrylovPreconditioners.jl

The Jacobian is partitioned into a dense block diagonal structure using Metis.jl, where each block is inverted to build our preconditioner P.

Dense block Jacobi preconditioner \\label{fig:preconditioner}

Compared to incomplete Cholesky and incomplete LU this preconditioner is easily portable to the GPU if the number of blocks is high enough. ExaPF.jl uses the batch BLAS calls from cuBLAS to invert the single blocks.

CUDA.@sync pivot, info = CUDA.CUBLAS.getrf_batched!(blocks, true)
-CUDA.@sync pivot, info, p.cuJs = CUDA.CUBLAS.getri_batched(blocks, pivot)

Assuming that other vendors will provide such batched BLAS APIs, this code is portable to other GPU architectures.

  • Vorst1992Vorst, H. A. van der. 1992. “Bi-Cgstab: A Fast and Smoothly Converging Variant of Bi-Cg for the Solution of Nonsymmetric Linear Systems.”SIAM Journal on Scientific and Statistical Computing 13 (2): 631–44
+CUDA.@sync pivot, info, p.cuJs = CUDA.CUBLAS.getri_batched(blocks, pivot)

Assuming that other vendors will provide such batched BLAS APIs, this code is portable to other GPU architectures.

diff --git a/dev/man/powersystem/index.html b/dev/man/powersystem/index.html index cec6758c..737aa0cc 100644 --- a/dev/man/powersystem/index.html +++ b/dev/man/powersystem/index.html @@ -16,4 +16,4 @@ 2 julia> PS.get(ps, PS.NumberOfSlackBuses()) -1 +1 diff --git a/dev/quickstart/index.html b/dev/quickstart/index.html index 15399cac..bce4f6dc 100644 --- a/dev/quickstart/index.html +++ b/dev/quickstart/index.html @@ -18,11 +18,11 @@ #it 5: 1.57368e-11 Power flow has converged: true * #iterations: 5 - * Time Jacobian (s) ........: 0.3846 - * Time linear solver (s) ...: 0.0020 - * update (s) ............: 0.0017 - * ldiv (s) ..............: 0.0002 - * Time total (s) ...........: 0.7734

Implicitly, ExaPF has just proceed to the following operations:

This compact syntax allows to solve quickly any powerflow equations in a few lines a code. However, in most case, the user may want more coarse grained control on the different objects manipulated.

Detailed version

In what follows, we detail step by step the detailed procedure to solve the powerflow equations.

How to load a MATPOWER instance as a PowerNetwork object?

We start by importing a MATPOWER instance to a ExaPF.PowerSystem.PowerNetwork object:

julia> pf = PS.PowerNetwork(datafile)PowerNetwork object with:
+  * Time Jacobian (s) ........: 0.8505
+  * Time linear solver (s) ...: 0.0034
+     * update (s) ............: 0.0030
+     * ldiv (s) ..............: 0.0005
+  * Time total (s) ...........: 1.6152

Implicitly, ExaPF has just proceed to the following operations:

This compact syntax allows to solve quickly any powerflow equations in a few lines a code. However, in most case, the user may want more coarse grained control on the different objects manipulated.

Detailed version

In what follows, we detail step by step the detailed procedure to solve the powerflow equations.

How to load a MATPOWER instance as a PowerNetwork object?

We start by importing a MATPOWER instance to a ExaPF.PowerSystem.PowerNetwork object:

julia> pf = PS.PowerNetwork(datafile)PowerNetwork object with:
     Buses: 1354 (Slack: 1. PV: 259. PQ: 1094)
     Generators: 260.

The different fields of the object pf specify the characteristics of the network. For instance, we can retrieve the number of buses or get the indexing of the PV buses with

julia> nbus = PS.get(pf, PS.NumberOfBuses())1354
julia> pv_indexes = pf.pv;

However, a ExaPF.PowerSystem.PowerNetwork object stores only the physical attributes of the network. To choose a given mathematical formulation, we need to pass the object pf to an ExaPF.AbstractFormulation layer. Currently, only the polar formulation is provided with the ExaPF.PolarForm structure. In the future, other formulations (e.g. RectangularForm) may be implemented as well.

How to solve the powerflow equations?

To solve the powerflow equations, we need to choose a given mathematical formulation for the equations of the network. To each formulation corresponds a given state $x$ and control $u$. Using polar representation of the voltage vector, such as $\bm{v} = |v|e^{j \theta}$, each bus $i=1, \cdots, N_B$ must satisfy the power balance equations:

\[\begin{aligned} p_i &= v_i \sum_{j}^{n} v_j (g_{ij}\cos{(\theta_i - \theta_j)} + b_{ij}\sin{(\theta_i - \theta_j})) \,, \\ @@ -63,11 +63,11 @@ #it 5: 1.57368e-11 Power flow has converged: true * #iterations: 5 - * Time Jacobian (s) ........: 0.0044 - * Time linear solver (s) ...: 0.0017 - * update (s) ............: 0.0015 - * ldiv (s) ..............: 0.0002 - * Time total (s) ...........: 0.0064

Here, the algorithm solves the powerflow equations in 5 iterations. The algorithm modifies the values of stack inplace, to avoid any unnecessary memory allocations.

How to deport the computation on the GPU?

Now, how can we deport the resolution on the GPU? The procedure looks exactly the same. It suffices to initiate a new ExaPF.PolarForm object, but on the GPU:

julia> polar_gpu = ExaPF.PolarForm(pf, CUDABackend())Polar formulation (instantiated on device CUDA.CUDAKernels.CUDABackend(false, false))
+  * Time Jacobian (s) ........: 0.0090
+  * Time linear solver (s) ...: 0.0035
+     * update (s) ............: 0.0031
+     * ldiv (s) ..............: 0.0005
+  * Time total (s) ...........: 0.0135

Here, the algorithm solves the powerflow equations in 5 iterations. The algorithm modifies the values of stack inplace, to avoid any unnecessary memory allocations.

How to deport the computation on the GPU?

Now, how can we deport the resolution on the GPU? The procedure looks exactly the same. It suffices to initiate a new ExaPF.PolarForm object, but on the GPU:

julia> polar_gpu = ExaPF.PolarForm(pf, CUDABackend())Polar formulation (instantiated on device CUDA.CUDAKernels.CUDABackend(false, false))
 Network characteristics:
     #buses:      1354  (#slack: 1  #PV: 259  #PQ: 1094)
     #generators: 260
@@ -83,11 +83,11 @@
 #it 5: 9.96622e-12
 Power flow has converged: true
   * #iterations: 5
-  * Time Jacobian (s) ........: 2.3396
-  * Time linear solver (s) ...: 0.2184
+  * Time Jacobian (s) ........: 4.4577
+  * Time linear solver (s) ...: 0.2820
      * update (s) ............: 0.0000
-     * ldiv (s) ..............: 0.2184
-  * Time total (s) ...........: 3.3278

Note that we get exactly the same iterations as when we solve the power flow equations on the CPU.

How to solve the linear system with BICGSTAB?

By default, the algorithm runs with a direct solver, which might be inefficient for large problems. To overcome this issue, ExaPF implements a wrapper for different iterative algorithms (GMRES, BICGSTAB).

The performance of iterative solvers is usually improved if we use a preconditioner. ExaPF.jl implements an overlapping Schwarz preconditioner, tailored for GPU usage. To build an instance with 8 blocks, just write

julia> npartitions = 8;
julia> jac_gpu = jx_gpu.J;
julia> precond = BlockJacobiPreconditioner(jac_gpu, npartitions, CUDABackend());ERROR: UndefVarError: `BlockJacobiPreconditioner` not defined

You can attach the preconditioner to an BICGSTAB algorithm simply as

julia> linear_solver = ExaPF.Bicgstab(jac_gpu; P=precond);ERROR: UndefVarError: `precond` not defined

(this will use the BICGSTAB algorithm implemented in Krylov.jl).

We need to update accordingly the tolerance of the Newton-Raphson algorithm (the iterative solver is less accurate than the direct solver):

julia> pf_algo = NewtonRaphson(; verbose=1, tol=1e-7)NewtonRaphson(20, 1.0e-7, 1)

We reset the variables to their initial values:

julia> ExaPF.init!(polar_gpu, stack_gpu)

Then, solving the power flow with the iterative solvers directly translates to one call to nlsolve!:

julia> convergence = ExaPF.nlsolve!(pf_algo, jx_gpu, stack_gpu; linear_solver=linear_solver)#it 0: 1.15103e+02
+     * ldiv (s) ..............: 0.2820
+  * Time total (s) ...........: 6.1934

Note that we get exactly the same iterations as when we solve the power flow equations on the CPU.

How to solve the linear system with BICGSTAB?

By default, the algorithm runs with a direct solver, which might be inefficient for large problems. To overcome this issue, ExaPF implements a wrapper for different iterative algorithms (GMRES, BICGSTAB).

The performance of iterative solvers is usually improved if we use a preconditioner. ExaPF.jl implements an overlapping Schwarz preconditioner, tailored for GPU usage. To build an instance with 8 blocks, just write

julia> npartitions = 8;
julia> jac_gpu = jx_gpu.J;
julia> precond = BlockJacobiPreconditioner(jac_gpu, npartitions, CUDABackend());ERROR: UndefVarError: `BlockJacobiPreconditioner` not defined

You can attach the preconditioner to an BICGSTAB algorithm simply as

julia> linear_solver = ExaPF.Bicgstab(jac_gpu; P=precond);ERROR: UndefVarError: `precond` not defined

(this will use the BICGSTAB algorithm implemented in Krylov.jl).

We need to update accordingly the tolerance of the Newton-Raphson algorithm (the iterative solver is less accurate than the direct solver):

julia> pf_algo = NewtonRaphson(; verbose=1, tol=1e-7)NewtonRaphson(20, 1.0e-7, 1)

We reset the variables to their initial values:

julia> ExaPF.init!(polar_gpu, stack_gpu)

Then, solving the power flow with the iterative solvers directly translates to one call to nlsolve!:

julia> convergence = ExaPF.nlsolve!(pf_algo, jx_gpu, stack_gpu; linear_solver=linear_solver)#it 0: 1.15103e+02
 #it 1: 1.50328e+01
 #it 2: 5.88242e-01
 #it 3: 4.88493e-03
@@ -95,8 +95,8 @@
 #it 5: 9.96622e-12
 Power flow has converged: true
   * #iterations: 5
-  * Time Jacobian (s) ........: 0.0024
-  * Time linear solver (s) ...: 0.1176
+  * Time Jacobian (s) ........: 0.0044
+  * Time linear solver (s) ...: 0.1485
      * update (s) ............: 0.0000
-     * ldiv (s) ..............: 0.1176
-  * Time total (s) ...........: 0.1215
+ * ldiv (s) ..............: 0.1485 + * Time total (s) ...........: 0.3076 diff --git a/dev/tutorials/batch_evaluation/index.html b/dev/tutorials/batch_evaluation/index.html index 768993ab..59ebcc71 100644 --- a/dev/tutorials/batch_evaluation/index.html +++ b/dev/tutorials/batch_evaluation/index.html @@ -21,26 +21,26 @@ #states : 14

Then, ExaPF can also instantiate a NetworkStack object, with the memory required to store the variables of the different scenarios:

blk_stack = ExaPF.NetworkStack(blk_polar)
210-elements NetworkStack{Vector{Float64}}

We can pass the scenarios manually using the function set_params!:

ploads = rand(nbus, nscen);
 qloads = rand(nbus, nscen);
 ExaPF.set_params!(blk_stack, ploads, qloads)
90-element Vector{Float64}:
- 0.33598725420881626
- 0.9048478597310726
- 0.07356989081238319
- 0.8331935218054993
- 0.17565143887936274
- 0.3553618951599068
- 0.8498403552038375
- 0.8618875424357935
- 0.8675822676996401
- 0.6976813322785124
+ 0.6697538816980037
+ 0.8245218163604555
+ 0.9777324872706514
+ 0.14585475550711802
+ 0.9666785738539263
+ 0.7359206415744382
+ 0.4561648273783737
+ 0.7319330411083398
+ 0.6506180811204872
+ 0.5178087552724356
  ⋮
- 0.5886803522651024
- 0.6875324039766139
- 0.8457821523095654
- 0.6875991910223772
- 0.13424848485931062
- 0.5952959751038129
- 0.006726101052284772
- 0.47702901261077757
- 0.4194973057258936

The structure blk_stack stores $N$ different realizations for the variables stored in the field input (vmag, vang and pgen). By default, the initial values are set according to the values specified in blk_polar (usually defined when importing the data from the instance file):

reshape(blk_stack.vmag, nbus, nscen)
9×10 Matrix{Float64}:
+ 0.7157135266087696
+ 0.21939565333992983
+ 0.9593834083478856
+ 0.41907073351065827
+ 0.8163313102504259
+ 0.23811557333035738
+ 0.7472331162295253
+ 0.8193283425420732
+ 0.36634515621744934

The structure blk_stack stores $N$ different realizations for the variables stored in the field input (vmag, vang and pgen). By default, the initial values are set according to the values specified in blk_polar (usually defined when importing the data from the instance file):

reshape(blk_stack.vmag, nbus, nscen)
9×10 Matrix{Float64}:
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
@@ -50,32 +50,32 @@
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0

Only the parameters are varying according to the scenarios we passed as input in the constructor:

reshape(blk_stack.pload, nbus, nscen)
9×10 Matrix{Float64}:
- 0.2486     0.210302  0.85853    …  0.269604   0.652289    0.443085
- 0.239969   0.387662  0.930848      0.910276   0.632548    0.844093
- 0.69726    0.76205   0.0213634     0.664865   0.038839    0.0769704
- 0.455942   0.923687  0.915852      0.707308   0.94468     0.301472
- 0.25607    0.442681  0.380445      0.0404338  0.324224    0.270543
- 0.821003   0.864514  0.479715   …  0.480175   0.45731     0.659751
- 0.277154   0.780042  0.218433      0.791215   0.00435991  0.891897
- 0.502443   0.356332  0.866679      0.686162   0.902474    0.547917
- 0.0908686  0.628651  0.349603      0.116541   0.778358    0.675234

Evaluate expressions in block

ExaPF takes advantage of the block structure when using a BlockPolarForm.

As an example, suppose we want to evaluate the power flow balances in block form with a PowerFlowBalance expression:

powerflow = ExaPF.PowerFlowBalance(blk_polar) ∘ ExaPF.PolarBasis(blk_polar);
ExaPF.ComposedExpressions{ExaPF.PolarBasis{Vector{Int64}, SparseArrays.SparseMatrixCSC{Float64, Int64}}, ExaPF.PowerFlowBalance{Vector{Float64}, SparseArrays.SparseMatrixCSC{Float64, Int64}}}(PolarBasis (AbstractExpression), PowerFlowBalance (AbstractExpression))

A block evaluation takes as input the NetworkStack blk_stack structure:

m = div(length(powerflow), nscen);
+ 0.958678  0.0903407  0.133899   0.599352  …  0.340964  0.503618   0.0893943
+ 0.468321  0.776612   0.804706   0.361663     0.410795  0.359617   0.420191
+ 0.695002  0.991253   0.986575   0.452901     0.438116  0.835842   0.458455
+ 0.517506  0.218808   0.393388   0.689627     0.749262  0.834904   0.13837
+ 0.375501  0.828902   0.0379287  0.250872     0.835372  0.658289   0.634727
+ 0.539583  0.778415   0.128258   0.552139  …  0.580944  0.497907   0.463098
+ 0.353267  0.329615   0.43767    0.750991     0.170245  0.506544   0.886759
+ 0.984587  0.472418   0.939126   0.907906     0.810077  0.686407   0.109425
+ 0.406538  0.52       0.462897   0.894714     0.931304  0.0276674  0.72468

Evaluate expressions in block

ExaPF takes advantage of the block structure when using a BlockPolarForm.

As an example, suppose we want to evaluate the power flow balances in block form with a PowerFlowBalance expression:

powerflow = ExaPF.PowerFlowBalance(blk_polar) ∘ ExaPF.PolarBasis(blk_polar);
ExaPF.ComposedExpressions{ExaPF.PolarBasis{Vector{Int64}, SparseArrays.SparseMatrixCSC{Float64, Int64}}, ExaPF.PowerFlowBalance{Vector{Float64}, SparseArrays.SparseMatrixCSC{Float64, Int64}}}(PolarBasis (AbstractExpression), PowerFlowBalance (AbstractExpression))

A block evaluation takes as input the NetworkStack blk_stack structure:

m = div(length(powerflow), nscen);
 blk_output = zeros(m * nscen);
 powerflow(blk_output, blk_stack); # inplace evaluation
 reshape(blk_output, m, nscen)
14×10 Matrix{Float64}:
- -1.39003    -1.24234    -0.699152  …  -0.719724   -0.997452    -0.785907
- -0.15274    -0.0879504  -0.828637     -0.185135   -0.811161    -0.77303
-  0.455942    0.923687    0.915852      0.707308    0.94468      0.301472
-  0.25607     0.442681    0.380445      0.0404338   0.324224     0.270543
-  0.821003    0.864514    0.479715      0.480175    0.45731      0.659751
-  0.277154    0.780042    0.218433  …   0.791215    0.00435991   0.891897
-  0.502443    0.356332    0.866679      0.686162    0.902474     0.547917
-  0.0908686   0.628651    0.349603      0.116541    0.778358     0.675234
-  0.666194    0.516219    0.83118       0.335102   -0.137093     0.520599
- -0.0823486   0.482191    0.357863     -0.239241    0.0930056   -0.123752
-  0.0718619   0.408884   -0.104931  …   0.656271   -0.159315     0.311796
-  0.67084     0.129507    0.555063      0.0931154   0.433941    -0.172274
-  0.634388    0.252692    0.200241     -0.116585    0.323527     0.249529
-  0.626582    0.738579    0.742102      0.66543    -0.0106097    0.178497

We get $N$ different results for the power flow balance equations, depending on which scenario we are on.

Solve power flow in block on the CPU

Once the different structures used for block evaluation instantiated, one is able to solve the power flow in block on the CPU using the same function nlsolve!. The block Jacobian is evaluated with automatic differentiation using a ArrowheadJacobian structure:

blk_jx = ExaPF.ArrowheadJacobian(blk_polar, powerflow, State());
+ -1.16168    -0.853388   -0.825294   …  -1.2192     -1.27038    -1.20981
+ -0.154998    0.141253    0.136575      -0.411884   -0.0141577  -0.391545
+  0.517506    0.218808    0.393388       0.749262    0.834904    0.13837
+  0.375501    0.828902    0.0379287      0.835372    0.658289    0.634727
+  0.539583    0.778415    0.128258       0.580944    0.497907    0.463098
+  0.353267    0.329615    0.43767    …   0.170245    0.506544    0.886759
+  0.984587    0.472418    0.939126       0.810077    0.686407    0.109425
+  0.406538    0.52        0.462897       0.931304    0.0276674   0.72468
+ -0.0211452  -0.0347191   0.527724       0.383911    0.273248    0.252071
+  0.708679   -0.0945452   0.395512       0.477896   -0.0969667   0.558331
+  0.452421    0.346115    0.556259   …  -0.0245019   0.547899   -0.0453844
+  0.277165   -0.171751   -0.0759283      0.0539596   0.330881    0.568233
+  0.504433   -0.0131989   0.325656       0.737838   -0.0145656   0.591828
+  0.409618    0.0254697   0.18463       -0.158851   -0.103321    0.125345

We get $N$ different results for the power flow balance equations, depending on which scenario we are on.

Solve power flow in block on the CPU

Once the different structures used for block evaluation instantiated, one is able to solve the power flow in block on the CPU using the same function nlsolve!. The block Jacobian is evaluated with automatic differentiation using a ArrowheadJacobian structure:

blk_jx = ExaPF.ArrowheadJacobian(blk_polar, powerflow, State());
 blk_jx.J
140×140 SparseArrays.SparseMatrixCSC{Float64, Int64} with 820 stored entries:
 ⎡⡱⣮⡲⣞⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
 ⎢⣸⢮⣻⣾⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
@@ -122,22 +122,22 @@
     blk_jx,
     blk_stack;
 )
Power flow has converged: true
-  * #iterations: 5
-  * Time Jacobian (s) ........: 0.0002
+  * #iterations: 4
+  * Time Jacobian (s) ........: 0.0004
   * Time linear solver (s) ...: 0.0001
-     * update (s) ............: 0.0000
+     * update (s) ............: 0.0001
      * ldiv (s) ..............: 0.0000
-  * Time total (s) ...........: 0.0004
+  * Time total (s) ...........: 0.0008
 

At the solution, we get different values for the voltage magnitudes at the PQ nodes:

reshape(blk_stack.vmag, nbus, nscen)
9×10 Matrix{Float64}:
  1.0       1.0       1.0       1.0       …  1.0       1.0       1.0
  1.0       1.0       1.0       1.0          1.0       1.0       1.0
  1.0       1.0       1.0       1.0          1.0       1.0       1.0
- 0.93172   0.905282  0.906795  0.960814     0.959699  0.990906  0.965831
- 0.938068  0.865757  0.892571  0.939146     0.969484  0.979818  0.972148
- 0.950652  0.930867  0.959356  0.951322  …  0.954321  0.989856  0.976045
- 0.896929  0.920468  0.918053  0.954509     0.952812  0.958612  0.976883
- 0.921022  0.93629   0.936636  0.960372     0.968145  0.970158  0.974516
- 0.883828  0.85759   0.86233   0.965191     0.918888  0.976346  0.9509

Solve power flow in batch on the GPU

When the BlockPolarForm model is instantiated on the GPU, the expressions are evaluated in batch. The syntax to solve the power flow equations is exactly the same as on the CPU, using cusolverRF to solve the different linear systems:

using CUDA, CUSOLVERRF
+ 0.943149  0.991004  0.943895  0.970019     0.950717  0.981209  0.946827
+ 0.882305  0.974979  0.915486  0.95431      0.910745  0.968595  0.900214
+ 0.935602  0.974776  0.948855  0.938474  …  0.969858  0.958537  0.954929
+ 0.919813  0.98952   0.9546    0.899879     0.955401  0.951019  0.910012
+ 0.936187  0.992665  0.958417  0.932701     0.952259  0.978334  0.936195
+ 0.90983   0.985336  0.931598  0.956808     0.950272  0.986036  0.926558

Solve power flow in batch on the GPU

When the BlockPolarForm model is instantiated on the GPU, the expressions are evaluated in batch. The syntax to solve the power flow equations is exactly the same as on the CPU, using cusolverRF to solve the different linear systems:

using CUDA, CUSOLVERRF
 polar_gpu = ExaPF.load_polar("case9.m", CUDABackend());
 blk_polar_gpu = ExaPF.BlockPolarForm(polar_gpu, nscen); # load model on GPU
 blk_stack_gpu = ExaPF.NetworkStack(blk_polar_gpu);
@@ -154,10 +154,10 @@
     blk_stack_gpu;
     linear_solver=rf_solver,
 )
Power flow has converged: true
-  * #iterations: 5
-  * Time Jacobian (s) ........: 0.0016
-  * Time linear solver (s) ...: 0.0479
-     * update (s) ............: 0.0240
-     * ldiv (s) ..............: 0.0239
-  * Time total (s) ...........: 0.0516
-
+ * #iterations: 4 + * Time Jacobian (s) ........: 0.0021 + * Time linear solver (s) ...: 0.1126 + * update (s) ............: 0.0544 + * ldiv (s) ..............: 0.0582 + * Time total (s) ...........: 0.1184 + diff --git a/dev/tutorials/direct_solver/index.html b/dev/tutorials/direct_solver/index.html index 69d720a0..e0396906 100644 --- a/dev/tutorials/direct_solver/index.html +++ b/dev/tutorials/direct_solver/index.html @@ -15,20 +15,20 @@ jx = ExaPF.Jacobian(polar, func, State()) # init AD ExaPF.nlsolve!(pf_solver, jx, stack)
Power flow has converged: true
   * #iterations: 7
-  * Time Jacobian (s) ........: 1.1230
-  * Time linear solver (s) ...: 0.0333
-     * update (s) ............: 0.0290
-     * ldiv (s) ..............: 0.0043
-  * Time total (s) ...........: 1.1624
+  * Time Jacobian (s) ........: 2.4500
+  * Time linear solver (s) ...: 0.0457
+     * update (s) ............: 0.0398
+     * ldiv (s) ..............: 0.0059
+  * Time total (s) ...........: 2.5052
 

KLU (CPU)

KLU is an efficient sparse linear solver, initially designed for circuit simulation problems. It is often considered as one of the state-of-the-art linear solver to solve power flow problems. Conveniently, KLU is wrapped in Julia with the package KLU.jl. KLU.jl implements a proper interface to use KLU. We just have to implement a forgiving function for LinearAlgebra.lu!

LinearAlgebra.lu!(K::KLU.KLUFactorization, J) = KLU.klu!(K, J)

Then, we are ready to solve a power flow with KLU using our current abstraction. One has just to create a new instance of LS.DirectSolver:

klu_factorization = KLU.klu(jx.J)
-klu_solver = LS.DirectSolver(klu_factorization)
ExaPF.LinearSolvers.DirectSolver{KLU.KLUFactorization{Float64, Int64}}(KLU.KLUFactorization{Float64, Int64}(KLU.LibKLU.klu_l_common_struct(0.001, 1.2, 1.2, 10.0, 0.0, 1, 0, 2, Ptr{Nothing} @0x0000000000000000, Ptr{Nothing} @0x0000000000000000, 1, 0, 0, 17036, 17036, 17036, 0, -1.0, -1.0, -1.0, -1.0, 0.0, 0x00000000005a0030, 0x0000000000760150), Ptr{Nothing} @0x0000000012322ed0, Ptr{Nothing} @0x0000000011ec6fe0, 17036, [0, 4, 12, 19, 24, 27, 37, 40, 44, 54  …  129350, 129359, 129363, 129369, 129380, 129384, 129390, 129402, 129408, 129412], [0, 383, 5548, 13344, 1, 458, 4027, 5306, 6117, 11823  …  6009, 6175, 9238, 13805, 13971, 17034, 8879, 9239, 16675, 17035], [1050.4551624894714, -629.5373899630948, -421.251258755022, 88.40062041224643, 995.2128749049238, -479.4136240982223, -138.29482949968133, -145.95963146384614, -230.91975549270308, 22.972728768572996  …  -480.47843944268664, -39.71137052069115, 520.1642960016878, -3360.900457400674, -208.11090120000475, 3568.982258670069, -39.40403545216686, 39.40522422897141, -242.8021538639664, 242.80196093598863]))

and pass it to nlsolve!:

ExaPF.init!(polar, stack) # reinit stack
+klu_solver = LS.DirectSolver(klu_factorization)
ExaPF.LinearSolvers.DirectSolver{KLU.KLUFactorization{Float64, Int64}}(KLU.KLUFactorization{Float64, Int64}(KLU.LibKLU.klu_l_common_struct(0.001, 1.2, 1.2, 10.0, 0.0, 1, 0, 2, Ptr{Nothing} @0x0000000000000000, Ptr{Nothing} @0x0000000000000000, 1, 0, 0, 17036, 17036, 17036, 0, -1.0, -1.0, -1.0, -1.0, 0.0, 0x00000000005a0030, 0x0000000000760150), Ptr{Nothing} @0x000000000ef987e0, Ptr{Nothing} @0x000000001174a860, 17036, [0, 4, 12, 19, 24, 27, 37, 40, 44, 54  …  129350, 129359, 129363, 129369, 129380, 129384, 129390, 129402, 129408, 129412], [0, 383, 5548, 13344, 1, 458, 4027, 5306, 6117, 11823  …  6009, 6175, 9238, 13805, 13971, 17034, 8879, 9239, 16675, 17035], [1050.4551624894714, -629.5373899630948, -421.251258755022, 88.40062041224643, 995.2128749049238, -479.4136240982223, -138.29482949968133, -145.95963146384614, -230.91975549270308, 22.972728768572996  …  -480.47843944268664, -39.71137052069115, 520.1642960016878, -3360.900457400674, -208.11090120000475, 3568.982258670069, -39.40403545216686, 39.40522422897141, -242.8021538639664, 242.80196093598863]))

and pass it to nlsolve!:

ExaPF.init!(polar, stack) # reinit stack
 ExaPF.nlsolve!(pf_solver, jx, stack; linear_solver=klu_solver)
Power flow has converged: true
   * #iterations: 7
-  * Time Jacobian (s) ........: 0.2776
-  * Time linear solver (s) ...: 0.0333
-     * update (s) ............: 0.0291
-     * ldiv (s) ..............: 0.0042
-  * Time total (s) ...........: 0.3168
+  * Time Jacobian (s) ........: 0.3910
+  * Time linear solver (s) ...: 0.0437
+     * update (s) ............: 0.0384
+     * ldiv (s) ..............: 0.0053
+  * Time total (s) ...........: 0.4431
 

We observe KLU reduces considerably the time spent in the linear solver.

cusolverRF (CUDA)

cusolverRF is an efficient LU refactorization routine implemented in CUDA. It is wrapped in Julia inside the package CUSOLVERRF.jl:

using CUSOLVERRF

The principle is the following: the initial symbolic factorization is computed on the CPU with the routine chosen by the user. Then, each time we have to refactorize a matrix with the same sparsity pattern, we can recompute the numerical factorization entirely on the GPU. In practice, this solver is efficient at refactorizing a given matrix if the sparsity is significant.

This is of direct relevance for us, as (i) the sparsity of the power flow Jacobian doesn't change along the Newton iterations and (ii) the Jacobian is super-sparse. In ExaPF, it is the linear solver of choice when it comes to solve the power flow entirely on the GPU.

CUSOLVERRF.jl follows the LinearAlgebra's interface, so we can use it directly in ExaPF. We first have to instantiate everything on the GPU:

using CUDA
 polar_gpu = ExaPF.load_polar("case9241pegase.m", CUDABackend())
 stack_gpu = ExaPF.NetworkStack(polar_gpu)
@@ -38,9 +38,9 @@
 rf_solver = LS.DirectSolver(rf_fac)
ExaPF.LinearSolvers.DirectSolver{CUSOLVERRF.RFLU{Float64, CUSOLVERRF.CuSparseSVDeprecated, CUSOLVERRF.CuSparseSMDeprecated}}(CUSOLVERRF.RFLU{Float64, CUSOLVERRF.CuSparseSVDeprecated, CUSOLVERRF.CuSparseSMDeprecated}
 )

Then, we are able to solve the power flow entirely on the GPU, simply as

ExaPF.nlsolve!(pf_solver, jx_gpu, stack_gpu; linear_solver=rf_solver)
Power flow has converged: true
   * #iterations: 7
-  * Time Jacobian (s) ........: 1.6907
-  * Time linear solver (s) ...: 0.1169
-     * update (s) ............: 0.1153
-     * ldiv (s) ..............: 0.0015
-  * Time total (s) ...........: 1.8101
-
+ * Time Jacobian (s) ........: 3.7973 + * Time linear solver (s) ...: 0.1531 + * update (s) ............: 0.1497 + * ldiv (s) ..............: 0.0034 + * Time total (s) ...........: 3.9567 +