Regular update #40

cjknight · 2023-10-03T17:15:53Z

High-level summary:

Sync with dev branch
Fixed typo in polymer example: now 1-to-1 agreement with CUDA backend up to 8 fragments (i.e. energies & number of iterations match)
Cleaned up the CUDA backend: some data-transfer cleanup, optimizations, and some syncing with the OpenMPTarget backend
First pass on OpenMPTarget backend with support for NVHPC and LLVM compilers: still work in progress (debugging device detection and data management)

set_excited_fragment_ now takes the value of the quantum numbers instead of the delta from the q-space, since there will be multiple q-spaces in the future. "excited" is now used in place of "active" throughout the code to clarify the relationship between the CAS and the space of excited fragments. Added a docstring for set_excited_fragment_.

Using the ExcitationSolver. Things that enabled this are 1) loosen self-consistency requirement of initial guess H(E)|Psi> = E|Psi> and 2) do not greedily solve H(E)|Psi> = E|Psi> for each fragment, but instead only do one pass of the undressed FCISolver kernel per VRVSolver kernel call. It is still very slow. As I learned in grad school, the fixed point algorithm for this fundamentally DOES NOT WORK when coupling is strong, at least as far as the energy determination is concerned. The good news is that H(E)|Psi> = E|Psi> is a strictly one-dimensional monotonic function between singularities so rootfinding it more properly should be easy.
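
The root-finding idea in the last sentence can be sketched on a toy model. This is only an illustration under stated assumptions, not the VRVSolver code: it assumes a single scalar P-space energy `h_pp`, a diagonal Q space with energies `e_q`, and nonzero couplings `couplings`, so that the self-consistency condition reduces to f(E) = E - h_pp - sum_q v_q^2 / (E - e_q) = 0. All names here are hypothetical. Because f is strictly increasing between poles, plain bisection on a bracket below the lowest pole is enough.

```python
import math


def dressed_residual(e, h_pp, couplings, e_q):
    """Toy residual f(E) = E - h_pp - sum_q v_q**2 / (E - e_q)."""
    return e - h_pp - sum(v * v / (e - eq) for v, eq in zip(couplings, e_q))


def solve_below_lowest_pole(h_pp, couplings, e_q, tol=1e-12, max_iter=200):
    """Bisect for the unique root of the toy residual below min(e_q).

    Assumes every coupling is nonzero, so f -> +inf just below the pole.
    """
    pole = min(e_q)
    # Lower bracket: with d > sqrt(sum v^2), f(lo) <= -d + sum(v^2)/d < 0.
    d = math.sqrt(sum(v * v for v in couplings)) + 1.0
    lo = min(h_pp, pole) - d
    # Upper bracket: approach the pole from below until f turns positive.
    step = 1.0
    while dressed_residual(pole - step, h_pp, couplings, e_q) <= 0.0:
        step *= 0.5
    hi = pole - step
    # f is strictly increasing on (lo, hi), so bisection cannot miss the root.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if dressed_residual(mid, h_pp, couplings, e_q) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# One Q state at e_q = 1 coupled with v = 1 to h_pp = 0:
# E**2 - E - 1 = 0, so the root below the pole is (1 - sqrt(5)) / 2.
print(solve_below_lowest_pole(0.0, [1.0], [1.0]))
```

Unlike a fixed-point update of E, the bracket never leaves the interval between singularities, so strong coupling only changes where the root sits, not whether the search converges.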

The lasscf_rdm kernel will now actually make sure that the coupled fragment subproblems, which interact with each other via mean-field ("jk") effects, have reached a fixed point before announcing successful convergence. This is accomplished by a second microiteration in which the orbitals are fixed but the fcibox kernels are called repeatedly, taking the updated outputs of their neighbors as inputs. Control the behavior of this new microiteration with the "conv_tol_rdmjkddm", "conv_tol_rdmjkde", and "max_cycle_rdmjk" attributes of the LASSCF method object. Also reduce the default "max_cycle_micro" to 3 from 5, since "max_cycle_micro" and "max_cycle_rdmjk" handle separately what the lasci_sync kernel handles simultaneously with "max_cycle_micro."
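
The shape of such a microiteration can be sketched on a toy problem. This is not the lasscf_rdm code: the function name, the scalar "fragment outputs", and the single tolerance `conv_tol` are hypothetical stand-ins for the attributes named above, chosen only to show the loop of repeated updates with a cycle cap and a convergence flag.

```python
def relax_coupled_fragments(updates, x0, conv_tol=1e-10, max_cycle=50):
    """Iterate coupled subproblems until their outputs stop changing.

    updates[i](x) returns the new output of fragment i given the list x of
    all fragments' current outputs. Returns (x, converged, ncycle).
    """
    x = list(x0)
    for cycle in range(1, max_cycle + 1):
        # Every fragment sees its neighbours' outputs from the previous cycle.
        x_new = [update(x) for update in updates]
        change = max(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if change < conv_tol:
            return x, True, cycle
    # Cap reached without meeting the tolerance: report non-convergence.
    return x, False, max_cycle


# Two toy "fragments", each depending linearly on the other's output.
# The exact fixed point is x0 = 16/7, x1 = 18/7.
updates = [
    lambda x: 1.0 + 0.5 * x[1],
    lambda x: 2.0 + 0.25 * x[0],
]
x, converged, ncycle = relax_coupled_fragments(updates, [0.0, 0.0])
print(x, converged, ncycle)
```

Returning the convergence flag separately from the result is what lets the caller refuse to announce success when the cycle cap is hit first.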

1. more conservative guardrail than last commit
2. transform_opmat_det2csf_pspace uses an integer range rather than a bool array for indexing, since everything is supposed to be contiguous anyway
3. demote the eigenvalue test to an even higher debug level, since it quickly becomes impossible

MatthewRHermes and others added 30 commits September 1, 2023 11:45

lassi excitations solver lassi kernel equiv test

34f5866

Test is currently failing; energy returned by excitations solver does not correspond to any eigenvalue of the lassi Hamiltonian so this is an error, not a local-minimum problem at the moment.

bugfix lassi excitation solver get_active_h

f49ff04

It was previously discarding the between-fragment part of the effective 1-electron operator, which is obviously not right

lassi excitations solver unittest update

27db98d

Self-consistency achieved, but it's stuck on local minima now.

lassi excitations self-consistency improvements

c3b4337

Local minimum problem from previous commit fixed for 9 of the 11 perturber rootspaces.

deactivate local-minimum checking in unittest

578eafb

Should not report failure to find global minimum in CI jobs, since I am still developing the algorithm.

PySCF compatibility check

e0af54f

lassi docstrings and vrvsolver code cleanup

2b42efb

lassi vrvsolver warn behavior and unittest tweak

6e0c9a7

Remove the unittest cases where it's numerically very difficult to avoid local minima, because they reflect unlikely use cases (direct spin excitation and double charge transfer).

cleanup lassi excitations solver

36dc6c8

lassi excitation solver docstrings

edbc299

lassi excitation solver multiref safety commit

83d3b8f

LASSI excitation solver multiref testing

535fdbc

Allows the Q space to have more than one rootspace in it, so that the stabilizing coupling of m,m' <-> m+1,m'-1 rootspaces is fully expressed in the vrvsolver.

indexing bugfixes in lassi excitations sort_ci0

8701caf

lassi excitation solver debugging safety commit

ae139b3

lassi excitations vrvsolver solve_e0 function

00b36fc

Re-solve the self-consistent H(E)|Psi> = E|Psi> properly with fixed CI vectors on every cycle of the VRVSolver nonlinear iteration.

Merge branch 'dev' into vrvsolver_scenergy

60753d9

lassi excitation solver debug fiddle

878a617

lassi excitations vrvsolver cleanup

d6e0d5b

Update denom_q immediately upon computing e0; cleanup debugging stuff. There is still some failure to converge for alfefe nmax=0.

lassi excitation safety commit

a6df4cf

LASSI excitation solver timer logs

f32a464

The slow calculation compared to normal seems almost entirely due to repeated lassi op_o? function calls... is there any way to improve this?

lassi excitations code cleanup

4c3975a

- delete old commented code
- e_p -> e0 consistently when I'm referring to the solution of self-consistent equations
- the validity threshold of lowest_refovlp_eigval as a kwarg
- some comment clarity

lasscf productstate lassi excitations code cleanup

49296ae

- _check_init_guess -> get_init_guess
- sort_ci0 -> actually only sorts ci0 without also calling get_ham_pq

lassi excitation solver inner state-averaging

7b225e3

The VRV operators, H_PQ.(E-H_QQ)^-1.H_QP, will be averaged over different P indices: sum_P w_P H_PQ.(E-H_QQ)^-1.H_QP

lassi excitation solver inner sa debugging

07bc66b

Doesn't reliably converge to anything, probably because I'm not quite understanding the equations I'm actually trying to solve when I do this. Needs theoretical development.

lassi excitation solver safety commit

ba89944

update polymer inputs to use C 2pz

c50edd2

missing rt header? and some optimizations

86bd34c

lassi excitations solver safety commit

dce230f

MatthewRHermes and others added 27 commits September 15, 2023 18:14

remove e0_p member of vrvsolver and init in kernel

e281232

PySCF(-forge) compatibility check & update

981bfef

lasci productstate solver more diagnostics

b6d511b

in debug output in case of convergence failure

small cleanup

30e793a

add pull_get_jk() function

dc39f8c

device copy of nset and nao

d68de20

cleanup

8d65590

cleanup & sync OpenMPTarget backend

47c6e27

cleanup fdrv & sync OpenMPTarget backend

e29373b

debug log in lassi excitations solver

d2d26fa

LASSI class fns use LASSI dict instead of LASSCF

abeebfe

To permit the underlying _las object to be untouched by model-space manipulations.

LASSI spin_shuffle_ci fn

1a735e3

Generates sz-rotated CI vectors directly from origin CI vectors.

Create my_pyscf/lassi/lassis.py

0c40425

Uses ExcitationPSFCISolver for charge-separated rootspaces, currently with equal-weights state-averaging. TODO: spin fluctuations and unittests.

LASSIS syntax unittests

17f5068

silence lassi excitationsolver warning

807e049

There is another warning for failure to converge anyway.

Create tests/fci/test_csf.py

4261a60

I can't believe I didn't have this before.

lasscf_rdm microiteration divergence catching

7791a05

Copy the logic from lasci_sync.py over to lasscf_rdm.py

Memory usage guardrail csf pspace

843ec6c

default max_memory in pspace fn

99c9412

LASSI productstate serialfrag option

4a181f5

In case (esp in the excitation solver) it makes more sense to only optimize one fragment at a time.

PySCF compatibility check

b3185fe

more progress on openmptarget backend; not functional yet

96c0822

nvhpc/openmp arch file

fea405d

Merge branch 'dev' of https://github.com/MatthewRHermes/mrh into gpudev

3456905

OpenMPTarget backend now enabled; time to debug correctness

d3e9989

MatthewRHermes merged commit baa8c10 into MatthewRHermes:gpu Oct 3, 2023
3 checks passed
