Feature/batched quda deflation #76
Open
leonhostetler wants to merge 6 commits into milc-qcd:develop from leonhostetler:feature/batched_quda_deflation
This looks great, Leon. Your explanation of the details is super.
Thanks,
Steve
On Dec 22, 2024, at 9:41 AM, Leon Hostetler ***@***.***> wrote:
This pull request implements QUDA deflation for ks_spectrum with support for multiple right-hand sides. Previously, MILC deflation was done on CPU.
Key points:
1. To use all of the features, ks_spectrum must be compiled with WANT_QUDA, WANT_FN_CG_GPU, and WANT_EIG_GPU all true
2. QUDA deflation is implemented for UML, CG, and CGZ, for both single and multiple right-hand sides, but it applies only to the even-parity solves
3. Eigenvector files are loaded and saved directly by QUDA--MILC's corresponding functions are bypassed
4. Using fresh_ks_eigen with ks_spectrum will trigger QUDA's eigensolve internally. MILC's eigensolve functions are bypassed
5. This functionality depends on changes made from the QUDA side as well. Until those are merged into QUDA develop, you can use the leonhostetler/milc_batched_deflation branch of https://github.com/leonhostetler/quda.git
More details:
Using ks_spectrum with fresh_ks_eigen now works. There is no longer a need for a two-part process in which the eigenvectors are first generated with QUDA's standalone eigensolver and then loaded by MILC's ks_spectrum for the deflation. The ks_spectrum application can now handle both the eigensolve and the CG solves in the same run.
This is implemented for UML, CG, and CGZ; however, deflation occurs only for the even-parity solves. For UML, where the odd-parity solve just polishes the odd solution reconstructed from the even solution, this works well because the odd solve typically requires far fewer iterations. For CG and CGZ, however, only the even half of the problem is sped up by deflation. If there is a need for odd-parity deflation, we'll need to think about how best to implement it in the future.
Note that eigenvector files are loaded and saved from within QUDA--not MILC. This was both the simplest way to interface with the QUDA solver and the way that ensured minimal memory usage. For example, if MILC loaded the eigenvectors and then passed them to QUDA, then the host memory usage would be doubled, and this is not a feasible approach given the size of eigenvectors. This way, MILC only passes the filenames back and forth to QUDA. If e.g. one wants to use non-QUDA eigenvectors with the QUDA deflated solver, then one would need a separate utility to convert the file to QUDA-readable format, save it to disk, and then run ks_spectrum with the QUDA solver.
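To put a rough number on the host-memory concern above, here is a minimal sketch of the arithmetic. It assumes each eigenvector is a staggered color vector (3 complex numbers per site) stored on one parity of the lattice; the function name `eigvec_bytes` and the lattice dimensions in the usage note are made up for illustration and do not come from MILC or QUDA.

```c
#include <assert.h>

/* Bytes needed to hold n_eig eigenvectors, ASSUMING each one is a
   staggered color vector living on one parity (half) of an
   nx*ny*nz*nt lattice. bytes_per_real is 4 for single precision
   and 8 for double precision. Hypothetical helper, not a MILC/QUDA call. */
double eigvec_bytes(int nx, int ny, int nz, int nt,
                    int n_eig, int bytes_per_real)
{
    assert(nx > 0 && ny > 0 && nz > 0 && nt > 0);
    assert(n_eig >= 0 && bytes_per_real > 0);
    double half_sites = 0.5 * nx * ny * nz * nt;  /* one parity only */
    double reals_per_site = 3.0 * 2.0;            /* 3 colors, complex */
    return half_sites * reals_per_site * bytes_per_real * n_eig;
}
```

Under these assumptions, `eigvec_bytes(64, 64, 64, 96, 512, 4)` comes to about 144 GiB summed over all ranks, so holding a second copy on the MILC side would cost roughly that much again in host memory.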
QUDA deflation should work for varying masses. If different quark masses are used for different propagators, the eigenvectors remain the same, but the eigenvalues must be updated because each one carries a mass-dependent shift of $+4m^2$. This is handled automatically: the eigenvalues are preserved until the quark mass changes, at which point they are recalculated.
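The mass dependence can be illustrated with a small sketch. It assumes every eigenvalue has the form $\lambda_0 + 4m^2$, so changing the mass from `m_old` to `m_new` adds $4(m_{new}^2 - m_{old}^2)$ to each one. The function `shift_eigenvalues` is hypothetical; QUDA may instead recompute the eigenvalues from the stored eigenvectors, and nothing here describes its actual code path.

```c
#include <assert.h>
#include <stddef.h>

/* Update eigenvalues computed at mass m_old so they correspond to mass
   m_new, ASSUMING each eigenvalue has the form lambda0 + 4*m^2.
   Hypothetical helper for illustration only. */
void shift_eigenvalues(double *evals, size_t n, double m_old, double m_new)
{
    assert(evals != NULL || n == 0);
    double delta = 4.0 * (m_new * m_new - m_old * m_old);
    for (size_t i = 0; i < n; i++)
        evals[i] += delta;  /* eigenvectors are untouched */
}
```

The eigenvectors never enter the update, which is why a single stored deflation space can serve propagators at several masses.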
In a real-world application to compute many correlators, the job is typically chunked into readin sets. The gauge field is loaded for the first readin set, and "continue" is used for subsequent readin sets. With QUDA's ability to preserve the deflation space, the eigenvectors are handled in a similar manner. For the first readin set, the eigenvectors are either read in or generated. For subsequent readin sets, one must still include the parameters for reloading or generating the eigenvectors; however, these are ignored because QUDA simply continues with the initial set of eigenvectors. Thus, with multiple readin sets, no time is wasted reloading the eigenvectors and recomputing the eigenvalues. This also means that one cannot change eigenvector sets during a run. This behavior could be modified by changing qep.preserve_deflation_space if desired, but I don't think it's necessary. If one wants to switch to different eigenvectors, one might as well do a separate run.
One can adjust the eigensolver precision from the input parameter file. Typically, single precision should be fine. However, with such "sloppy" eigenvectors, it is important that the deflation is repeated periodically during the CG solve. This is controlled by the tol_restart parameter.
When eigenvectors are saved to disk, they are saved in single precision. This could be modified easily, but there should be no need to save them in double precision since single precision eigenvectors are fine provided that tol_restart is reasonable.
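For readers unfamiliar with deflation, the basic projection step can be sketched on a toy real-valued problem. The initial guess is built as $x_0 = \sum_i v_i (v_i \cdot b)/\lambda_i$ over the stored eigenpairs, which removes the low-mode part of the residual before CG starts. This is a textbook illustration with hypothetical names (`dot`, `deflated_guess`), not QUDA's implementation, which works on complex fields on the GPU. With inexact eigenvectors the low-mode components are only partly removed, which is the reason the step is repeated during the solve.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product of two real vectors of length n. */
static double dot(const double *a, const double *b, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i] * b[i];
    return s;
}

/* Deflated initial guess: x0 = sum_i v_i (v_i . b) / lambda_i.
   evecs holds k orthonormal vectors of length n, stored back to back.
   Toy real-valued sketch; names are hypothetical. */
void deflated_guess(const double *evecs, const double *evals, size_t k,
                    const double *b, double *x0, size_t n)
{
    for (size_t j = 0; j < n; j++)
        x0[j] = 0.0;
    for (size_t i = 0; i < k; i++) {
        const double *v = evecs + i * n;
        assert(evals[i] != 0.0);
        double coeff = dot(v, b, n) / evals[i];
        for (size_t j = 0; j < n; j++)
            x0[j] += coeff * v[j];
    }
}
```

For a diagonal matrix with entries 1, 2, 3, 4 and the two lowest modes stored, a source `b = (1, 2, 3, 4)` gives `x0 = (1, 1, 0, 0)`, and the residual `b - A*x0 = (0, 0, 3, 4)` has no component left in the deflated subspace.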
Note that QUDA's block TRLM does not seem to be working well yet, so leave block_size at 1.
In general, when using QUDA deflation, the ks_spectrum application will need an input file with parameters like:
max_number_of_eigenpairs 512 # How many eigenvectors to use for deflation
tol_restart 1e-2 # How often to do the redeflation
When loading eigenvectors from file, use a parameter block like:
reload_parallel_ks_eigen [filename] # This works with both single file and partfile formats
file_number_of_eigenpairs 512 # In case the file has more eigenvectors than will be used to deflate
forget_ks_eigen # Don't save the eigenvectors to file
Alternatively, when generating fresh eigenvectors, use a parameter block like:
fresh_ks_eigen # Run QUDA's eigensolver
save_partfile_ks_eigen [filename] # Use save_parallel_ks_eigen for single file format or forget_ks_eigen to discard
Max_Lanczos_restart_iters 1000 # Max number of Lanczos restart iterations
eigenval_tolerance 1e-12 # Eigenvalue tolerance
Lanczos_max 1024 # Size of Krylov space, corresponds to QUDA's n_kr
Lanczos_restart 1000 # Deprecated, does nothing as far as I can tell
eigensolver_prec 1 # Precision in eigensolver, double=2, single=1, half=0
batched_rotate 20 # Size of batch_rotate
Chebyshev_alpha 0.1 # Must be larger than 4*m^2 for largest quark mass that will be deflated
Chebyshev_beta 0 # Leave at 0 for QUDA to estimate internally
Chebyshev_order 100 # Chebyshev order
block_size 1 # block_size>1 implies block TRLM (doesn't work well yet?)
Also, don't forget to set
deflate yes/no
in the propagator stanzas.
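The comment on Chebyshev_alpha in the block above states a constraint that is easy to violate when masses are added to an input file later. A minimal sketch of a sanity check follows; `chebyshev_alpha_ok` is a hypothetical helper, not part of MILC or QUDA, and it simply encodes the stated rule that Chebyshev_alpha must exceed $4m^2$ for every mass that will be deflated.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if chebyshev_alpha is strictly larger than 4*m^2 for every
   mass in the list, and 0 otherwise. Hypothetical helper that encodes
   the rule quoted in the parameter comments. */
int chebyshev_alpha_ok(double chebyshev_alpha, const double *masses, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(masses[i] >= 0.0);
        if (chebyshev_alpha <= 4.0 * masses[i] * masses[i])
            return 0;
    }
    return 1;
}
```

With Chebyshev_alpha set to 0.1 as in the example block, the check passes for masses up to roughly 0.158, since $4m^2 = 0.1$ at $m \approx 0.158$.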
Commit Summary
* 10c485f Added QUDA deflation for UML, CG, and CGZ for single right-hand side
* edcfd92 Fixed fresh and save options for QUDA eigenvectors
* 2535fbf Updated some interfacing with quda
* eb301ba Fixed deflate savebuf was overwriting mass savebuf
* 7a6d501 QUDA batched deflation for multiple right hand sides
* 33e58df Added to input parameters
File Changes
(7 files<https://github.com/milc-qcd/milc_qcd/pull/76/files>)
* M generic/milc_to_quda_utilities.c<https://github.com/milc-qcd/milc_qcd/pull/76/files#diff-6a425785ad7f7a499de6c6a88aefe903bbd4ddb0d95ab9c8d308bab9738e435d> (1)
* M generic_ks/d_congrad5_fn_quda.c<https://github.com/milc-qcd/milc_qcd/pull/76/files#diff-a0aa8c9a7c9d9e552d38102463c66fe8d4db9ef134390a59ec66ac971929f29e> (100)
* M generic_ks/mat_invert.c<https://github.com/milc-qcd/milc_qcd/pull/76/files#diff-a288d3bfe1fd74cfe51f0764a3d8c0ff8f126ad82d61e214a2409cbd6c5be7ad> (42)
* M generic_ks/read_eigen_param.c<https://github.com/milc-qcd/milc_qcd/pull/76/files#diff-7f345cf28d0f68b1610f26929aef8c80f5c0070439c414f612536e7dcc26255a> (9)
* M include/imp_ferm_links.h<https://github.com/milc-qcd/milc_qcd/pull/76/files#diff-cca4d8c380d4fa07971683084ee90f2904f86543c8c1da246251d68e6e696e3a> (7)
* M ks_spectrum/control.c<https://github.com/milc-qcd/milc_qcd/pull/76/files#diff-8998b9d0ad49ef860f0c7e479be6e309d70040c3ce0f216d44f6f3882cbcf30f> (24)
* M ks_spectrum/setup.c<https://github.com/milc-qcd/milc_qcd/pull/76/files#diff-f1f896a9576ba5c133487058556db8c8d86d321e66a9244619ae910eb4c11ffd> (30)
Patch Links:
* https://github.com/milc-qcd/milc_qcd/pull/76.patch
* https://github.com/milc-qcd/milc_qcd/pull/76.diff