Collection of minor performance fixes during profiling and GPU testing #181

Merged: 7 commits, Feb 7, 2024

sebastiangrimberg (Contributor)

  • Only parallelize libCEED across OpenMP threads when CPU backends are used
  • Add some missing AddMult/AddMultTranspose overrides, and skip calls to these methods where no override exists, so that no temporary vectors are created
  • Avoid constructing discrete gradient matrix on coarse mesh unless necessary for coarse solve (AMS)
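
For readers unfamiliar with the AddMult point above, here is a minimal, self-contained sketch of the general pattern. It is not code from this PR: the `Operator` and `DiagonalOperator` types and the `Vector` alias are hypothetical stand-ins, assumed only for illustration. The idea is that a generic fallback for `y += a * A x` has to compute `A x` into a temporary first, while a dedicated override can accumulate directly into `y`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical vector type for this sketch (not the project's vector class).
using Vector = std::vector<double>;

// Hypothetical operator interface following the Mult/AddMult pattern.
struct Operator
{
  virtual ~Operator() = default;

  // y = A x
  virtual void Mult(const Vector &x, Vector &y) const = 0;

  // y += a * A x. The generic fallback allocates a temporary for A x.
  virtual void AddMult(const Vector &x, Vector &y, double a = 1.0) const
  {
    Vector tmp(y.size());  // Temporary vector that an override can avoid
    Mult(x, tmp);
    for (std::size_t i = 0; i < y.size(); i++)
    {
      y[i] += a * tmp[i];
    }
  }
};

// Diagonal operator with a dedicated AddMult that needs no temporary.
struct DiagonalOperator : Operator
{
  Vector d;

  explicit DiagonalOperator(Vector diag) : d(std::move(diag)) {}

  void Mult(const Vector &x, Vector &y) const override
  {
    assert(x.size() == d.size() && y.size() == d.size());
    for (std::size_t i = 0; i < d.size(); i++)
    {
      y[i] = d[i] * x[i];
    }
  }

  void AddMult(const Vector &x, Vector &y, double a = 1.0) const override
  {
    assert(x.size() == d.size() && y.size() == d.size());
    for (std::size_t i = 0; i < d.size(); i++)
    {
      y[i] += a * d[i] * x[i];  // Accumulate in place
    }
  }
};
```

With an override like this in place, a call such as `A.AddMult(x, y, 0.5)` updates `y` in place. Where no override exists, a caller that already owns a work vector can call `Mult` into it and add the result itself, rather than going through the allocating fallback.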

@sebastiangrimberg added the performance label Feb 2, 2024
@sebastiangrimberg force-pushed the sjg/libceed-gpu-dev branch 2 times, most recently from b399b29 to d633677 February 2, 2024 02:19
@sebastiangrimberg force-pushed the sjg/libceed-qfunction-source branch 2 times, most recently from 6a97804 to 538e8d1 February 5, 2024 16:41
Base automatically changed from sjg/libceed-qfunction-source to main February 5, 2024 17:26
@sebastiangrimberg merged commit 7d26775 into main Feb 7, 2024
17 checks passed
@sebastiangrimberg deleted the sjg/libceed-gpu-dev branch February 7, 2024 18:43
Labels: performance (Related to performance)
2 participants