Fix the FormSum memory leak #3897

Open: 5 commits to be merged into master

Contributor

@Ig-dolci Ig-dolci commented Nov 28, 2024

Description

We are seeing memory leaks and more expensive computation in solvers involving FormSum (see this discussion for more details). Both are caused by this operation. My solution here is to perform the sum directly on the underlying numpy arrays. Below is a comparison (using the example added to the discussion) of run time and memory usage between the master branch and the current PR.

[Image: result (time and memory comparison, master branch vs. this PR)]
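As a rough illustration of what "summing through the numpy arrays" means, here is a minimal sketch on plain numpy arrays. The function name `weighted_sum_inplace` and the sample data are assumptions made for this example and are not taken from the PR; only `np.add(..., out=...)` mirrors the call visible in the diff.

```python
import numpy as np


def weighted_sum_inplace(result, operands, weights):
    """Accumulate sum(w * op) into `result` without building a symbolic sum.

    Hypothetical helper for illustration; not part of Firedrake.
    """
    result[...] = 0.0
    for w, op in zip(weights, operands):
        # np.add writes straight into `result`; the only temporary
        # allocated per step is the scaled operand `w * op`.
        np.add(result, w * op, out=result)
    return result


a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])
out = np.empty_like(a)
weighted_sum_inplace(out, [a, b], [2.0, 0.5])
print(out)  # 2*a + 0.5*b
```

Because the accumulation happens in the output buffer, no intermediate expression objects are created per term, which is the property the PR is relying on.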


github-actions bot commented Nov 28, 2024

Tests | Passed ✅ | Skipped ⏭️ | Failed ❌
Firedrake complex (8125 ran) | 6540 passed | 1585 skipped | 0 failed


github-actions bot commented Nov 28, 2024

Tests | Passed ✅ | Skipped ⏭️ | Failed ❌
Firedrake real (8131 ran) | 7345 passed | 786 skipped | 0 failed

@JHopeCollins
Member

Oh dear, if this fixes the issue then that interface could do with updating!

I think that the traversal method is taking advantage of the behaviour described here: #3348 (comment)

This behaviour is not widely known and is quite unintuitive - it's usually considered a bug. It might be best to change this method signature (and the preorder traversal method) to use visited=None as the kwarg, and use a self._visited dict with visited = visited if visited else self._visited
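The behaviour in question is Python's evaluation of default arguments once, at function definition time. A minimal sketch, using hypothetical function names rather than the Firedrake methods, shows how a `{}` default is shared across calls and how a `None` sentinel avoids that:

```python
def traverse_shared(node, visited={}):
    # The default dict is created once, when the function is defined,
    # so entries persist across otherwise unrelated calls.
    visited[node] = True
    return visited


def traverse_fresh(node, visited=None):
    # A None sentinel gives each call its own dict unless one is passed in.
    visited = visited if visited is not None else {}
    visited[node] = True
    return visited


shared_first = traverse_shared("a")
shared_second = traverse_shared("b")
fresh_first = traverse_fresh("a")
fresh_second = traverse_fresh("b")

print(shared_first is shared_second, sorted(shared_second))
print(fresh_first is fresh_second, sorted(fresh_second))
```

Note that testing `if visited` instead of `if visited is not None` would also discard an empty dict supplied by the caller, since an empty dict is falsy.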

@Ig-dolci
Contributor Author

> Oh dear, if this fixes the issue then that interface could do with updating!
>
> I think that the traversal method is taking advantage of the behaviour described here: #3348 (comment)
>
> This behaviour is not widely known and is quite unintuitive - it's usually considered a bug. It might be best to change this method signature (and the preorder traversal method) to use visited=None as the kwarg, and use a self._visited dict with visited = visited if visited else self._visited

No. This does not fix it. I am still debugging.

@JHopeCollins
Member

> No. This does not fix it. I am still debugging.

Shame it wasn't so simple! It may still be good to change the signatures to avoid that behaviour though

@Ig-dolci
Contributor Author

> Oh dear, if this fixes the issue then that interface could do with updating!
>
> I think that the traversal method is taking advantage of the behaviour described here: #3348 (comment)
>
> This behaviour is not widely known and is quite unintuitive - it's usually considered a bug. It might be best to change this method signature (and the preorder traversal method) to use visited=None as the kwarg, and use a self._visited dict with visited = visited if visited else self._visited

I see. I am going to try visited=None.

@@ -584,7 +583,8 @@ def update_tensor(assembled_base_form, tensor):
             raise NotImplementedError("Cannot update tensor of type %s" % type(tensor))
 
     @staticmethod
-    def base_form_postorder_traversal(expr, visitor, visited={}):
+    def base_form_postorder_traversal(expr, visitor, visited=None):
+        visited = visited if visited is not None else {}
Member

To have the same caching behaviour as before, without the {} default kwarg, this should stash visited as an attribute so it gets reused unless the user passes visited, e.g.

visited = visited if visited is not None else self._visited
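To make the suggestion concrete, here is a minimal sketch of a postorder traversal that reuses an attribute-held cache unless the caller supplies one. The `Traverser` class and the tuple-based expression tree are hypothetical stand-ins, not Firedrake's assembler or UFL types; since the method in the diff is a `@staticmethod`, an instance method is assumed here so that `self._visited` exists.

```python
class Traverser:
    """Hypothetical stand-in; not Firedrake's assembler class."""

    def __init__(self):
        self._visited = {}  # cache reused across calls on this instance

    def postorder(self, expr, visitor, visited=None):
        # Fall back to the instance cache only when no dict was passed.
        visited = visited if visited is not None else self._visited
        if expr in visited:
            return visited[expr]
        # expr is a tuple: (label, child, child, ...). Visit children first.
        children = [self.postorder(c, visitor, visited) for c in expr[1:]]
        visited[expr] = visitor(expr[0], children)
        return visited[expr]


calls = []


def visitor(label, child_results):
    calls.append(label)  # record every real visit
    if not child_results:
        return label
    return "(" + label.join(child_results) + ")"


tree = ("+", ("x",), ("y",))
t = Traverser()
first = t.postorder(tree, visitor)
second = t.postorder(tree, visitor)              # served from t._visited
third = t.postorder(tree, visitor, visited={})   # caller's cache: recomputed
print(first, len(calls))
```

In this sketch the instance cache keeps a reference to every expression it has seen for as long as the instance lives, which is worth keeping in mind when looking for memory growth.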

Contributor Author

I decided to keep the original code here.

@Ig-dolci changed the title from "Fix the FormSum!?" to "Fix the FormSum memory leak" on Dec 2, 2024
Comment on lines +477 to +479
dat_result.data_ro_with_halos,
w * dat_op.data_ro_with_halos,
out=dat_result.data_wo_with_halos)
Contributor Author

I am not sure whether performing this operation with_halos is the right thing to do.

Contributor

Does result.assign(sum(w*arg for arg in args)) work? This code looks very very similar to what we do in assign.py.

Contributor Author

@Ig-dolci Ig-dolci Dec 2, 2024

No. result is a Cofunction, and assigning to a Cofunction an expression for which isinstance(expr, BaseForm) holds will reach this code again, which leads to a maximum recursion depth error. See this Cofunction assignment code.
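The cycle described above can be sketched with toy classes. Everything here (`ToyFormSum`, `ToyCofunction`, and the two assembler functions) is a hypothetical stand-in written for this illustration and does not reproduce Firedrake's actual code paths; it only shows why routing the sum back through `assign` cannot terminate, while working on the raw values can.

```python
class ToyFormSum:
    """Hypothetical stand-in for a symbolic weighted sum."""

    def __init__(self, weights, components):
        self.weights = weights
        self.components = components


class ToyCofunction:
    """Hypothetical stand-in for a cofunction holding a list of values."""

    def __init__(self, values):
        self.values = list(values)

    def assign(self, expr, assembler):
        if isinstance(expr, ToyFormSum):
            # Symbolic input is handed to the assembler first.
            return assembler(expr, self)
        self.values = list(expr.values)
        return self


def assemble_via_assign(form_sum, tensor):
    # Calls assign with the symbolic sum again, which routes straight
    # back here: unbounded mutual recursion.
    return tensor.assign(form_sum, assemble_via_assign)


def assemble_via_arrays(form_sum, tensor):
    # Works on the raw values, so assign is never re-entered.
    pairs = list(zip(form_sum.weights, form_sum.components))
    tensor.values = [
        sum(w * c.values[i] for w, c in pairs)
        for i in range(len(tensor.values))
    ]
    return tensor


fs = ToyFormSum([2.0, 0.5], [ToyCofunction([1.0, 2.0]), ToyCofunction([10.0, 20.0])])
result = ToyCofunction([0.0, 0.0])

try:
    result.assign(fs, assemble_via_assign)
    hit_recursion = False
except RecursionError:
    hit_recursion = True

result.assign(fs, assemble_via_arrays)
print(hit_recursion, result.values)
```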

@Ig-dolci Ig-dolci marked this pull request as ready for review December 2, 2024 14:14