Remove detaches to enable gradient-based force EquiformerV2 models. #779
Codecov Report: All modified and coverable lines are covered by tests ✅
Can you run validation on Val-ID-30k using the pre-trained 31M checkpoint and share the results here? It would be good to make sure things are consistent.
lgtm.
Agreed with @mshuaibii that we should make sure the direct-force models stay consistent. In addition, it might be worth writing a simple unit test to make sure the model is indeed properly differentiable.
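A minimal sketch of what such a differentiability test could look like. The `toy_energy` function is a hypothetical stand-in for the model's energy output, not the EquiformerV2 API; a real test would call the model on a small batch instead. It checks autograd gradients against finite differences with `torch.autograd.gradcheck` and confirms that the gradient-based forces are finite.

```python
import torch


def toy_energy(pos: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for a model's energy head: sum of inverse pair distances."""
    n = pos.shape[0]
    diff = pos.unsqueeze(0) - pos.unsqueeze(1)      # (N, N, 3) pairwise displacement vectors
    iu = torch.triu_indices(n, n, offset=1)         # indices of unique pairs i < j
    dist = diff[iu[0], iu[1]].norm(dim=-1)          # (P,) pair distances
    return (1.0 / dist).sum()


def forces_from_energy(energy_fn, pos: torch.Tensor) -> torch.Tensor:
    """Gradient-based forces: F = -dE/dpos."""
    pos = pos.detach().clone().requires_grad_(True)
    energy = energy_fn(pos)
    (grad,) = torch.autograd.grad(energy, pos)
    return -grad


def test_energy_is_differentiable():
    torch.manual_seed(0)
    # gradcheck needs double precision to compare against finite differences reliably
    pos = torch.randn(5, 3, dtype=torch.double, requires_grad=True)
    assert torch.autograd.gradcheck(toy_energy, (pos,), eps=1e-6, atol=1e-4)

    forces = forces_from_energy(toy_energy, pos)
    assert forces.shape == pos.shape
    assert torch.isfinite(forces).all()
    # A translation-invariant energy implies the forces sum to (numerically) zero
    assert torch.allclose(forces.sum(dim=0), torch.zeros(3, dtype=torch.double), atol=1e-6)
```

For the actual model, the same pattern would apply with the energy head in place of `toy_energy`, ideally including at least one edge that is nearly aligned with the y-axis so the special case in this PR is exercised.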
output = torch.zeros_like(edge_rot_mat)
mask = (yprod > -0.9999) & (yprod < 0.9999)
output[mask] = edge_rot_mat[mask]
output[~mask, 0, :] = edge_rot_mat[~mask, 0, :]
output[~mask, 2, :] = edge_rot_mat[~mask, 2, :]
output[yprod > 0.9999, 1, :] = edge_rot_mat.new_tensor([[0., 1., 0.]])
output[yprod < -0.9999, 1, :] = edge_rot_mat.new_tensor([[0., -1., 0.]])

return output
nit: can this section be replaced with the following?
edge_rot_mat[yprod > 0.9999, 1, :] = edge_rot_mat.new_tensor([[0., 1., 0.]])
edge_rot_mat[yprod < -0.9999, 1, :] = edge_rot_mat.new_tensor([[0., -1., 0.]])
return edge_rot_mat
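One way to check whether the two variants agree is to run them side by side on a small input. The sketch below wraps both snippets in hypothetical helper functions (`fix_copy` and `fix_inplace` are names chosen here, not names from the PR) and assumes `edge_rot_mat` has shape `(E, 3, 3)` and `yprod` has shape `(E,)`. The in-place variant is applied to a clone so the comparison does not modify the caller's tensor.

```python
import torch


def fix_copy(edge_rot_mat: torch.Tensor, yprod: torch.Tensor) -> torch.Tensor:
    """Variant from the diff: write into a freshly allocated output tensor."""
    output = torch.zeros_like(edge_rot_mat)
    mask = (yprod > -0.9999) & (yprod < 0.9999)
    output[mask] = edge_rot_mat[mask]
    output[~mask, 0, :] = edge_rot_mat[~mask, 0, :]
    output[~mask, 2, :] = edge_rot_mat[~mask, 2, :]
    output[yprod > 0.9999, 1, :] = edge_rot_mat.new_tensor([[0.0, 1.0, 0.0]])
    output[yprod < -0.9999, 1, :] = edge_rot_mat.new_tensor([[0.0, -1.0, 0.0]])
    return output


def fix_inplace(edge_rot_mat: torch.Tensor, yprod: torch.Tensor) -> torch.Tensor:
    """Suggested variant: overwrite only the middle row of aligned edges."""
    edge_rot_mat = edge_rot_mat.clone()  # clone added here only to keep the comparison side-effect free
    edge_rot_mat[yprod > 0.9999, 1, :] = edge_rot_mat.new_tensor([[0.0, 1.0, 0.0]])
    edge_rot_mat[yprod < -0.9999, 1, :] = edge_rot_mat.new_tensor([[0.0, -1.0, 0.0]])
    return edge_rot_mat


torch.manual_seed(0)
rot = torch.randn(4, 3, 3, dtype=torch.double)
yprod = torch.tensor([0.5, 1.0, -1.0, 0.0], dtype=torch.double)
same = torch.equal(fix_copy(rot, yprod), fix_inplace(rot, yprod))
```

On this input the two variants produce identical tensors. They can differ only when `yprod` is exactly `0.9999` or `-0.9999`: the strict inequalities exclude that edge from `mask` and from both overwrites, so the first variant leaves row 1 at zero while the second keeps the original row.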
This PR has been marked as stale because it has been open for 30 days with no activity.
Egrad models will be in dev-mode for now.
Remove the detaches when computing the rotation matrices and Wigner-D matrices so that gradients of the model outputs with respect to atomic positions are correct.
Naively removing the detaches results in instability due to gradient explosion when the y-axis rotation is close to 0 or 180 degrees.
In this fix, edges that are already aligned with the y-axis are instead left unrotated (or rotated by exactly 180 degrees). This yields correct gradients of the model outputs with respect to the input atom positions and thus enables training energy-conserving, gradient-based force EquiformerV2 models.
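The gradient explosion can be illustrated with a small numerical sketch. It assumes the y-axis rotation angle is recovered as the arccosine of the y component of the normalized edge direction; that is an assumption made for illustration, and the actual code path in the model may differ. Under that assumption the derivative is $-1/\sqrt{1 - y^2}$, which diverges as $y \to \pm 1$, i.e. as the edge aligns with the y-axis.

```python
import torch


def polar_angle_grad(y: float) -> float:
    """Return d/dy acos(y) via autograd; the analytic value is -1 / sqrt(1 - y**2)."""
    t = torch.tensor(y, dtype=torch.double, requires_grad=True)
    torch.acos(t).backward()
    return t.grad.item()


# The gradient magnitude grows without bound as y approaches 1 (edge aligned with the y-axis)
for y in (0.0, 0.9, 0.9999, 1.0 - 1e-12):
    print(f"y = {y!r:>22}  d(acos)/dy = {polar_angle_grad(y):.4e}")
```

Treating edges with `|yprod| > 0.9999` as a special case keeps such edges out of the steep region, which is consistent with the thresholds used in the diff.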