
Question on calculating square root of the FIM #85

Closed

vimmoos opened this issue Sep 30, 2024 · 4 comments

Comments


vimmoos commented Sep 30, 2024

I am currently writing a method to perform forgetting in RL. To do so, I add some noise to the model based on the FIM.
A snippet of this approach:

    # assuming: from nngeometry import object as obj
    #           from nngeometry import layercollection as lc
    def calculate_scrub_noise(self):
        # random Gaussian direction in parameter space
        gaus_noise = obj.vector.random_pvector(
            lc.LayerCollection.from_model(self.model), device=self.device
        )
        # multiply the noise by the inverse FIM
        inv = self.FIM.inverse().mv(gaus_noise)
        return self.scrub_scale * inv
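For reference, `self.FIM` is built with nngeometry's `FIM` helper, roughly as follows (a minimal sketch; `loader` and `n_output` are placeholders for my actual setup):

    from nngeometry.metrics import FIM
    from nngeometry.object import PMatKFAC

    # Fisher Information Matrix over a data loader, in the KFAC representation
    self.FIM = FIM(
        model=self.model,
        loader=loader,            # placeholder: a torch DataLoader
        representation=PMatKFAC,
        n_output=n_output,        # placeholder: output dimension of the model
        device=self.device,
    )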

However, I would like to take the square root of $F^{-1} \cdot \text{noise}$.
That is, I would like to implement the following formula: $(\text{scale} \cdot F^{-1} \cdot \text{noise})^{1/2}$.
What would be the best way to implement it within your library?

Side note: great library! Compared to other options, I found yours to be the most mathematically accurate for the FIM!

Thanks in advance for the response!

tfjgeorge (Owner) commented Sep 30, 2024 via email

vimmoos (Author) commented Sep 30, 2024

Thanks for the fast reply!
I am trying to apply the method described in Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks in the RL context; Section 4 of that paper describes the final formulation. I want to take the square root of the inverse of the FIM as a matrix, not of its individual components (i.e. find $X$ such that $X \cdot X = F^{-1}$, where $F$ is the FIM).

My initial idea was to get the flat representation from the PVector, take the square root using PyTorch (as described here: Pytorch sqrt), and then convert the result back to a PVector so that I can add it to the model (using the add_model method). In the PyTorch forum thread linked above, they suggest treating negative eigenvalues as 0 (see the sketch below).
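Concretely, the dense version of this idea would look something like the following sketch, computing $\text{scale} \cdot F^{-1/2} \cdot \text{noise}$ (assuming `F` is the FIM object and `noise` the random PVector from my first snippet; `scale` is a placeholder scalar):

    import torch
    from nngeometry.object import PVector

    # densify the FIM (only feasible for small models) and build F^{-1/2}
    # via a symmetric eigendecomposition, zeroing out non-positive
    # eigenvalues as suggested in the PyTorch forum thread
    F_dense = F.get_dense_tensor()
    evals, evecs = torch.linalg.eigh(F_dense)
    inv_sqrt = torch.where(evals > 1e-12, evals.rsqrt(), torch.zeros_like(evals))
    F_inv_sqrt = evecs @ torch.diag(inv_sqrt) @ evecs.T

    # apply to the flat noise vector, then wrap it back into a PVector
    flat = scale * (F_inv_sqrt @ noise.get_flat_representation())
    scrub_pvec = PVector(layer_collection=noise.layer_collection, vector_repr=flat)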
That said, I use the KFAC representation for the FIM, so maybe there is a better, more efficient, or more idiomatic way to do this within your library.
Moreover, I am not sure what taking the square root of the individual components would mean geometrically, besides simply scaling them. Do you have more insight on this? And do you think that taking the square root of the components would have a similar effect to taking the square root of the matrix?
I hope this answers your questions; if not, please let me know.

Best

tfjgeorge (Owner) commented Sep 30, 2024
This PR #87 should work using EKFAC matrices.

    sqrt_inv_M = M_ekfac ** -0.5

In the case of KFAC, I am not sure what to implement, since KFAC was only defined for computing the inverse in the original paper. There is also a weird side effect from the diagonal regularizer that is added before computing the inverse in KFAC. I also added a self.pow(n) method, but I don't think it works with non-integer n, so it should not be relevant to your use case. I don't know what they actually do in the paper that you link.
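Roughly, putting it together with EKFAC would look like this (a sketch assuming the PR is merged; `loader`, `n_output`, `noise`, and `scale` are placeholders for your setup):

    from nngeometry.metrics import FIM
    from nngeometry.object import PMatEKFAC

    # FIM in the EKFAC representation
    M_ekfac = FIM(
        model=model,
        loader=loader,            # placeholder: a torch DataLoader
        representation=PMatEKFAC,
        n_output=n_output,        # placeholder: output dimension of the model
        device=device,
    )
    M_ekfac.update_diag(loader)   # re-estimate the diagonal (the "E" in EKFAC)

    # inverse square root in the EKFAC eigenbasis, applied to the noise PVector
    sqrt_inv_M = M_ekfac ** -0.5
    scrub = scale * sqrt_inv_M.mv(noise)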

vimmoos (Author) commented Sep 30, 2024

I made an error in my previous message: I do indeed use EKFAC, not KFAC. Sorry for the confusion.
Thank you for the new PR; it looks like exactly what I was looking for.

vimmoos closed this as completed Sep 30, 2024