
Description: Related work in Paper #13

Open
zwxdxcm opened this issue Nov 3, 2023 · 10 comments
zwxdxcm commented Nov 3, 2023

[image: screenshot of the description in question]

I am wondering whether this description is correct. Why is QAT not in the blue highlight?

Thanks!


zwxdxcm commented Nov 3, 2023

I mean, if you want to train a neural field from scratch, it is reasonable to implement end-to-end (E2E) compression, which should use QAT. For a pre-trained model, it would be better to use PTQ.
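For other readers: QAT simulates quantization in the forward pass during training so the network can adapt to it. A minimal sketch of the idea (NumPy, illustrative only, not this repo's code; `fake_quantize` is a hypothetical helper):

```python
import numpy as np

def fake_quantize(x, num_bits=8):
    """Simulate symmetric uniform quantization in the forward pass (QAT-style).

    In a real QAT setup (e.g. PyTorch), the non-differentiable rounding
    is handled with a straight-through estimator:
        x_q = x + (quantize(x) - x).detach()
    so gradients flow through `x` unchanged.
    """
    qmax = 2 ** (num_bits - 1) - 1        # e.g. 127 for 8 bits
    scale = np.max(np.abs(x)) / qmax      # per-tensor scale
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale                      # dequantized ("fake") values

weights = np.array([0.50, -1.27, 0.003, 1.27])
print(fake_quantize(weights))
```

PTQ, by contrast, applies the same quantization only once after training, without the network ever seeing the quantization error.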

@daniel03c1
Owner

Sorry for the confusion. You are right; the description was written incorrectly. Thank you.


zwxdxcm commented Nov 3, 2023

> Sorry for the confusion. You are right, and we were incorrectly written. Thank you.

Thanks for your reply. Just to double-check: this work implements QAT, since it trains the network from scratch. Have you compared how much time is spent on the additional computations during training and inference?


zwxdxcm commented Nov 3, 2023

And I also have two other questions.

  1. This work's target is to reduce memory (runtime storage) rather than model storage, right? So what is depicted in the experiments section is memory?
  2. I am wondering how the mask process is designed. Is there any related work? I cannot understand it.
[images: screenshots of the masking formulation from the paper]

Thank you!


zwxdxcm commented Nov 3, 2023

Oh, I see: it seems like a threshold function. But why would you use the stop-gradient operator?

@daniel03c1
Owner

Thank you for your interest in our work.

  1. Regarding time, it only increases the training time, and the exact training times are reported in the supplementary material. During inference, the wavelet coefficients need to be converted only once, which takes constant time to unpack the sparse representations into spatial grids, so the inference time is equal to that of the original model without masking or wavelet transforms.
  2. This work only considers storage memory. It may need extra memory during training, but during inference it requires only as much memory as the original model.
  3. I believe this question is related to About detach() function #10. If you have further questions, please feel free to ask.
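In case it helps other readers, here is a rough sketch of how a stop-gradient (straight-through) binary mask is typically written. This is an illustration of the general technique, not the repository's actual implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def straight_through_mask(logits):
    """Binary mask with a straight-through gradient.

    Forward: a hard 0/1 threshold on sigmoid(logits).
    Backward (in an autograd framework): gradients flow through the
    soft sigmoid, because the hard part is rerouted via stop-gradient:
        mask = soft + stop_gradient(hard - soft)
    In PyTorch this is usually written `soft + (hard - soft).detach()`
    (see issue #10 on the detach() function).
    """
    soft = sigmoid(logits)
    hard = (soft > 0.5).astype(logits.dtype)
    # Numerically, soft + (hard - soft) == hard in the forward pass;
    # the subtraction pattern only changes which path gradients take.
    return soft + (hard - soft)  # stop_gradient omitted: NumPy has no autograd

logits = np.array([-2.0, -0.1, 0.3, 4.0])
coeffs = np.array([10.0, 20.0, 30.0, 40.0])
print(straight_through_mask(logits) * coeffs)
```

Without the stop-gradient trick, the hard threshold has zero gradient almost everywhere, so the mask logits could never be learned.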


zwxdxcm commented Nov 3, 2023

OK. Thanks ~~


zwxdxcm commented Nov 3, 2023

From what I understand, the masking part acts more like a learnable frequency filter. Am I correct?

@daniel03c1
Owner

It is not necessarily a frequency filter. The masking method itself can filter whatever you apply it to. For example, if you apply the masking method to spatial grids, it filters spatial coefficients. If you apply it to frequency grids (after a DCT), you get a frequency filter.
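To illustrate the point with a rough sketch (a fixed threshold stands in for the learned mask, and NumPy's FFT stands in for the paper's DCT; none of this is the repository's code):

```python
import numpy as np

grid = np.array([3.0, 3.1, 2.9, 3.0, -3.0, -3.1, -2.9, -3.0])

# 1) Mask applied directly to the spatial grid: filters spatial coefficients.
spatial_mask = np.abs(grid) > 3.0
spatial_filtered = grid * spatial_mask

# 2) Same masking mechanism applied after a frequency transform:
#    now it acts as a frequency filter.
freq = np.fft.rfft(grid)
freq_mask = np.abs(freq) > 1.0                      # keep strong frequencies
spatial_from_freq = np.fft.irfft(freq * freq_mask,  # back to the spatial domain
                                 n=len(grid))

print(spatial_filtered)
print(np.round(spatial_from_freq, 2))
```

The mask mechanism is the same in both cases; what it filters depends entirely on the domain of the grid it is applied to.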


zwxdxcm commented Nov 6, 2023

Thanks for your reply !
