Question about computation reduction #3
@Planck35 By computation, do you mean the amount of computation (i.e. the number of floating-point operations)? If so, then no: the amount of computation stays roughly the same after pruning, but it should not increase.
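A small sketch (mine, not from this repo) of why the operation count is unchanged: a dense matrix-vector multiply performs the same number of multiply-adds whether or not some weights have been zeroed, because the loops do not skip zero entries.

```python
def matmul_flop_count(W, x):
    """Multiply W (m x n) by vector x (length n), counting multiplies performed."""
    m, n = len(W), len(x)
    y = [0.0] * m
    mults = 0
    for i in range(m):
        for j in range(n):
            y[i] += W[i][j] * x[j]  # executed even when W[i][j] == 0.0
            mults += 1
    return y, mults

W = [[1.0, 2.0], [3.0, 4.0]]
W_pruned = [[1.0, 0.0], [0.0, 4.0]]  # 50% of weights masked to zero
x = [1.0, 1.0]
_, flops_dense = matmul_flop_count(W, x)
_, flops_pruned = matmul_flop_count(W_pruned, x)
assert flops_dense == flops_pruned == 4  # same op count either way
```

Only a sparse kernel that actually skips (or never stores) the zeros would reduce the operation count.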
@larry0123du Hi, I used the code on my model, but the pruned model's size is exactly equal to the model's size before pruning. What is the reason for this?
The reason is that the weights are simply truncated to zero, but zero is still represented as a floating-point number. So in essence, as long as the sizes of the matrices are unchanged, your model does not change in size. In the original paper, Han et al. supplement pruning with a Huffman encoding scheme, which reduces the stored model size, if I remember right.
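This is easy to check with a quick sketch (mine, not part of this repo): masking weights to zero leaves the in-memory size untouched, while an entropy coder (here zlib as a stand-in for Huffman coding) does shrink the pruned weights, since long runs of zeros compress well.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 1024)).astype(np.float32)

mask = np.abs(w) > 1.0       # magnitude pruning: zero out small weights
w_pruned = (w * mask).astype(np.float32)

# Zeros are still stored as 32-bit floats, so the array occupies
# exactly the same number of bytes as before pruning.
assert w.nbytes == w_pruned.nbytes == 1024 * 1024 * 4

# Entropy coding is what actually recovers the savings:
# the mostly-zero pruned weights compress far better than the dense ones.
assert len(zlib.compress(w_pruned.tobytes())) < len(zlib.compress(w.tobytes()))
```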
@larry0123du OK, thanks!
@larry0123du Hello, if the weights are simply truncated to zero, will the inference speed increase?
You did a very nice implementation, but I want to ask about the weights that got masked to zero.
Does the total computation stay the same even though those weights' values are zero, or does the computation speed change?