
Can not obtain the lm_head.weight #242

Open

zhiyuanyou opened this issue Sep 8, 2024 · 1 comment

Comments

zhiyuanyou commented Sep 8, 2024

Hello,

Thanks for your wonderful work. While testing your code, I ran into a very strange problem.

I want to print the weight shape of lm_head (https://github.com/X-PLUG/mPLUG-Owl/blob/main/mPLUG-Owl2/mplug_owl2/model/modeling_mplug_owl2.py#L220) with the following code:

        print("Before initializing lm_head: ", config.hidden_size, config.vocab_size)
        self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False)
        print("After initializing lm_head: ", config.hidden_size, config.vocab_size)
        print("weight shape: ", self.lm_head.weight.shape)

The results are:

Before initializing lm_head:  4096 32000
After initializing lm_head:  4096 32000 
weight shape:  torch.Size([0])

I am just very confused about why lm_head.weight.shape reports torch.Size([0]). I wonder whether you have any insights into this problem.

Monitoring this parameter is very important for me, but I just cannot obtain it during training.

Thanks.

LukeForeverYoung (Collaborator) commented:

Are you using the ZeRO-3 strategy to initialize the model? If so, the parameters may be partitioned across ranks, and the local tensor each rank sees is only an empty placeholder.
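A minimal sketch of this behavior, simulating the ZeRO-3 placeholder with plain PyTorch (dimensions shrunk from the real 4096×32000 for brevity). The comment at the end shows `deepspeed.zero.GatheredParameters`, DeepSpeed's documented context manager for temporarily materializing a partitioned parameter; whether it applies here depends on your launcher config.

```python
import torch
import torch.nn as nn

# What ZeRO-3 partitioning looks like from one rank's point of view:
# the parameter's local .data is replaced by an empty placeholder, so
# .shape reports torch.Size([0]) even though the full weight still
# exists, sharded across all ranks.
lm_head = nn.Linear(16, 32, bias=False)   # real model: (4096, 32000)
print(lm_head.weight.shape)               # torch.Size([32, 16])

lm_head.weight.data = torch.empty(0)      # simulate the ZeRO-3 placeholder
print(lm_head.weight.shape)               # torch.Size([0])

# Under real DeepSpeed ZeRO-3, the full shape is still recorded on the
# parameter (as `weight.ds_shape`), and the weight can be temporarily
# gathered for inspection with:
#
#     with deepspeed.zero.GatheredParameters(model.lm_head.weight):
#         print(model.lm_head.weight.shape)  # full shape inside the context
```

Note that gathering a large parameter like lm_head materializes the whole tensor on each participating rank, so it is best done only occasionally (e.g., for logging), not every step.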
