loading model error: RuntimeError: Error(s) in loading state_dict for GPT #26

TuTuSklmq · 2024-01-09T12:07:45Z

In this part, I have encountered some problems:
`model = GPT(mconf)
model.load_state_dict(torch.load(args.model_weight, map_location='cpu'), strict=False)
# model.to('cpu')
print('Model loaded')`

When I try to run this section, the following error occurs:

RuntimeError: Error(s) in loading state_dict for GPT:
size mismatch for pos_emb: copying a param with shape torch.Size([1, 40, 256]) from checkpoint, the shape in current model is torch.Size([1, 54, 256]).
size mismatch for tok_emb.weight: copying a param with shape torch.Size([94, 256]) from checkpoint, the shape in current model is torch.Size([26, 256]).
size mismatch for blocks.0.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
size mismatch for blocks.1.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
size mismatch for blocks.2.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
size mismatch for blocks.3.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
size mismatch for blocks.4.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
size mismatch for blocks.5.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
size mismatch for blocks.6.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
size mismatch for blocks.7.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
size mismatch for head.weight: copying a param with shape torch.Size([94, 256]) from checkpoint, the shape in current model is torch.Size([26, 256]).

How should this problem be solved? I would greatly appreciate it if someone could help me.


cmbscnm · 2024-02-14T05:18:46Z

May I ask whether you solved this problem? I have encountered it as well.

amoldwin · 2024-07-25T21:39:23Z

Hi! This occurs when running the train scripts to reproduce the models and then attempting to generate molecules using generate.sh.

I have found that this is due to the scaffold_max_len parameter. As train.py and model.py are currently written, this parameter affects the size of the attention mask regardless of whether a scaffold condition is being used. So you can either remove it from the train script or manually set scaffold_max_len to 100 in the generate.py script.
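Before changing any settings, it can help to list exactly which tensors disagree between the checkpoint and the freshly built model. The following is only a minimal sketch: `find_shape_mismatches` is a hypothetical helper, not part of this repository, and it works on plain name-to-shape mappings so that it runs without PyTorch. The sample shapes are copied from the error message in this issue.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for shared keys whose shapes differ.

    Hypothetical helper for illustration; not part of the repository.
    """
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# With PyTorch, the two mappings could be built like this (not executed here):
#   ckpt = torch.load(args.model_weight, map_location='cpu')
#   checkpoint_shapes = {k: tuple(v.shape) for k, v in ckpt.items()}
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}

# Shapes taken from the error message reported in this issue.
checkpoint_shapes = {
    "pos_emb": (1, 40, 256),
    "tok_emb.weight": (94, 256),
    "blocks.0.attn.mask": (1, 1, 74, 74),
    "head.weight": (94, 256),
}
model_shapes = {
    "pos_emb": (1, 54, 256),
    "tok_emb.weight": (26, 256),
    "blocks.0.attn.mask": (1, 1, 54, 54),
    "head.weight": (26, 256),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, ckpt_shape, model_shape in mismatches:
    print(f"{name}: checkpoint {ckpt_shape} vs. model {model_shape}")
```

Once the model is built with the same configuration that was used for training, the list should come back empty.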
