I have encountered a problem in this part:
```python
model = GPT(mconf)
# strict=False skips missing/unexpected keys, but does NOT tolerate shape mismatches
model.load_state_dict(torch.load(args.model_weight, map_location='cpu'), strict=False)
# model.to('cpu')
print('Model loaded')
```
When I try to run this section, the following error occurs:
```
RuntimeError: Error(s) in loading state_dict for GPT:
    size mismatch for pos_emb: copying a param with shape torch.Size([1, 40, 256]) from checkpoint, the shape in current model is torch.Size([1, 54, 256]).
    size mismatch for tok_emb.weight: copying a param with shape torch.Size([94, 256]) from checkpoint, the shape in current model is torch.Size([26, 256]).
    size mismatch for blocks.0.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.1.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.2.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.3.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.4.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.5.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.6.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.7.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for head.weight: copying a param with shape torch.Size([94, 256]) from checkpoint, the shape in current model is torch.Size([26, 256]).
```
How can this be solved? I would greatly appreciate any help.
Hi! This occurs when running the training scripts to reproduce the models and then attempting to generate molecules with generate.sh.

The root cause is the scaffold_max_len parameter. As train.py and model.py are currently written, it affects the size of the attention mask regardless of whether a scaffold condition is actually used, so a model trained with one value cannot load a checkpoint saved with another. You can either remove the parameter in the train script, or manually set scaffold_max_len to 100 in generate.py so the generation-time model's dimensions match the checkpoint's.
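As a quick diagnostic, you can compare parameter shapes between the checkpoint and the freshly built model before calling load_state_dict, which pinpoints whether the mismatch is sequence-length related (pos_emb, attn.mask) or vocabulary related (tok_emb, head). This is a minimal sketch, not code from this repo; the helper name and the example shapes (copied from the error above) are illustrative:

```python
def shape_mismatches(ckpt_shapes, model_shapes):
    """Return {param_name: (checkpoint_shape, model_shape)} for every
    parameter present in both dicts whose shapes differ."""
    return {
        name: (ckpt_shapes[name], model_shapes[name])
        for name in ckpt_shapes
        if name in model_shapes and ckpt_shapes[name] != model_shapes[name]
    }

# In real use these would come from the state dicts, e.g.
#   ckpt = {k: tuple(v.shape) for k, v in torch.load(path, map_location='cpu').items()}
#   cur  = {k: tuple(v.shape) for k, v in model.state_dict().items()}
# Here we hard-code two shapes from the error message for illustration.
ckpt = {"pos_emb": (1, 40, 256), "tok_emb.weight": (94, 256)}
cur = {"pos_emb": (1, 54, 256), "tok_emb.weight": (26, 256)}

for name, (a, b) in shape_mismatches(ckpt, cur).items():
    print(f"{name}: checkpoint {a} vs model {b}")
```

If pos_emb and the attention masks disagree, the block size (max_len plus scaffold_max_len) differs between training and generation; if tok_emb.weight and head.weight disagree, the vocabulary differs, which usually means a different dataset or regex was used to build it.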