Hi!
You have done a great job! I have been training two different models: the one mentioned in the title ("ixa-ehu/ixambert-base-cased") and multibert_cased. With the multibert model I had no problems during training, but when I try to train the other model I get a mismatch in the vocabulary size.
In the config file of the "ixa-ehu/ixambert-base-cased" model, the vocabulary size is the following:
08/18/2022 09:41:28 - INFO - awesome_align.configuration_utils - Model config BertConfig {
"architectures": null,
"attention_probs_dropout_prob": 0.1,
"bos_token_id": null,
"do_sample": false,
"eos_token_ids": null,
"finetuning_task": null,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"id2label": {
"0": "LABEL_0",
"1": "LABEL_1"
},
"initializer_range": 0.02,
"intermediate_size": 3072,
"is_decoder": false,
"label2id": {
"LABEL_0": 0,
"LABEL_1": 1
},
"layer_norm_eps": 1e-12,
"length_penalty": 1.0,
"max_length": 20,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 12,
"num_beams": 1,
"num_hidden_layers": 12,
"num_labels": 2,
"num_return_sequences": 1,
"output_attentions": false,
"output_hidden_states": false,
"output_past": true,
"pad_token_id": null,
"repetition_penalty": 1.0,
"temperature": 1.0,
"top_k": 50,
"top_p": 1.0,
"torchscript": false,
"type_vocab_size": 2,
"use_bfloat16": false,
"vocab_size": 119099
}
When I start the training, I get this error:
Iteration: 0%| | 0/40000 [00:00<?, ?it/s]Traceback (most recent call last):
File "/mnt/datuak/virtualenvs/transformers/bin/awesome-train", line 8, in
sys.exit(main())
File "/mnt/datuak/virtualenvs/transformers/lib/python3.6/site-packages/awesome_align/run_train.py", line 848, in main
global_step, tr_loss = train(args, train_dataset, model, tokenizer)
File "/mnt/datuak/virtualenvs/transformers/lib/python3.6/site-packages/awesome_align/run_train.py", line 370, in train
loss = model(inputs_src=inputs_src, labels_src=labels_src)
File "/mnt/datuak/virtualenvs/transformers/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/datuak/virtualenvs/transformers/lib/python3.6/site-packages/awesome_align/modeling.py", line 660, in forward
masked_lm_loss = loss_fct(prediction_scores_src.view(-1, self.config.vocab_size), labels_src.view(-1))
RuntimeError: shape '[-1, 119101]' is invalid for input of size 5716752
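If my arithmetic is right, the numbers in the error already hint at the problem: the flattened logits tensor has 5,716,752 elements, which is exactly 48 x 119099 but is not divisible by 119101, so the view(-1, 119101) call cannot succeed. A quick check in Python:

elements = 5716752
print(elements // 119099, elements % 119099)  # 48 positions, remainder 0 -> the logits are 119099 wide
print(elements % 119101)                      # non-zero remainder -> view(-1, 119101) must fail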
As you can see, the vocab_size has increased by 2, from 119099 to 119101. This is due to the CLS and SEP tokens; however, I don't know why it causes this error. I have tried manually decreasing the vocab_size in the code, but that leads to other errors when I compute the alignments.
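For what it's worth, one way to see the mismatch outside of awesome_align would be a minimal check with the stock transformers library. This is only a sketch under that assumption (awesome_align bundles its own modeling and tokenization code, so I am not sure the same calls apply there):

from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("ixa-ehu/ixambert-base-cased")
model = AutoModelForMaskedLM.from_pretrained("ixa-ehu/ixambert-base-cased")

# Vocabulary size the tokenizer actually produces vs. the one stored in the model config
print(len(tokenizer), model.config.vocab_size)

# In plain transformers the usual remedy is to resize the embedding/output layers
# to the tokenizer length instead of editing vocab_size by hand
model.resize_token_embeddings(len(tokenizer))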
Here is the awesome-train command I used for training:
CUDA_VISIBLE_DEVICES=1 awesome-train \
    --output_dir=$OUTPUT_DIR \
    --model_name_or_path=ixa-ehu/ixambert-base-cased \
    --extraction 'softmax' \
    --do_train \
    --train_mlm \
    --train_tlm \
    --train_tlm_full \
    --train_so \
    --train_psi \
    --train_co \
    --train_data_file=$TRAIN_FILE \
    --per_gpu_train_batch_size 2 \
    --gradient_accumulation_steps 4 \
    --num_train_epochs 1 \
    --learning_rate 2e-5 \
    --save_steps 10000 \
    --max_steps 40000
Could you please help me solve this issue?
Thanks!