
finetune on GLUE task ends up with same probability #332

Open
TingchenFu opened this issue Apr 5, 2021 · 2 comments

Comments

@TingchenFu

Hi guys,
First of all, thanks for your great model! I fine-tuned the pretrained model mlm_tlm_xnli15_1024.pth on the MNLI-m task (to be specific, a two-class classification task), setting the hyperparameters as recommended:
python glue-xnli.py
--exp_name test_xnli_mlm_tlm             # experiment name
--dump_path ./dumped/                    # where to store the experiment
--model_path mlm_tlm_xnli15_1024.pth     # model location
--data_path ./data/processed/XLM15       # data location
--transfer_tasks XNLI,SST-2              # transfer tasks (XNLI or GLUE tasks)
--optimizer_e adam,lr=0.000025           # optimizer of projection (lr \in [0.000005, 0.000025, 0.000125])
--optimizer_p adam,lr=0.000025           # optimizer of projection (lr \in [0.000005, 0.000025, 0.000125])
--finetune_layers "0:_1"                 # fine-tune all layers
--batch_size 8                           # batch size (\in [4, 8])
--n_epochs 250                           # number of epochs
--epoch_size 20000                       # number of sentences per epoch
--max_len 256                            # max number of words in sentences
--max_vocab 95000                        # max number of words in vocab
But after several epochs I got EXACTLY the same probability output for all the validation cases:
-0.27187905 -0.27174124 -0.27346167 -0.27336964 -0.27150354 -0.27345833 -0.2712339 -0.2730249 -0.2720655 -0.2718483
Each number is the probability of the example being classified as the positive class, as given by the model.
Could anyone tell me what happened, and is there any possible solution for this?
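
For reference, a minimal sketch of the kind of sanity check involved (the values are simply copied from the output above): if the standard deviation is essentially zero, every example gets the same score, i.e. the head has collapsed to a constant. Note the values are negative, so they look more like logits or log-probabilities than probabilities in [0, 1].

import numpy as np

# Scores copied from the validation output posted above.
scores = np.array([-0.27187905, -0.27174124, -0.27346167, -0.27336964, -0.27150354,
                   -0.27345833, -0.2712339, -0.2730249, -0.2720655, -0.2718483])

# A (near-)zero standard deviation means the classifier outputs a constant.
print('mean:', scores.mean(), 'std:', scores.std())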

@TingchenFu
Author

I found that after the first embedding layer in TransformerModel.fwd

tensor = self.embeddings(x)

tensor is the same for all the different cases. self.embeddings is defined as:

self.embeddings = Embedding(self.n_words, self.dim, padding_idx=self.pad_index)

where self.n_words=95000 and self.dim=1024, as in the pretrained params. Is there anything wrong?
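
To isolate where things collapse, here is a minimal standalone sketch of the same lookup (the pad_index value and the checkpoint key names in the comments are assumptions, not taken from the repo): if two different token-id tensors produce identical embeddings, or if the loaded weight rows are all (near) identical, the problem is in how the data or checkpoint is fed, not in the fine-tuning itself.

import torch
import torch.nn as nn

n_words, dim, pad_index = 95000, 1024, 2   # pad_index=2 is an assumption

emb = nn.Embedding(n_words, dim, padding_idx=pad_index)
# In the real run the weights come from the checkpoint, e.g. (assumed key names):
# state = torch.load('mlm_tlm_xnli15_1024.pth', map_location='cpu')
# emb.weight.data.copy_(state['model']['embeddings.weight'])

# Two clearly different "sentences" of token ids, shaped (slen, bs) as in the repo's convention.
x1 = torch.randint(0, n_words, (12, 1))
x2 = torch.randint(0, n_words, (12, 1))

t1, t2 = emb(x1), emb(x2)
print('inputs identical:    ', torch.equal(x1, x2))
print('embeddings identical:', torch.allclose(t1, t2))
print('weight row std:      ', emb.weight.std(dim=1).mean().item())
# If the inputs differ but the embeddings do not (or the weight rows are all
# nearly identical), the issue is in loading/feeding, not in the optimization.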

@TingchenFu
Author

The training log is here: https://paste.ubuntu.com/p/SbDw33JPjN/
and the complete probability results on the validation set are here: https://paste.ubuntu.com/p/xXD9FfGdcT/
