Dev loss does not drop when finetuning #8

Open
yananchen1989 opened this issue Jul 9, 2021 · 1 comment

Hello, I am reusing the code to fine-tune the model, e.g. CBERT. However, the dev loss does not drop, while the loss on the training set looks normal. I wonder if this is OK?

Here is the fine-tuning log:

07/09/2021 11:30:11 - INFO - main - ***** Running training *****
07/09/2021 11:30:11 - INFO - main - Num examples = 200
07/09/2021 11:30:11 - INFO - main - Batch size = 8
07/09/2021 11:30:11 - INFO - main - Num steps = 250
Epoch: 0%| | 0/10 [00:00<?, ?it/s]/root/yanan/env_cbert/lib/python3.6/site-packages/transformers/optimization.py:155: UserWarning: This overload of add_ is deprecated:
add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
add_(Tensor other, *, Number alpha) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
exp_avg.mul_(beta1).add_(1.0 - beta1, grad)
07/09/2021 11:30:15 - INFO - main - Epoch 0, Dev loss 68.01114630699158
07/09/2021 11:30:15 - INFO - main - Epoch 0, Train loss 30.59469723701477
07/09/2021 11:30:15 - INFO - main - Saving model. Best dev so far 68.01114630699158
Epoch: 10%|█████ | 1/10 [00:06<00:54, 6.06s/it]07/09/2021 11:30:20 - INFO - main - Epoch 1, Dev loss 71.50074660778046
07/09/2021 11:30:20 - INFO - main - Epoch 1, Train loss 7.842194274067879
Epoch: 20%|██████████ | 2/10 [00:09<00:42, 5.27s/it]07/09/2021 11:30:24 - INFO - main - Epoch 2, Dev loss 74.61862516403198
07/09/2021 11:30:24 - INFO - main - Epoch 2, Train loss 1.4141736282035708
Epoch: 30%|███████████████ | 3/10 [00:12<00:33, 4.72s/it]07/09/2021 11:30:27 - INFO - main - Epoch 3, Dev loss 75.86035788059235
07/09/2021 11:30:27 - INFO - main - Epoch 3, Train loss 0.6085959081538022
Epoch: 40%|████████████████████ | 4/10 [00:16<00:26, 4.35s/it]07/09/2021 11:30:31 - INFO - main - Epoch 4, Dev loss 76.05813992023468
07/09/2021 11:30:31 - INFO - main - Epoch 4, Train loss 0.12450884422287345
Epoch: 50%|█████████████████████████ | 5/10 [00:19<00:20, 4.08s/it]07/09/2021 11:30:34 - INFO - main - Epoch 5, Dev loss 76.5591652393341
07/09/2021 11:30:34 - INFO - main - Epoch 5, Train loss 0.0748452718835324
Epoch: 60%|██████████████████████████████ | 6/10 [00:23<00:15, 3.90s/it]07/09/2021 11:30:38 - INFO - main - Epoch 6, Dev loss 77.40109157562256
07/09/2021 11:30:38 - INFO - main - Epoch 6, Train loss 0.09374479297548532
Epoch: 70%|███████████████████████████████████ | 7/10 [00:26<00:11, 3.77s/it]07/09/2021 11:30:41 - INFO - main - Epoch 7, Dev loss 77.90590262413025
07/09/2021 11:30:41 - INFO - main - Epoch 7, Train loss 0.10057421837700531
Epoch: 80%|████████████████████████████████████████ | 8/10 [00:30<00:07, 3.67s/it]07/09/2021 11:30:44 - INFO - main - Epoch 8, Dev loss 77.9272027015686
07/09/2021 11:30:44 - INFO - main - Epoch 8, Train loss 0.03545364388264716
Epoch: 90%|█████████████████████████████████████████████ | 9/10 [00:33<00:03, 3.62s/it]07/09/2021 11:30:48 - INFO - main - Epoch 9, Dev loss 77.97029888629913
07/09/2021 11:30:48 - INFO - main - Epoch 9, Train loss 0.49601738521596417
Epoch: 100%|█████████████████████████████████████████████████| 10/10 [00:37<00:00, 3.72s/it]
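
As a side note, the UserWarning in the log comes from the older transformers optimizer calling the deprecated `Tensor.add_(Number, Tensor)` overload; it is harmless and unrelated to the dev loss. A minimal sketch of the equivalent, non-deprecated call (the tensors below are placeholders, not the repo's actual optimizer state):

```python
import torch

# Placeholder optimizer state, for illustration only.
beta1 = 0.9
exp_avg = torch.zeros(4)
grad = torch.randn(4)

# Deprecated overload (what the old transformers optimization.py uses):
#   exp_avg.mul_(beta1).add_(1.0 - beta1, grad)
# Equivalent call with the current signature, which silences the warning:
exp_avg.mul_(beta1).add_(grad, alpha=1.0 - beta1)
```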


HHY-ZJU commented Jun 14, 2022


I have also encountered a similar situation. What is the reason? Have you solved it?
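
For reference, the pattern in the log above, with train loss falling from ~30 to below 0.1 while dev loss climbs from ~68 to ~78 over 10 epochs on only 200 training examples, is the usual signature of overfitting; the script already keeps the checkpoint with the best dev loss (epoch 0 in this run). Below is a minimal early-stopping sketch, not the repo's actual training code, where `train_one_epoch` and `evaluate` are assumed helpers that return the average loss over the train and dev sets:

```python
import torch

def finetune_with_early_stopping(model, train_one_epoch, evaluate,
                                 num_epochs=10, patience=2,
                                 ckpt_path="best_model.pt"):
    """Minimal early-stopping loop (a sketch, not the repo's training code).

    `train_one_epoch(model)` and `evaluate(model)` are assumed callables that
    return the average train/dev loss for one pass over the data.
    """
    best_dev, bad_epochs = float("inf"), 0
    for epoch in range(num_epochs):
        train_loss = train_one_epoch(model)
        dev_loss = evaluate(model)
        print(f"Epoch {epoch}, Dev loss {dev_loss}, Train loss {train_loss}")
        if dev_loss < best_dev:
            best_dev, bad_epochs = dev_loss, 0
            torch.save(model.state_dict(), ckpt_path)  # keep only the best-dev checkpoint
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # dev loss stopped improving; further epochs only overfit
    return best_dev
```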
