
Thanks for open-sourcing the dataset; a question about training #18

Open
zhy844694805 opened this issue Jul 15, 2024 · 1 comment
@zhy844694805

[screenshot attachment]
The validation loss keeps increasing and the accuracy keeps dropping. Could you share how you trained the model?
Here are the training parameters I used with llama-factory:

### model

model_name_or_path: /root/autodl-tmp/Qwen2-7B-Chat

### method

stage: sft
do_train: true
finetuning_type: lora
lora_target: all
pissa_init: true
pissa_iter: 16
pissa_convert: true

### dataset

dataset: train
template: qwen
cutoff_len: 512
max_samples: 200000
overwrite_cache: true
preprocessing_num_workers: 16

### output

output_dir: saves/qwen2-7b/lora/pissa
logging_steps: 10
save_steps: 3000
plot_loss: true
overwrite_output_dir: true

### train

per_device_train_batch_size: 6
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 20.0
lr_scheduler_type: cosine
warmup_ratio: 0.05
bf16: true
ddp_timeout: 180000000

### eval

val_size: 0.05
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 3000
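
For reference, a config like this is usually saved to a YAML file and launched with LLaMA-Factory's CLI; the file name below is just a placeholder:

llamafactory-cli train qwen2_pissa_sft.yaml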

@scutcyr (Owner) commented Aug 22, 2024
The learning rate is set too high, so the model overfits early.
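
A minimal sketch of the kind of change this points at, keeping everything else in the config above (the specific numbers are illustrative assumptions, not the maintainer's settings):

### train
learning_rate: 1.0e-5   # roughly an order of magnitude below the 1.0e-4 above
num_train_epochs: 3.0   # 20 epochs of SFT invites overfitting; fewer passes is safer

With a change like this, it is worth watching eval_loss at each eval_steps checkpoint and keeping the checkpoint where it bottoms out.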
