Hi, thank you so much for releasing this wonderful codebase. When I try to run pretrain_llama_7b on a TPU v3 pod, I get this error:
ERROR: Accessing retired flag 'jax_enable_async_collective_offload'
It seems related to the flags specified before launching the job:
export LIBTPU_INIT_ARGS='--xla_jf_spmd_threshold_for_windowed_einsum_mib=0 --xla_tpu_spmd_threshold_for_allgather_cse=10000 --xla_tpu_spmd_rewrite_einsum_with_reshape=true --xla_enable_async_all_gather=true --jax_enable_async_collective_offload=true --xla_tpu_enable_latency_hiding_scheduler=true TPU_MEGACORE=MEGACORE_DENSE'
I am wondering whether these flags are all necessary, and whether one of them could be causing the error. Thank you very much for your time and help!
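For what it's worth, a workaround sketch (my assumption, not confirmed by the maintainers): the error message says `jax_enable_async_collective_offload` is a retired flag, and unlike the other entries it is a JAX flag rather than an XLA/libtpu flag, so simply dropping it from `LIBTPU_INIT_ARGS` may avoid the crash:

```shell
# Assumption: the retired JAX flag 'jax_enable_async_collective_offload'
# does not belong in LIBTPU_INIT_ARGS (which is for XLA/libtpu flags),
# so we export the same string with that one entry removed.
export LIBTPU_INIT_ARGS='--xla_jf_spmd_threshold_for_windowed_einsum_mib=0 --xla_tpu_spmd_threshold_for_allgather_cse=10000 --xla_tpu_spmd_rewrite_einsum_with_reshape=true --xla_enable_async_all_gather=true --xla_tpu_enable_latency_hiding_scheduler=true TPU_MEGACORE=MEGACORE_DENSE'
```

The remaining XLA flags should be unaffected; whether the async all-gather behavior still applies without the JAX-side flag is something I have not verified.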
Same problem here.
Did you find a solution? I have the same error.