Are the ngram and ppl scripts currently capable of leveraging multi-GPU setups for inference? If not, are there any planned updates or workarounds that would support this?
Additionally, would it be possible to integrate vLLM support to accelerate the ngram computations?
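As a possible interim workaround (not an existing feature of these scripts), perplexity can be computed data-parallel by sharding the corpus across GPUs, scoring each shard with a separate model replica, and combining the per-shard negative log-likelihood totals. The helpers below are a minimal sketch of that aggregation logic; the names `shard` and `combine_ppl` are illustrative, and the actual per-GPU scoring loop (model loading, tokenization) is assumed to exist elsewhere:

```python
import math

def shard(docs, n_workers):
    """Round-robin split of documents across n_workers GPU processes.

    Each worker i would pin its model replica to device cuda:i and
    score only its own shard.
    """
    return [docs[i::n_workers] for i in range(n_workers)]

def combine_ppl(shard_stats):
    """Merge per-shard (total_nll, total_tokens) pairs into one perplexity.

    Summing NLL and token counts before exponentiating is equivalent to
    computing perplexity over the whole corpus in a single process.
    """
    total_nll = sum(nll for nll, _ in shard_stats)
    total_tokens = sum(toks for _, toks in shard_stats)
    return math.exp(total_nll / total_tokens)
```

Because perplexity decomposes into a token-weighted sum of log-likelihoods, this aggregation gives the same result as single-GPU scoring, so the parallelism only affects throughput, not the metric.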