Currently, the implementation only works for batches of prompts that all have the same length (the input is a single tensor and generation starts right after it), but it would be nice to be able to pass a list of prompts of variable lengths.

This requires a masking strategy and a way to tell the model which tokens to keep: if the prompts in a batch have different lengths, generation has to start at the end of the shortest prompt, and at those positions the longer prompts already contain known tokens. We need a way to keep the known prompt tokens and only save the newly generated tokens for positions past each prompt's end. A rough sketch of this idea is shown below.
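A minimal sketch of one way this could work, assuming a `model(tokens)` call that returns per-position logits of shape `(batch, seq, vocab)` (that interface, the `pad_id` default, and greedy decoding are all assumptions made for illustration, not the repo's actual API). Prompts are right-padded into one buffer, decoding starts at the shortest prompt's end, and at each step positions still covered by a longer prompt keep their known token instead of the sampled one:

```python
import torch

def generate_variable_length(model, prompts, max_new_tokens, pad_id=0):
    """Sketch: batch generation for prompts of different lengths.

    `prompts` is a list of 1-D LongTensors; `model(tokens)` is assumed to
    return logits of shape (batch, seq, vocab).
    """
    lengths = torch.tensor([p.size(0) for p in prompts])
    batch = len(prompts)
    min_len, max_len = lengths.min().item(), lengths.max().item()

    # Right-pad every prompt into one buffer that also receives generations.
    tokens = torch.full((batch, max_len + max_new_tokens), pad_id, dtype=torch.long)
    for i, p in enumerate(prompts):
        tokens[i, : p.size(0)] = p

    # Start at the end of the shortest prompt and fill one position per step.
    for pos in range(min_len, max_len + max_new_tokens):
        logits = model(tokens[:, :pos])           # (batch, pos, vocab)
        next_tok = logits[:, -1].argmax(dim=-1)   # greedy pick, for simplicity

        # Keep known prompt tokens: only overwrite where the prompt has ended.
        still_prompt = lengths > pos
        tokens[:, pos] = torch.where(still_prompt, tokens[:, pos], next_tok)

    # Shorter prompts generate a few extra tokens; truncate each sequence to
    # its own prompt length plus max_new_tokens.
    return [tokens[i, : lengths[i] + max_new_tokens] for i in range(batch)]
```

An alternative would be to left-pad the prompts and pass an attention mask so the padding positions are ignored; that avoids overwriting, at the cost of threading the mask through the model's forward pass.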