Add Model garden class to handle Llama2 interactions #41
Conversation
Llama2 models hosted on VertexAI return a response object that is parsed incorrectly by the generic VertexAIModelGarden class. Each character of the response is treated as a "prediction", and as such the full response text is not correctly extracted. The model response also includes the original prompt, which needs to be stripped, along with the keywords "Prompt:" and "Output:". This Llama2-specific model class introduces the following behaviors:
- Process the full text returned by the model
- Extract only the newly generated text from the response
- Support passing in model parameter values by setting corresponding props on the object; the core VertexAI objects support this behavior, so this adds consistency and good dev "ergonomics"
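For concreteness, the stripping behavior described above could look something like this. This is a minimal sketch, not the actual code in this PR, and the helper name is made up for illustration:

```python
def _strip_llama2_markers(full_text: str) -> str:
    """Hypothetical helper: extract only the newly generated text.

    Per the description above, the raw Llama2 response looks like
    "Prompt: <original prompt> Output: <generated text>", so everything up
    to and including the "Output:" marker is echoed input, not generation.
    """
    _, sep, generated = full_text.partition("Output:")
    return generated.strip() if sep else full_text.strip()
```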
@izo0x90 thanks for the contribution! I believe this bug was fixed in the v0.1.0 release.
I appear to still be seeing this behavior. Looking at the code of the installed lib in the venv and at the current code on GitHub, it still iterates over the response text character by character. Let me know if I am not accounting for something else you want me to update/modify and test? Thanks.

Attached: the code as it appears in the installed package, Poetry output showing the updated versions, and a test case showing the class returning the 'P' character from the full text response, "Prompt: Some prompt, Output: Actual generated model text":
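To make the failure mode concrete, here is a minimal repro sketch of the parsing bug (the variable names are illustrative, not the library's actual internals):

```python
# A Llama2 Model Garden endpoint can return the whole completion as one
# string; code that iterates over it as if it were a list of predictions
# ends up with single characters instead.
prediction = "Prompt: Some prompt, Output: Actual generated model text"

generations = [text for text in prediction]
print(generations[0])  # 'P' -- matching the test case described above
```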
How do you import VertexAIModelGarden? It looks like there was a stupid bug, I'm sorry about it:
Thanks, my fault for not noticing that the actual updated code had moved to a different module. You are correct: on v0.1.0, importing directly from the top package namespace, I get the full response. ✅

Is it worth modifying this PR (or making a new one) to still include a specialized Llama2 class that can strip the prompt and "field name" text from the response, and also allow for passing in model params by setting them on the model object, so that behavior is consistent with the core Google model handler classes? Or are model-specific classes not something you want in the library?
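For anyone following along, the working import on v0.1.0 looks roughly like this (assuming the langchain-google-vertexai package discussed in this thread; the project and endpoint values are placeholders):

```python
from langchain_google_vertexai import VertexAIModelGarden

# Placeholder project / endpoint values for illustration only.
llm = VertexAIModelGarden(project="my-project", endpoint_id="my-endpoint-id")
print(llm.invoke("Some prompt"))
```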
@izo0x90 no worries, and thanks for your help! I definitely don't want to have classes for all 100 models (for every model presented on Model Garden). On the other side, we've introduced a precedent with Gemma :). Maybe what we could do:

- allow passing a post-processor in on instantiation of VertexAIModelGarden, so users can plug in model-specific response clean-up

wdyt? @markmcd @kurtisvg @sasha-gitg any other thoughts maybe?
@lkuligin Ya, that totally makes sense; having an explosion of classes for every third-party model seems like a potential maintenance nightmare 🤣. Gemma is a Google model, so that also makes sense. I like the idea of passing in a post-processor on instantiation. Maybe there could be a module that houses post-processors for different models, and you just import one from there and pass it in to the VertexAIModelGarden constructor, as in the sketch below?
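A rough sketch of what that could look like; the module path and the constructor keyword are hypothetical, and nothing like this exists in the library yet:

```python
from langchain_google_vertexai import VertexAIModelGarden
# Hypothetical module housing per-model response clean-up functions.
from langchain_google_vertexai.post_processors import llama2_post_processor

llm = VertexAIModelGarden(
    project="my-project",
    endpoint_id="my-endpoint-id",
    post_processor=llama2_post_processor,  # hypothetical keyword argument
)
```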
Another question I have is: what about the ability to pass model parameters to the request by setting object attributes on a VertexAIModelGarden instance? I find that quite intuitive when using these abstractions, and it is also consistent with the behavior of the core model classes (see the sketch below). Or is this also something that was already added to this new version of VertexAIModelGarden? I haven't tested that lol.
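For reference, this is the pattern being asked about: the core classes accept generation parameters directly as kwargs/attributes. A sketch with placeholder values:

```python
from langchain_google_vertexai import VertexAI

# Core model classes take generation parameters directly; the open question
# above is whether VertexAIModelGarden forwards them the same way.
llm = VertexAI(model_name="gemini-pro", temperature=0.2, max_output_tokens=256)
```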