
Add Model garden class to handle Llama2 interactions #41

Closed


@izo0x90 commented Feb 27, 2024

Llama2 models hosted on VertexAI return a response object that is parsed incorrectly by the generic VertexAIModelGarden class. Each character of the response is treated as a "prediction", and as such the full response text is not correctly extracted.

The model response also includes the original prompt, which needs to be stripped, along with the keywords "Prompt:" and "Output:".

This Llama2-specific model class introduces the following behaviors:

  • Process the full text returned by the model

  • Extract only the newly generated text from the response (a rough sketch of this stripping follows the list)

  • Support passing in model parameter values by setting corresponding props on the object; the core VertexAI objects already support this behavior, so it adds consistency and good dev "ergonomics"
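
A rough sketch of what the stripping might look like (illustrative only; the helper name and exact markers are assumptions based on this description, not the PR's actual implementation):

```python
def strip_llama2_response(raw: str, prompt: str) -> str:
    """Hypothetical helper: pull only the newly generated text out of a
    Llama2-on-Vertex response shaped like 'Prompt: <prompt> Output: <text>'."""
    # Keep only what follows the "Output:" marker, if present.
    _, _, after_output = raw.partition("Output:")
    text = (after_output or raw).strip()
    # Drop a leading echo of the original prompt, if the model repeated it.
    if text.startswith(prompt):
        text = text[len(prompt):].lstrip()
    return text


# Example: only the generated text survives.
strip_llama2_response("Prompt: Say hi\nOutput: Hello!", "Say hi")  # -> "Hello!"
```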

Screenshot 2024-02-27 at 11 43 50 AM

Screenshot 2024-02-27 at 11 44 35 AM

Screenshot 2024-02-27 at 11 45 07 AM

@lkuligin (Collaborator)

@izo0x90 thanks for the contribution!

I believe this bug was fixed in 0.0.6; can you give it a try, please?

@izo0x90 (Author) commented Feb 27, 2024

@lkuligin

> @izo0x90 thanks for the contribution!
>
> I believe this bug was fixed in 0.0.6; can you give it a try, please?

I appear to still be seeing this behavior. Looking at the code of the installed library in my venv, and at the current code on GitHub, it still iterates over the result when calling _parse_prediction, so by my reading and testing this issue will still occur.

Let me know if there is something else I am not accounting for, or anything you would like me to update, modify, and test. Thanks.
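
For context, the failure mode is easy to reproduce in isolation: if the raw response arrives as a single string and the parsing code iterates over it as though it were a list of predictions, each character becomes its own "prediction". A minimal sketch:

```python
# A raw response returned as one string rather than a list of predictions.
response_text = "Prompt: Some prompt, Output: Actual generated model text"

# Iterating over a string yields individual characters, so treating each
# iteration result as a "prediction" returns just the first character.
predictions = [char for char in response_text]
print(predictions[0])  # -> 'P'
```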

Code as it appears in the installed 0.0.6 lib in my project's venv:
Screenshot 2024-02-27 at 12 37 12 PM

Poetry output showing updated versions of langchain-google-vertexai and langchain:
Screenshot 2024-02-27 at 12 38 38 PM

Test case showing the class returning the 'P' character from the full text response, "Prompt: Some prompt, Output: Actual generated model text":
Screenshot 2024-02-27 at 12 38 46 PM

@lkuligin (Collaborator)

How do you import VertexAIModelGarden?
It looks like if you import it as `from langchain_google_vertexai import VertexAIModelGarden` then everything works fine.
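
A minimal usage sketch with that import (project and endpoint values below are placeholders):

```python
from langchain_google_vertexai import VertexAIModelGarden

# Placeholder project/endpoint values; substitute your own deployment.
llm = VertexAIModelGarden(project="my-project", endpoint_id="my-endpoint-id")
print(llm.invoke("Some prompt"))
```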

It looks like there was a stupid bug; I'm sorry about it:
#45

@izo0x90 closed this Feb 28, 2024
@izo0x90 (Author) commented Feb 28, 2024

> How do you import VertexAIModelGarden? It looks like if you import it as `from langchain_google_vertexai import VertexAIModelGarden` then everything works fine.
>
> It looks like there was a stupid bug; I'm sorry about it: #45

@lkuligin

Thanks; my fault for not noticing that the updated code had moved to a different module. You are correct: on v0.1.0, importing directly from the top-level package namespace, I get the full response. ✅

Is it worth modifying this PR (or making a new one) to still include a specialized Llama2 class that can strip the prompt and "field name" text from the response, and also allow passing in model params by setting them on the model object, so that behavior is consistent with the core Google model handler classes? Or are model-specific classes not something you want in the library?

Screenshot 2024-02-28 at 2 02 28 PM

@lkuligin (Collaborator) commented Feb 28, 2024

@izo0x90 no worries, and thanks for your help!
I'm not sure we should always be stripping things, since we should stay close to the underlying model's behavior (e.g., to make it easier for users to switch from one provider to another).

I definitely don't want classes for all 100 models presented on Model Garden. On the other hand, we've introduced a precedent with Gemma :).

Maybe what we could do:

  1. add a separate flag (False by default) controlling whether to post-process the output
  2. given an original response and an underlying model name, we could introduce a post-processing function that does something depending on this name (and make it part of VertexAIModelGarden itself)

wdyt?

@markmcd @kurtisvg @sasha-gitg any other thoughts maybe?
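
A rough sketch of what the proposal above might look like (names and shapes here are hypothetical, not an agreed-on design):

```python
from typing import Callable, Dict

# Hypothetical registry of per-model post-processors, keyed by model name.
POST_PROCESSORS: Dict[str, Callable[[str], str]] = {
    "llama2": lambda raw: raw.partition("Output:")[2].strip() or raw,
}


def postprocess(raw: str, model_name: str, enabled: bool = False) -> str:
    """Hypothetical hook on VertexAIModelGarden: disabled by default,
    dispatches on the underlying model name when enabled."""
    if not enabled:
        return raw
    processor = POST_PROCESSORS.get(model_name)
    return processor(raw) if processor else raw
```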

@izo0x90 (Author) commented Feb 29, 2024

> @izo0x90 no worries, and thanks for your help! I'm not sure we should always be stripping things, since we should stay close to the underlying model's behavior (e.g., to make it easier for users to switch from one provider to another).
>
> I definitely don't want classes for all 100 models presented on Model Garden. On the other hand, we've introduced a precedent with Gemma :).
>
> Maybe what we could do:
>
> 1. add a separate flag (False by default) controlling whether to post-process the output
> 2. given an original response and an underlying model name, we could introduce a post-processing function that does something depending on this name (and make it part of VertexAIModelGarden itself)
>
> wdyt?

@lkuligin Yeah, that totally makes sense; having an explosion of classes for every third-party model seems like a potential maintenance nightmare 🤣. Gemma is a Google model, so that also makes sense.

I like the idea of passing in a post-processor on instantiation. Maybe there could be a module that houses post-processors for different models, so you just import one from there and pass it to the VertexAIModelGarden constructor:

```python
from langchain_google_vertexai import VertexAIModelGarden

# Hypothetical module and function names, as proposed above
from langchain_google_vertexai.post_processors import llama2bhf_post_process

llm = VertexAIModelGarden(post_processor=llama2bhf_post_process)
```

Another question I have: what about the ability to pass model parameters to the request by setting object attributes on a VertexAIModelGarden instance? I find that quite intuitive when using these abstractions, and it is also consistent with the behavior of the core model classes. Or is that something that was already added to this new version of VertexAIModelGarden? I haven't tested that, lol.
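
For clarity, the pattern being asked about looks roughly like this (attribute names are illustrative and assume they mirror the core VertexAI classes):

```python
from langchain_google_vertexai import VertexAIModelGarden

llm = VertexAIModelGarden(project="my-project", endpoint_id="my-endpoint-id")

# Hypothetical: set model parameters as attributes on the instance,
# mirroring how the core VertexAI model classes behave.
llm.temperature = 0.2
llm.max_output_tokens = 256
```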
