Separate providers for inline completion #669
Comments
Short-term, I would suggest two steps:
This is because many chat providers, including SOTA models, work reasonably well as completion providers too. As for prompt templates, the completion and chat prompt templates are separate and configurable on a per-provider basis; see:
jupyter-ai/packages/jupyter-ai-magics/jupyter_ai_magics/providers.py, lines 317 to 321 at e3cd019
jupyter-ai/packages/jupyter-ai-magics/jupyter_ai_magics/providers.py, lines 343 to 347 at e3cd019
Note that largely arbitrary suffix handling can be applied using the Jinja-based prompt template. A larger refactor will likely be desirable at some point, but I would suggest that a good rationale for performing such a refactor be presented first (e.g. what cannot be achieved, or is problematic, with the existing approaches) and the detailed plan agreed on before starting the work.
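For illustration, here is a minimal sketch of a suffix-aware completion template using LangChain's Jinja2 template support; the variable names and surrounding markup are assumptions for illustration, not the actual template shipped in providers.py:

```python
from langchain.prompts import PromptTemplate

# Hypothetical completion template; the markup and variable names are
# illustrative only, not what jupyter-ai ships.
template = PromptTemplate.from_template(
    "Complete the code between the prefix and the suffix.\n"
    "<prefix>{{ prefix }}</prefix>\n<suffix>{{ suffix }}</suffix>",
    template_format="jinja2",
)

print(template.format(prefix="def add(a, b):\n    return ", suffix="\n"))
```

Because the template is free-form, a provider can place the suffix wherever its model expects it (before the prefix, inside sentinel tokens, etc.) without any change to the surrounding code.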
Thanks for the reply. Could you suggest how I could use the code-gecko model from Google VertexAI with the current implementation? The SDK has a separate param for suffix (see https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/code-completion#code-completion-prompt-python_vertex_ai_sdk). I am not sure how to access that param via langchain.
Well, a hacky but simple idea is to create a dummy template that looks like `{prefix}@@@{suffix}`, then recover the parts with `prefix, suffix = prompt.split('@@@')` and call the API/SDK with those two arguments; a sketch follows below.
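A minimal sketch of that workaround, assuming the Vertex AI Python SDK from the docs linked above (the sentinel string is arbitrary, and the response shape may vary by SDK version):

```python
from vertexai.language_models import CodeGenerationModel

SEPARATOR = "@@@"  # sentinel baked into the dummy prompt template

def complete(prompt: str) -> str:
    # The dummy template rendered "{prefix}@@@{suffix}"; split it back apart.
    prefix, _, suffix = prompt.partition(SEPARATOR)
    model = CodeGenerationModel.from_pretrained("code-gecko")
    # code-gecko accepts the suffix as a dedicated parameter
    response = model.predict(prefix=prefix, suffix=suffix)
    return response.text
```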
Yeah, I was thinking something like that but was hoping I wouldn't need to do it. But sure, that's fine in the meantime. Thanks! I'll be looking forward to the two enhancements you suggested; they would help a lot in the short run.
Problem
I would like to propose a separate set of providers for inline completion models, similar to the existing separation between embedding and LLM models. Beyond simply letting users pick different models for chat and inline completion, inline completion models are generally specialized for the task, e.g. starcoder, code-llama, code-gecko, or any of the models on https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard. They also typically have a different interface: they can take an optional suffix, either through a separate parameter or through a model-specific prompt template. It can also be unsafe to assume that any LLM can reliably produce code suitable for inline completion with standard prompt templates and pre/post-processing.
Proposed Solution
Create a new base completion provider class in which the handling of the InlineCompletionRequest to produce suggestions is implemented per model/provider, since the prompt templates, pre/post-processing, and handling of the suffix can differ for each provider (see the sketch below).
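A rough sketch of what such a base class could look like; all names here are hypothetical (including the request fields), nothing like this exists in jupyter-ai today:

```python
from abc import ABC, abstractmethod
from typing import List

class BaseCompletionProvider(ABC):
    """Hypothetical base class: each inline completion provider owns its
    prompt template, pre/post-processing, and suffix handling."""

    id: str  # provider identifier, e.g. "vertexai-code"

    @abstractmethod
    async def generate(self, request) -> List[str]:
        """Turn an InlineCompletionRequest (prefix, suffix, language, ...)
        into a list of suggested completions."""
```

A provider for code-gecko would then pass the suffix to the SDK directly, while a provider for a fill-in-the-middle model like starcoder would render it into sentinel tokens in the prompt.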
Langchain doesn't seem to provide explicit support for these code completion models (unless I am just unaware of it), so it might not be possible to rely on langchain in the same way as for general LLMs and embeddings. For example, a model like Google's code-gecko takes a separate input for the suffix, while langchain LLMs can only take a single string input; one possible bridge is sketched below.
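One way to bridge that gap with langchain as it stands is a custom LLM that smuggles the suffix through the single prompt string, echoing the dummy-template hack above. A sketch assuming the classic langchain `LLM` base class (the sentinel and model name are assumptions):

```python
from typing import Any, List, Optional

from langchain.llms.base import LLM

class CodeGeckoLLM(LLM):
    """Hypothetical wrapper: decodes a '{prefix}@@@{suffix}' prompt and
    forwards both parts to the Vertex AI code completion model."""

    @property
    def _llm_type(self) -> str:
        return "code-gecko"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        from vertexai.language_models import CodeGenerationModel

        # Recover the two inputs from the single string langchain passes in.
        prefix, _, suffix = prompt.partition("@@@")
        model = CodeGenerationModel.from_pretrained("code-gecko")
        return model.predict(prefix=prefix, suffix=suffix).text
```

This works, but it is exactly the kind of encoding trick that a dedicated completion provider interface would make unnecessary.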
Additional context
I'll be willing to work on a PR for this if you'd like me to.