Simplify inline completion backend #553
Conversation
There were a few other minor issues that I had noticed, but I need to take off for my holiday long weekend. I will be back on Tuesday (1/2/2024) to file an issue. Let me know if you'd like the issue filed sooner.
On a conceptual level this is true. I wonder if this change has implications for local models which do not support concurrency: it could mean a breakage when two users request completions from different local models simultaneously. In particular, this could be the case for models from the same provider. I have not checked, but I see this as a real possibility if the model configuration is no longer shared between users.
Looks good on the surface (will take another look later), thanks!
I think we want to keep this feature, but it can be implemented in a separate PR, along with separation of models per chat and per completer.
I believe neither the existing implementation nor this PR supports non-concurrent models; in both cases we are always calling the model directly, with no locking. When using a singleton handler (existing implementation), we could add support simply by serializing access to the model, as in the sketch below.
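A minimal sketch of that idea, assuming an async model object with an `apredict()` coroutine (the wrapper class and method names here are illustrative, not jupyter-ai or LangChain API guarantees):

```python
import asyncio


class SerializedModel:
    """Wrap a model that cannot handle concurrent requests."""

    def __init__(self, model):
        self._model = model
        self._lock = asyncio.Lock()  # allows one in-flight request at a time

    async def apredict(self, prompt: str) -> str:
        # Queue concurrent callers instead of hitting the model in parallel.
        async with self._lock:
            return await self._model.apredict(prompt)
```

Note that with a per-connection handler, the lock would have to live in shared state keyed by model, since each connection gets its own handler instance.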
I think the next step is to allow the completion LM (language model) to be freely chosen by the user and stored client-side, with the model sent per-request. We only enforce global state in the chat, since the chat is seen by all users. However, since inline completion is only seen by the user, I see no reason for this state to live server-side. That would make client-side rendering of the selected completion LM way easier, since the client is now the one storing that state. Authentication is the unknown part here; perhaps for now, authentication & other such fields can remain server-side.
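As an illustration of what "model sent per-request" could look like, here is a purely hypothetical message shape; none of these field names come from the actual jupyter-ai schema:

```python
# Hypothetical inline-completion request if the model ID lived client-side.
request = {
    "number": 7,  # client-side request counter, echoed back in the reply
    "model_id": "openai:gpt-3.5-turbo-instruct",  # the user's chosen completion LM
    "prefix": "def fib(n):\n    ",  # code before the cursor
    "suffix": "",  # code after the cursor
    "stream": True,  # whether to stream partial suggestions
}
```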
Hello authors, I do not know if it is appropriate to ask questions about the inline code feature here. I really like this feature and want to know when it will be released. Please let me know if you have a separate PR or something that records the timeline for this feature. Thanks, ZD
Here are some reasons to keep the model state server-side:

- The initialized model and its state can be cached server-side even if the choice of model is stored client-side. However, invalidating that cache so that it is not emptied back and forth when two users use different models, while also not leaking, is not trivial.
- Some fields can be modified on a per-request basis (temperature, top_k, etc.), but others have to be defined for the model to be initialized (model name, authentication, and in the future file indexing). The fields which can be modified per-request make sense to store client-side; those which are required for initialization should be stored server-side, and authentication in particular should of course be stored server-side (see the sketch below).
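To make the distinction concrete, here is a rough sketch of how the two kinds of fields could be split; the class and field names are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class ModelInitConfig:
    """Required to initialize the model; must stay server-side."""

    model_id: str  # e.g. "openai:gpt-3.5-turbo-instruct"
    api_key: str  # authentication should never be sent to the client


@dataclass
class PerRequestParams:
    """Safe to store client-side and send with each request."""

    temperature: float = 0.7
    top_k: int = 40
```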
@krassowski Thanks for responding to my feedback. Just to confirm: have you reviewed this yet? If so, I can merge this today and rebase the feature branch. My response is that all of these good suggestions you're raising are still possible just by storing the model ID on the client. IMO, this just makes everything easier: the client doesn't need to call the backend to read or update state that only it cares about.
I had a quick glance and it all looked good, but I have reduced availability until the 12th and will not be able to test locally until then, so I would say feel free to merge and we can iterate afterwards.
@krassowski NP, thanks for taking the time to let me know! I don't think there's any urgency for now, since this branch probably shouldn't be released until JupyterLab 4.1.0 is released. |
* do not import from pydantic directly
* refactor inline completion backend
On the other hand, publishing a pre-release (alpha or beta) would allow users to test it with the 4.1.0 beta/rc. Let me know if there is anything I can help with.
* Inline code completions (#465)
  * Draft inline completions implementation (server side)
  * Implement inline completion provider (front)
  * Add default debounce delay and error handling (front)
  * Add `gpt-3.5-turbo-instruct` because text- models are deprecated. OpenAI specifically recommends using `gpt-3.5-turbo-instruct` in favour of text-davinci, text-ada, etc. See: https://platform.openai.com/docs/deprecations/
  * Improve/fix prompt template and add simple post-processing
  * Handle missing `registerInlineProvider`, handle no model in name
  * Remove IPython mention to avoid confusing languages
  * Disable suggestions in markdown, move language logic
  * Remove unused background and clip path from jupyternaut
  * Implement toggling the AI completer via statusbar item; also adds the icon for the provider, re-using the jupyternaut icon
  * Implement streaming support
  * Translate ipython to python for models, remove log
  * Move `BaseLLMHandler` to `/completions`, rename to `LLMHandlerMixin`
  * Move frontend completions code to `/completions`
  * Make `IStatusBar` required for now, lint
* Simplify inline completion backend (#553)
  * do not import from pydantic directly
  * refactor inline completion backend
* Autocomplete frontend fixes (#583)
  * remove duplicate definition of inline completion provider
  * rename completion variables, plugins, token to be more accurate
  * abbreviate JupyterAIInlineProvider => JaiInlineProvider
  * bump @jupyterlab/completer and typescript
  * WIP: fix Jupyter AI completion settings
  * Fix issues with settings population
  * read from settings directly instead of using a cache
  * disable Jupyter AI completion by default
  * improve completion plugin menu items
  * revert unnecessary edits to package manifest
  * Update packages/jupyter-ai/src/components/statusbar-item.tsx
  * tweak wording

Co-authored-by: Michał Krassowski <[email protected]>
Co-authored-by: krassowski <[email protected]>
Co-authored-by: David L. Qiu <[email protected]>
cc @krassowski
Summary of changes

* Use `langchain.pydantic_v1` instead of `pydantic`
* Merge the `BaseInlineCompletionHandler` and `InlineCompletionHandler` definitions
* Rename methods on `BaseInlineCompletionHandler`:
  * `process_message()` => `handle_request()`
  * `stream()` => `handle_stream_request()`
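For reference, the import change amounts to pulling models from LangChain's compatibility shim rather than from `pydantic` itself (a minimal sketch; `BaseModel` is just an example import):

```python
# Before: importing from pydantic directly, which can break when the
# installed pydantic major version differs from the one langchain targets.
# from pydantic import BaseModel

# After: langchain's shim exposes a consistent pydantic v1 API.
from langchain.pydantic_v1 import BaseModel
```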
Context
The existing implementation has two types of handlers (abbreviating: ICH := `InlineCompletionHandler`):

* `BaseICH`/`DefaultICH`: A singleton that is instantiated and bound to `self.settings` when the server extension is initialized.
* `ICH` (in `handlers.py`): The WebSocket handler that accepts requests and calls the appropriate method on `BaseICH`. Instantiated by Tornado automatically per-connection.

This pattern was inherited from how Jupyter AI handles chat. However, that pattern of "Tornado handlers calling singletons" only exists in chat because chat is global, meaning that chat-related state had to be shared across all users. This explains why the singleton pattern exists there. Inline completions, by contrast, don't need to be shared across all users, which means we can actually define `ICH` and `BaseICH` in the same class.
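A minimal sketch of the merged shape, assuming standard Tornado plumbing (this is illustrative, not the actual jupyter-ai source; the `process()` helper and message fields are made up):

```python
import json

from tornado.websocket import WebSocketHandler


class InlineCompletionHandler(WebSocketHandler):
    """Instantiated by Tornado per-connection; no separate singleton."""

    async def on_message(self, message: str):
        request = json.loads(message)
        reply = await self.process(request)
        await self.write_message(json.dumps(reply))

    async def process(self, request: dict) -> dict:
        # Completion state can live on the handler instance itself,
        # since inline completions are only ever seen by one user.
        return {"reply_to": request.get("number"), "items": []}
```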
We can simplify the implementation further by removing the need to track all inline completion sessions in a dictionary. I had to remove the `_modelChanged` signal in the frontend to do this. The existing method of tracking whether a model changed was faulty, as it only emitted the signal when we happened to check whether the model had changed, not when the model was actually changed by the user in the settings UI. If we do in fact want to keep this feature, we can re-implement it correctly.
Lastly, `DefaultICH.process_message()` was rather strange in that it had different behavior depending on whether the request was a stream request. I've chosen to entirely split the interface, and require subclasses to define `handle_request()` and `handle_stream_request()` separately instead.
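The resulting contract, sketched with the method names from this PR (the base class body and signatures here are simplified assumptions, not the exact jupyter-ai definitions):

```python
from abc import ABC, abstractmethod


class BaseInlineCompletionHandler(ABC):
    """Subclasses implement the two request kinds separately, instead of
    branching on a stream flag inside a single process_message()."""

    @abstractmethod
    async def handle_request(self, request) -> None:
        """Reply with a complete suggestion in a single message."""

    @abstractmethod
    async def handle_stream_request(self, request) -> None:
        """Reply with a sequence of incremental chunks."""
```

Splitting the interface makes streaming support explicit in the type, rather than hidden behind a runtime flag.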