Revamped Jupyter AI #1157

govinda18 · 2024-12-18T16:42:23Z

Revamped Jupyter AI

Hi team,

We at D.E. Shaw have forked Jupyter AI and made a significant number of changes to improve the user experience. Our goal was to align Jupyter AI with the capabilities offered by leading AI-based coding tools such as Cursor and Copilot. This pull request includes the majority of the changes we implemented.

This proposal is intended to outline these changes and may require some deliberation. We anticipate that this PR will need to be broken down into several smaller PRs, with some design changes to ensure smooth integration. Additionally, we are planning further enhancements, including per notebook chat and inline code generation, details of which are included in this proposal.

The demo showcases the powerful capabilities of Jupyter AI by incorporating notebook context, leveraging the kernel, and enhancing keyboard accessibility (I did not use my mouse at all while creating this).

jai_demo.mp4

Point of contact: @govinda18 @mlucool

Philosophy

Unlike Jupyter AI, IDEs like Cursor or Copilot have moved towards treating anything in the IDE as fair game for context. We believe this is the correct user experience for sharing context with the LLM. For Jupyter AI to be effective, it should always provide context automatically and allow quick code iterations through insertions and replacements.

We also want to leverage the one major difference that sets notebooks and JupyterLab apart from any other IDE: a runtime kernel. This opens up possibilities where you can include the context of any variable declared in your notebook and further inspect methods and objects to understand their usage. Check out jupyter/enhancement-proposals#128 for our pre-proposal on streamlining this.

For autocompletion to be helpful, it needs to be aware of the context in previous cells to provide a Copilot-like experience. For both chat and inline completion, we automatically send the entire context of the notebook, trimmed to fit the context window of the LLM. See the Current Limitations section for more details.

Lastly, we felt that the chat interface feels a little distant and difficult to access. Most power users of JupyterLab would not use a mouse while working with a notebook, but it's currently not easy to use Jupyter AI without mouse intervention.

Overall, we want to bring the chat and the notebook close enough such that the LLM intelligently understands what the user is talking about without the need to manually decide what context needs to be added. This should further integrate into the natural workflow of the user by providing inline code generation and keyboard accessibility.

What has changed?

This proposal aims to make Jupyter AI much more powerful and accessible, bringing a host of new features designed to enhance user productivity and streamline their data science workflows.

Major Features

Inline Code Completion: The Jupyter AI inline completer is now context-aware, utilizing code from both preceding and succeeding cells to provide more accurate suggestions.

Currently, changes are sent from the frontend (reference), but we believe that jupyter_ydoc should handle this more effectively. We encountered some issues related to jupyter-collaboration issue #202 and did not want to wait for it.

Notebook Aware Chat: The chat model is now aware of your active notebook, current cell, and text selection. You can directly ask it to perform actions like "Refactor this code" and it will refactor your notebook or active cell accordingly, saving you valuable time. This was also proposed in "Add all notebook to context" (#1037).

Again, while this is currently being sent from the frontend, we believe that jupyter_ydoc should handle this more effectively. Refer to this prompt to understand more.

Further, we have added the ability to modify prompts directly from the Jupyter AI settings, enabling quicker iterations. Users can also view the exact prompt sent to the LLM by using the View Prompt call-to-action (CTA) in the chat interface.

Keyboard Accessibility: Navigate Jupyter AI with ease using keyboard shortcuts. Use Ctrl + Shift + K to enter chat mode (this is in line with the eventual Ctrl + K for inline code generation) and Escape to exit (ref). You can easily add generated code to your notebook by navigating with Shift + Tab.

For an enhanced user experience, we have:
- Changed the order of the cell toolbar to Copy, Replace, Add before, Add after. We anticipate most people will want to add the code after the active cell, so it's just a Shift + Tab away. (ref)
- Made focus return to the notebook directly after any action, as that is typically the next step. (ref)

Variable Insertion: Make the Large Language Model (LLM) aware of your variables effortlessly by inserting them into the chat using the @ symbol. This helps the model better understand your context and provide more accurate suggestions by asking the kernel for runtime information. We felt this should work directly, without a prefix like @var:var_name, as it is one of the most powerful benefits of having a live kernel.

This functionality is driven by a variable description registry that knows how to describe any object to an LLM. Additionally, we are working on drafting a proposal to IPython for a generic __llm__ method that can be used by any Python class to describe itself to an LLM. More details are in "Pre-proposal: standardize object representations for ai and a protocol to retrieve them" (jupyter/enhancement-proposals#128).

This also enhances user experience by providing an autocomplete of the variables declared in the currently active notebook. (ref)

The current implementation is also pluggable as documented here.
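
As a rough illustration of the registry idea, here is a minimal sketch. The names `register_describer`, `describe`, and `expand_mentions` are hypothetical and are not the fork's actual API; `__llm__` is only the convention proposed in the pre-proposal, and a plain dict stands in for the kernel's user namespace.

```python
import re
from typing import Any, Callable

# Hypothetical registry: maps a type to a function that describes its instances.
_DESCRIBERS: dict[type, Callable[[Any], str]] = {}


def register_describer(tp: type, fn: Callable[[Any], str]) -> None:
    """Register (or override) the describer used for instances of `tp`."""
    _DESCRIBERS[tp] = fn


def describe(obj: Any) -> str:
    """Describe `obj` for an LLM, trying the most specific source first."""
    # 1. The object describes itself via the proposed __llm__ hook.
    llm = getattr(obj, "__llm__", None)
    if callable(llm):
        return str(llm())
    # 2. A describer registered for the object's class or one of its bases.
    for klass in type(obj).__mro__:
        if klass in _DESCRIBERS:
            return _DESCRIBERS[klass](obj)
    # 3. Fall back to str(), as the proposal does for unknown objects.
    return str(obj)


# Example describer: summarize a list instead of dumping all of it.
register_describer(list, lambda xs: f"list of {len(xs)} items, e.g. {xs[:3]!r}")

_MENTION = re.compile(r"@([A-Za-z_]\w*)")


def expand_mentions(message: str, namespace: dict[str, Any]) -> str:
    """Append descriptions of every @variable that exists in `namespace`."""
    # dict.fromkeys removes duplicate mentions while keeping their order.
    names = [n for n in dict.fromkeys(_MENTION.findall(message)) if n in namespace]
    if not names:
        return message
    lines = [
        f"- {n} ({type(namespace[n]).__name__}): {describe(namespace[n])}"
        for n in names
    ]
    return message + "\n\nVariable context:\n" + "\n".join(lines)
```

Looking up describers along the MRO means a describer registered for a base class also covers its subclasses, which is one way such a registry could stay pluggable.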

Dropped Features

We also suppressed some of the existing features. The reasons differ, but we feel each of these features needs a bit more thought before we can enable it generically.

Note that these features still work but are not shown in the autocomplete (ref) or help message (ref).

Support for @file: We believe that adding an entire file to the context needs a bit more thought. Most users would want to add the current notebook to the context, which we now do by default. Adding different types of files should involve some pre-processing. For example, adding a Python file would be significantly different from how a CSV should be added.

Support for /learn and /ask: There are some fundamental issues with the current implementation:
- The embedding generation occurs in the main thread (last we checked, not sure if it has changed now) thereby blocking the kernel. We often found users shooting themselves in the foot while trying to make the LLM learn a big folder.
- Embedding generation of notebooks would not work great as semantic search on code fails more often than not. A better approach may be to use an LLM based text description of code in case of notebooks.
- /learn and /ask don't scale to a large number of files, as accuracy greatly decreases as you add more embeddings. This is a problem with semantic search in general. Things like hybrid search, reranking and metadata filtering should be researched and added. We found for our internal data that it was too easy to add too much for it to be accurate (adding a smaller amount of data was useful but not very practical).

/fix should be replaced with an eventual Fix with AI button that we plan to have whenever an error occurs in the notebook.
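
On the hybrid search point raised for /learn and /ask, one commonly used way to combine a keyword ranking with an embedding ranking is reciprocal rank fusion. The sketch below is only an illustration of that general technique, not something this PR implements; the function name and the constant `k = 60` (a conventional default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) in every list it appears in, so
    documents ranked well by several retrievers rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    keyword_hits = ["a.ipynb", "b.ipynb", "c.ipynb"]   # e.g. from a keyword index
    semantic_hits = ["b.ipynb", "d.ipynb", "a.ipynb"]  # e.g. from embeddings
    print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

Because only ranks are used, the two retrievers' raw scores never have to be put on a common scale.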

Enhancements

Support for resetting the Jupyter AI config from the UI (ref)
Support for setting the default completion model from the config (ref)
Changed the delete icon on human messages to a menu, similar to the one on AI messages, so that more actions can be added later; the menu also supports copying the prompt sent by the user (ref)
Updated the default prompts for chat input (ref)

Current Limitations

To avoid exceeding the context window when including the entire notebook, we have introduced an abstraction called process_notebook_for_context. Individual providers can implement a method like the one below to optimize the context window:


  # get_max_token, get_token_count_by_model, MAX_NOTEBOOK_TOKENS_PCT and
  # PREFIX_SUFFIX_RATIO are helpers and constants defined elsewhere in the fork.
  def process_notebook_for_context(model_id: str, code_cells: list[str], active_cell: int | None) -> str:
      """
      Processes the notebook to prepare context-aware code for LLM-based completion.

      This method respects the token limit for the notebook context by strategically selecting
      code from surrounding cells (both prefix and suffix) to ensure the current active cell
      has the most relevant context.

      Steps:
      1. The current active cell is taken entirely as the initial context.
      2. Tokens are allocated for suffix cells and added until the token limit is reached.
      3. Remaining tokens are allocated to prefix cells and added similarly.
      4. Any remaining tokens are used to extend the suffix further if the prefix is fully utilized.
      5. Comments are added to indicate the number of cells hidden above and below the context.

      Parameters:
      - model_id (str): The identifier for the LLM model being used.
      - code_cells (list[str]): The list of code cells in the notebook.
      - active_cell (int | None): The index of the currently active cell in the notebook.
                                  Defaults to 0 if not provided.

      Returns:
      - str: The context-aware code string, including the active cell, prefix, and suffix
             code cells, along with comments indicating hidden cells.
      """
      if not code_cells:
          # Nothing to send for a notebook without code cells.
          return ""

      active_cell = active_cell or 0
      code_context = code_cells[active_cell]
      prefix_idx = active_cell - 1
      suffix_idx = active_cell + 1

      total_tokens = get_max_token(model_id) * MAX_NOTEBOOK_TOKENS_PCT

      model_for_counting = model_id
      # The active cell is always included in full; the rest share what is left.
      rem_tokens = total_tokens - get_token_count_by_model(
          code_context, model_for_counting
      )

      # Step 2: reserve a share of the total budget for cells below the active cell.
      max_suffix_tokens = int(total_tokens * PREFIX_SUFFIX_RATIO)
      suffix_tokens_used = 0
      suffix_code: list[str] = []
      while suffix_idx < len(code_cells):
          token_to_be_used = get_token_count_by_model(
              code_cells[suffix_idx], model_for_counting
          )
          if suffix_tokens_used + token_to_be_used > max_suffix_tokens:
              break

          suffix_tokens_used += token_to_be_used
          suffix_code.append(code_cells[suffix_idx])
          suffix_idx += 1

      # Charge only the tokens the suffix actually consumed against the budget.
      rem_tokens -= suffix_tokens_used

      # Step 3: walk upwards from the active cell. Use >= 0 so the first cell
      # of the notebook can be included as well.
      prefix_code: list[str] = []
      while prefix_idx >= 0:
          token_to_be_used = get_token_count_by_model(
              code_cells[prefix_idx], model_for_counting
          )
          if rem_tokens - token_to_be_used < 0:
              break

          rem_tokens -= token_to_be_used
          prefix_code.append(code_cells[prefix_idx])
          prefix_idx -= 1

      # Step 4: if any tokens are remaining, add them to the suffix
      while rem_tokens > 0 and suffix_idx < len(code_cells):
          token_to_be_used = get_token_count_by_model(
              code_cells[suffix_idx], model_for_counting
          )
          if rem_tokens - token_to_be_used < 0:
              break

          rem_tokens -= token_to_be_used
          suffix_code.append(code_cells[suffix_idx])
          suffix_idx += 1

      # Step 5: tell the model how many cells were left out on each side.
      if prefix_idx != -1:
          prefix_code.append(
              f"# Hiding {prefix_idx + 1} more cells above the context provided."
          )

      if suffix_idx != len(code_cells):
          suffix_code.append(
              f"# Hiding {len(code_cells) - suffix_idx} more cells below the context provided."
          )

      # prefix_code was collected bottom-up, so reverse it to restore notebook order.
      return "\n\n".join(prefix_code[::-1] + [code_context] + suffix_code)
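
The function above depends on `get_max_token`, `get_token_count_by_model`, `MAX_NOTEBOOK_TOKENS_PCT`, and `PREFIX_SUFFIX_RATIO`, which are not shown in this proposal. For anyone who wants to experiment with it outside the fork, the stand-ins below are one possibility. All values and model ids here are made up for illustration, and the token counter is a crude regex approximation rather than a real tokenizer.

```python
import re

# Hypothetical values; the fork's real constants are not shown in this proposal.
MAX_NOTEBOOK_TOKENS_PCT = 0.5   # share of the context window given to notebook code
PREFIX_SUFFIX_RATIO = 0.25      # share of that budget reserved for cells below

# Made-up model ids and window sizes, used only to exercise the function.
_CONTEXT_WINDOWS = {"example-small": 4_096, "example-large": 128_000}
_DEFAULT_WINDOW = 4_096

# One "token" per run of word characters or per punctuation character.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def get_max_token(model_id: str) -> int:
    """Return the context window for `model_id`, or a conservative default."""
    return _CONTEXT_WINDOWS.get(model_id, _DEFAULT_WINDOW)


def get_token_count_by_model(text: str, model_id: str) -> int:
    """Approximate the token count of `text`; `model_id` is ignored here."""
    return len(_TOKEN_RE.findall(text))
```

A real provider would presumably swap in its own tokenizer, since the approximation can be far off for some models.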

This PR has only been tested with models that stream, but it should be easy to extend support to those that do not.
Variable Context is limited to basic data types and pandas only. While we default to __str__ for any other object, we aim to expand support further.

Other ideas we are working on

Inline Code Generation: Inspired by Cursor, we are developing an inline code generation feature in Jupyter AI. Although still in development, here is a GIF demonstrating our vision for this feature (note that some major UI changes are still pending):
Per Notebook Chat: Currently, the chat is shared across notebooks, which creates a poor user experience as chat history becomes irrelevant when switching notebooks. We are working on adding support for maintaining a per-notebook chat to address this issue.
Fix with AI Button: Each error in JupyterLab will have a "Fix with AI" button, which essentially implements an inline version of the /fix command, making it easier for users to resolve errors directly within their workflow.

mlucool · 2024-12-18T18:11:11Z

One more thing: because we call into the kernel, the subshell JEP would benefit this implementation as well. Otherwise, things like autocomplete and getting LLM descriptions can be blocked.

dlqqq · 2024-12-18T22:18:32Z

Thank you @govinda18 and @mlucool for contributing to the future of this project by proposing these big ideas! I really appreciate the immense amount of effort put into the design & implementation of the changes. ❤️

These are all very sound ideas. In fact, I believe we have existing issues for each of them. Here is a quick summary for others in this thread:

Improve inline completions w/ more context
Automatic awareness of open notebooks
Keyboard accessibility
Local variable inclusion in chat input
(TBD) Improving @file to handle large files
(TBD) Improve /learn and /ask
(TBD) Dropping /fix while improving the UX for fixing bugs
(TBD) Add a per-notebook chat

Currently, our team has shifted focus away from Jupyter AI v2 to focus on building the next major release, Jupyter AI v3, which is planned for mid-Feb 2025. We are developing this on the v3-dev branch in the same repo. Because v2 & v3 already deviate significantly, it would likely be impractical to port any large changes from v2 => v3. Therefore these proposals would likely be implemented as v3-exclusive features.

Given this, I'll share some context on v3 and provide some suggestions on how we can move forward on this 🤗.

Jupyter AI v3

In v3, all of the logic for managing chat state in the backend & rendering chats in the frontend has been migrated to a project called jupyter-chat. @brichet from QuantStack (a contracting firm) has led development on this. This migration helps to separate concerns and allow multiple extensions to hook into the same chat.

jupyter-chat mostly borrows the frontend of jupyter-ai, but deviates significantly in the backend implementation. jupyter-chat uses a Yjs shared document to model the chat state. jupyter-collaboration, a dependency of jupyter-chat, automatically syncs this shared chat across all clients and persists the chats as *.chat files. This is the same package which provides RTC capabilities in JupyterLab.

Note: The v3-dev branch already supports multiple chats in the same session. I will start working with others to explore if these files can be "tied" to a notebook to provide the "per-notebook chat" idea proposed here.

Suggestions on moving forward

Out of all the fantastic ideas proposed here, I think local variable inclusion (e.g. @foobar) is the most valuable and least ambiguous. Having this would be a killer feature for Jupyter AI, without question. jupyter-chat also already provides an input suggestion API that reads Jupyter AI's slash commands and shows them when / is pressed. Question: Does local variable inclusion sound like a reasonable first step?

v3 is still early in development, so jupyter-chat will require many changes if we want to see these ideas implemented in v3. To plan this effort, it would be helpful to know how much time is available from each of our teams. Question: How much commitment can you dedicate towards developing these capabilities?

On our side, I am the only person working full-time on Jupyter AI, while @brichet is the only person working full-time on Jupyter Chat.

Finally, it may be helpful to establish a dedicated communication channel for technical discussion as needed. Question: Do you all want a comms channel? AWS has a Slack workspace that allows for external connections; I can explore this if interested.

krassowski · 2024-12-18T22:44:58Z

Finally, it may be helpful to establish a dedicated communication channel for technical discussion as needed.

I don't know if AWS team had a chance to explore the Jupyter Zulip channel but a lot of dev chatter around Jupyter nowadays happens over there (and it is still public).

michaelchia · 2024-12-19T04:11:23Z

@dlqqq please include me if there is a comms channel on this. I've also been running a hacky patched version of jupyter-ai that has a very similar automatic context feature for the active file and variable info via the active kernel. I would like to see how this will be enabled in v3. I am primarily interested in how extensible and configurable it would be for developers.

For example, in my version, I only extract dataframe schemas from variables and would not include output of cells in the context to mitigate risk of sending sensitive information to the models. I would hope that I would be able to configure it to do something like that or at minimum have the possibility of monkey patching it to do so.
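
To make the request concrete, a schema-only describer could look roughly like the sketch below. The function name is hypothetical and this is not code from any of the versions discussed here. It only relies on the object exposing `dtypes.items()`, which a pandas DataFrame does, and it never reads cell values.

```python
from typing import Any


def describe_schema_only(name: str, frame: Any) -> str:
    """Describe a dataframe-like object by column names and dtypes only.

    No row values are read, so nothing from the data itself reaches the
    prompt. `frame.dtypes.items()` must yield (column, dtype) pairs.
    """
    columns = ", ".join(f"{col} ({dtype})" for col, dtype in frame.dtypes.items())
    return f"{name}: DataFrame with columns {columns}"
```

For a DataFrame `users` with an integer `id` column and a string `email` column, this would yield a single line naming the two columns and their dtypes, without any of the addresses.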

Revamped Jupyter AI

f6d2cec

govinda18 mentioned this pull request Dec 18, 2024

Revamped Jupyter AI deshaw/jupyter-ai#1

Closed

mlucool mentioned this pull request Dec 18, 2024

Pre-proposal: standardize object representations for ai and a protocol to retrieve them jupyter/enhancement-proposals#128

Open
