Proposal: Restore Notebook execution progress when a browser page is reloaded #1274
Comments
Thank you @skukhtichev for the great demo during JupyterCon and for opening the discussion to upstream the work you have been doing. It would be great to have a chat during one of our dev calls. When would you be able to join? cc/ @Zsailer @kevin-bates |
Haules |
@echarles I will be happy to discuss the proposal during the dev call. After @davidbrochart's presentation about Jupyter Server at JupyterCon, I realized that the proposal needs to be adapted to the latest Jupyter Server changes (authorization, updated kernel web socket handler, etc.). I want to dive deeper into the server's code and update the proposal. Will the May 25th Jupyter Server call be appropriate to discuss the proposal? |
Doesn't the general principle remain the same? Authentication and authorization should not come into the picture, should they? The new kernel websocket handler is meant to be more easily extensible, so I guess it should not impact the validity of your proposal.
Sounds good. I will join and we will chat with the people online. |
@skukhtichev, @echarles - I've gone ahead and added this to Thursday's agenda. See you there! |
@kevin-bates I think @skukhtichev was mentioning 25th (next week), not 18th (this week) |
@echarles yes, @kevin-bates could we reschedule the discussion to next week (May 25th)? |
Yes, the principle remains the same. There is a limitation with establishing websocket sessions between the browser and Jupyter Server: only one session can be established per user. If the user opens the same notebook in a new browser tab, then the previous web socket session is closed. Currently the same applies to other users who are connecting to the same kernel. The authorization logic sets user-specific cookies, so it should be possible to distinguish between users connecting to the same kernel. Implementing this approach will allow support for multiple users and will not interfere with the collaboration feature. |
I'm sorry I missed that. I've updated the agenda such that this discussion is slated for next week (May 25). |
I would also like to join this conversation (will attend the discussion next week). I have worked on a similar concept and would like to collaborate on this effort. |
Hi @parul100495 - thank you for sharing your interest in helping! See you next Thursday. (For the sake of others, the Server/Kernels team meeting is open to anyone - all are welcome - no participation required.) |
PS: unfortunately I won't be able to join this week's call. Excited to see any progress on this feature. |
Hi all, I recently discovered this proposal and am interested in learning more about how I could help. I work in a plant breeding lab that collaborates frequently with international researchers. These individuals don't always have the most reliable internet connection, and so oftentimes face the frustration of losing their work when in-progress cells are "canceled" due to a network interruption. The way I read @skukhtichev's proposal, it ought to also cover situations like these (i.e. a user reconnects to their notebook after being disconnected for some time, and is able to see their code cells continuing to run and not lose data). I've talked briefly with @davidbrochart on this issue related to the new kernels REST API, as I believe it would also be a potential solution to this problem. What would be the best avenue for a volunteer to dedicate time to assisting with this? I saw that this proposal was discussed at last week's server meeting. |
Hey folks, Since there appears to be interest from multiple people representing multiple organizations, I propose we try to meet regularly over the next couple of months to discuss this topic to 1) collect implementation ideas 2) develop a plan towards an open-source solution 3) coordinate who might be able to work on this. At last week's server meeting, I proposed we reserve the last 20 minutes of the Jupyter Server/Kernels Meeting (8am Pacific) to discuss this topic. Would folks on this thread be able to stop by regularly for those 20 minutes? Give this comment a 👍 to signal that you'd like to join. We'll post notes from those discussions here. Thanks! |
@Zsailer Just to double-check, will those 20 minute blocks begin at tomorrow's meeting, or next week's? |
In today's Jupyter Server call, we discussed this issue extensively. I'll do my best to summarize the discussion below based on some notes I took. User story: As a user, I'd like to run a cell (or series of cells) in a notebook, close JupyterLab before the cells finish. When I come back, I'd expect to see
Today's UX: When the user closes/refreshes JupyterLab,
Multiple solutions have been proposed over the past few weeks. Here's my attempt at summarizing below. There are three layers to the problem:
To begin building a solution, we proposed using a new, separate repo in jupyter-server org that
Question: Why can't we solve this problem with just the kernel replay and replay all messages to JupyterLab when it reconnects? Answer: I (Zach) believe you can. If the messages are timestamped, you should be able to just replay all messages from the last time the user was connected and the client should collapse this to the current state of the notebook. There are advantages to making the server more notebook document aware though. We can elaborate on this more in a separate thread. Aside comment to keep in mind: kernel gateway / enterprise gateway add an additional passthrough layer where kernel messages can be missed and we need replay options. How do Jupyter's RTC efforts play into this? Y-docs offer a solution for rebuilding a notebook model server-side from a log of patches/diffs/messages. Maybe we can leverage this machinery to store and resolve (2) and (3) once we have a message replay system in place? |
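A minimal sketch of the timestamped replay idea described in the answer above. It is an illustration under stated assumptions, not Jupyter Server code: `ReplayLog`, `collapse_streams`, and the `stream` helper are hypothetical names, and only `stream` messages are folded here. The message fields (`msg_type`, `parent_header.msg_id`, `content.text`) follow the Jupyter messaging protocol.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class ReplayLog:
    """Hypothetical server-side log of timestamped kernel messages."""

    entries: list[tuple[float, dict[str, Any]]] = field(default_factory=list)

    def record(self, timestamp: float, msg: dict[str, Any]) -> None:
        self.entries.append((timestamp, msg))

    def replay_since(self, last_seen: float) -> list[dict[str, Any]]:
        """Return messages recorded strictly after the client's last timestamp."""
        return [msg for ts, msg in self.entries if ts > last_seen]


def collapse_streams(messages: list[dict[str, Any]]) -> dict[str, str]:
    """Fold replayed stream messages into one text blob per originating request."""
    text_by_request: dict[str, str] = {}
    for msg in messages:
        if msg["msg_type"] != "stream":
            continue  # other message types are ignored in this sketch
        request_id = msg["parent_header"]["msg_id"]
        text_by_request[request_id] = (
            text_by_request.get(request_id, "") + msg["content"]["text"]
        )
    return text_by_request


def stream(request_id: str, text: str) -> dict[str, Any]:
    """Build a stream message shaped like the Jupyter protocol (illustrative)."""
    return {
        "msg_type": "stream",
        "parent_header": {"msg_id": request_id},
        "content": {"name": "stdout", "text": text},
    }


log = ReplayLog()
log.record(1.0, stream("req-1", "step 1\n"))  # delivered before the disconnect
log.record(2.0, stream("req-1", "step 2\n"))  # missed while disconnected
log.record(3.0, stream("req-1", "step 3\n"))  # missed while disconnected

missed = log.replay_since(last_seen=1.0)
restored = collapse_streams(missed)
```

The client only needs to remember the timestamp of the last message it processed. Whether the client can map `req-1` back to a cell after a reload is a separate question, which is why the thread also discusses keeping notebook state on the server.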
I would like to relay here a comment from the meeting notes about an important point that is missing in the above summary (thanks for it Zach): the state of the kernel waiting for user input (e.g. a Python code using |
I think that the issue with |
Hey folks, I'm going to be out-of-the-office next week and will miss the Jupyter Server meeting (6/29). If folks want to still meet and discuss, please feel free to do so! Otherwise, I'll be back for the next meeting on 7/6. Until then, I'll work on setting up a new repo and drafting a loose roadmap of the work ahead of us. Cheers everyone! |
Very glad to hear that! State retention of the JupyterLab front-end and the ipynb file itself is a major issue due to messaging delays and network effects, and will significantly impact the user experience if Jupyter Server is running in the cloud (this is the main problem I had before: JupyterLab always prompts whether to overwrite the file or not). Based on this, I'm in favor of @davidbrochart's comment and jupyverse's solution: use I've put jupyter_kernel_executor on hold for now due to a change in focus at work. In this plugin, the user can execute the code through the HTTP interface, and the Jupyter server's handler parses the zmq results and writes them to the file, using a form that converts http to a localhost websocket connection (we can even connect to another Jupyter server, i.e., a remote kernel). I regard this feature as a port of JupyterLab's ability to execute code and write to a file to Jupyter Server, triggered by a button. The input is file+kernel+cellid and the process is that Jupyter Server establishes a websocket connection and writes the result of the code run to the file continuously. JupyterLab will get the updates through Jupyter's RTC feature. EDIT: This picture simply illustrates my thoughts. Hope this helps. Thank you all. |
I made some progress towards restoring notebook state, using jupyverse/jpterm: Peek.2023-11-09.11-29.mp4 |
Forgive me for not following the development of the project for so long. This is exciting. 👏 Is it because you open another client for writing (as collaborative)? Or have we implemented Jupyter Server to write directly to files or replay messages? (And replaying the message doesn't solve the problem of the output not being saved, it just offers the possibility of a delayed save) UPDATE:
It seems to go further with my comment above that using |
And here is a demo showing full notebook state recovery, including widgets: Peek.2023-11-10.11-29.mp4 |
Having a ydoc in the kernel is great, because this will work with kernel gateway and enterprise gateway too |
Curious to know what's the current status of this? Especially now that jupyterlab/jupyter-collaboration#279 is merged? Thanks! |
Restore Notebook execution progress when a browser page is reloaded
Problem
Jupyter Notebook/Lab does not restore execution progress after the page is reloaded. As a result, there is no way to monitor execution progress or retrieve execution output for long-running notebooks.
This happens for the following reasons:
Notebook/Lab UI generates a new session id every time a notebook/lab page is reloaded.
Jupyter Server supports replaying kernel messages to the client after the kernel session is reconnected. The replay is based on the session id set by the client while connecting to the kernel. Because the Jupyter Notebook/Lab UI generates a new session id every time the notebook page is loaded, it cannot reconnect to the existing kernel session, so buffered messages are never replayed from Jupyter Server after the notebook page is re-opened.
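The effect of a regenerated session id can be illustrated with a toy model. This is a simplified sketch, not the actual Jupyter Server implementation: `KernelBuffer` and its methods are hypothetical, and the stable key built from a user name and notebook path is an assumption based on the idea proposed later in this issue.

```python
import uuid


class KernelBuffer:
    """Toy model of per-kernel message buffering tied to a session key."""

    def __init__(self):
        # kernel_id -> (session_key, buffered messages)
        self._buffers = {}

    def start_buffering(self, kernel_id, session_key):
        """Called when the client disconnects while the kernel keeps running."""
        self._buffers[kernel_id] = (session_key, [])

    def buffer(self, kernel_id, msg):
        if kernel_id in self._buffers:
            self._buffers[kernel_id][1].append(msg)

    def reconnect(self, kernel_id, session_key):
        """Return buffered messages only if the session key matches; else drop them."""
        stored_key, messages = self._buffers.pop(kernel_id, (None, []))
        return messages if stored_key == session_key else []


buffers = KernelBuffer()

# Page reload today: the UI generates a fresh session id, so nothing is replayed.
old_session = uuid.uuid4().hex
buffers.start_buffering("kernel-1", old_session)
buffers.buffer("kernel-1", {"msg_type": "stream", "content": {"text": "epoch 1\n"}})
new_session = uuid.uuid4().hex
lost = buffers.reconnect("kernel-1", new_session)

# With a key that survives reloads (assumed here: user plus notebook path),
# the same buffered messages would be handed back to the client.
stable_key = "alice:analysis.ipynb"
buffers.start_buffering("kernel-1", stable_key)
buffers.buffer("kernel-1", {"msg_type": "stream", "content": {"text": "epoch 2\n"}})
replayed = buffers.reconnect("kernel-1", stable_key)
```

In the first case `lost` is empty because the keys differ. In the second case `replayed` contains the buffered message.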
Message IDs for submitted code are not persisted in cells metadata
Kernel message ids are not persisted in the notebook metadata and are discarded when the notebook window or tab is reloaded. This means the Jupyter frontend code cannot link kernel messages to cells, so there is no way to show the output or the execution count.
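To make the linkage concrete: in the Jupyter messaging protocol, every reply and output message carries a `parent_header` whose `msg_id` equals the id of the execute request that caused it. The sketch below uses a hypothetical `FrontendRouter` class (not JupyterLab code) to show why an in-memory mapping that is lost on reload leaves incoming messages without a target cell.

```python
class FrontendRouter:
    """Hypothetical in-browser state: which request id belongs to which cell."""

    def __init__(self):
        self.pending = {}  # msg_id of execute_request -> cell id

    def submit(self, cell_id, msg_id):
        """Remember the request id when a cell is sent for execution."""
        self.pending[msg_id] = cell_id

    def route(self, msg):
        """Return the cell that should display this message, or None if unknown."""
        parent_id = msg.get("parent_header", {}).get("msg_id")
        return self.pending.get(parent_id)


output_msg = {
    "msg_type": "stream",
    "parent_header": {"msg_id": "req-7"},
    "content": {"name": "stdout", "text": "42\n"},
}

# Before the reload the mapping exists, so the output reaches its cell.
router = FrontendRouter()
router.submit(cell_id="cell-a", msg_id="req-7")
target_before = router.route(output_msg)

# After the reload the frontend starts with an empty mapping,
# so the same kernel message can no longer be attributed to any cell.
router_after_reload = FrontendRouter()
target_after = router_after_reload.route(output_msg)
```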
Unsaved cell’s output is missing after a notebook page is reloaded
If the kernel message with the execution result is sent to the browser but the notebook is not saved, then the output will not be displayed after the page is reloaded.
Proposed Solution
The proposal is to enable restoring sessions for kernel connections, and to move the Notebook model (cells metadata) to Jupyter Server and synchronize changes triggered by the Notebook/Lab UI and the kernel.
Restore kernel sessions
Make a kernel connection independent from the session id provided in the web socket session_id URL argument. A new session id is generated every time the notebook is reloaded. In this case buffered kernel messages will not be replayed after the Notebook page/tab is re-opened. JupyterLab and Notebook 7 support collaboration mode, which allows you to differentiate between notebook users. The user info and the notebook path could be used as a session identifier and mapped to the kernel id.
Move a Notebook model (cells metadata) to Jupyter Server and synchronize it with a kernel and a notebook opened in UI
Storing a notebook model on Jupyter Server will allow:
Implementation notes for enabling the Notebook model on Jupyter Server:
The messages are sent via the kernel's web socket connection with a new message type (e.g. nb_state). Then ZMQChannelsHandler parses incoming messages and forwards notebook state related messages to NotebooksStatesManager.
NotebooksStatesManager is responsible for synchronizing the notebook state between the UI and Jupyter Server. It will also handle kernel messages.
msg_id) is returned to the client (Browser). The client tracks execution progress based on the message id. Currently the message id is stored in the browser and should not be persisted in the notebook ipynb file because it is relevant only at runtime. Since each cell has a unique id, it is possible to map the message id to the cell id, store a message id for each submitted cell on the Jupyter Server, and return it to the client when the notebook is reloaded. The client will then be able to handle incoming kernel messages and display execution progress. The message id is not saved in the ipynb file and is available only at runtime.
Additional context
The image below shows components and data flow for execution restore logic:
ContentsManager creates a copy of a Notebook model on Jupyter Server and sends it to the client (Notebook/Lab UI)
ZMQChannelsHandler parses messages by the type (e.g. nb_state type) and forwards messages related to state changes to the NotebooksStatesManager.
NotebooksStatesManager updates the notebook model stored on the server
ZMQChannelsHandler forwards message to the NotebooksStatesManager and to the Jupyter Notebook/Lab UI
ContentsManager loads a notebook file from the storage (file system, cloud storage, etc.) and merges it (including message ids for submitted code cells execution) with the notebook model stored on Notebook Server. It allows to identify which cells are in executing state and restore execution progress.
message ids from the notebook model which should be saved in the file and saves ipynb file in the storage (msg_id parameter still exists in the Notebook model stored on Jupyter Server).
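The merge-on-load and strip-on-save steps above could be sketched roughly as follows. This is a hedged illustration, not the proposed implementation: `NotebookStateStore` is a hypothetical stand-in for the NotebooksStatesManager named in the proposal, and storing the id under `cell["metadata"]["msg_id"]` is an assumption made for the example.

```python
import copy


class NotebookStateStore:
    """Hypothetical server-side record of which cells are currently executing."""

    def __init__(self):
        self._executing = {}  # (notebook path, cell id) -> msg_id

    def record_execution(self, path, cell_id, msg_id):
        """Remember the request id when a cell is submitted to the kernel."""
        self._executing[(path, cell_id)] = msg_id

    def finish_execution(self, path, cell_id):
        """Forget the request id once the kernel reports the cell as done."""
        self._executing.pop((path, cell_id), None)

    def merge(self, path, notebook):
        """Return a copy of the loaded notebook with runtime msg_ids attached."""
        merged = copy.deepcopy(notebook)
        for cell in merged["cells"]:
            msg_id = self._executing.get((path, cell.get("id")))
            if msg_id is not None:
                # Assumed location for the runtime-only id.
                cell.setdefault("metadata", {})["msg_id"] = msg_id
        return merged


def strip_msg_ids(notebook):
    """Return a copy that is safe to write to the ipynb file (no msg_ids)."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned["cells"]:
        cell.get("metadata", {}).pop("msg_id", None)
    return cleaned


on_disk = {
    "cells": [
        {"id": "a1", "cell_type": "code", "source": "train()", "metadata": {}, "outputs": []},
        {"id": "b2", "cell_type": "code", "source": "print(1)", "metadata": {}, "outputs": []},
    ]
}

store = NotebookStateStore()
store.record_execution("analysis.ipynb", "a1", "msg-42")

# Page reload: the client receives the file content plus runtime execution state.
reloaded = store.merge("analysis.ipynb", on_disk)

# Save: runtime-only ids are removed before the file is written.
to_save = strip_msg_ids(reloaded)
```

After a reload, the client can tell from `reloaded` that cell `a1` is still executing and which `msg_id` to listen for, while `to_save` matches the on-disk content without any runtime ids.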