Kubeflow Notebooks 2.0 (Kubeflow Workspaces) #85

Open
thesuperzapper opened this issue Jun 3, 2023 · 41 comments

@thesuperzapper
Member

thesuperzapper commented Jun 3, 2023

/kind feature

Here is my proposal for Kubeflow Notebooks 2.0, which can be called "Kubeflow Workspaces".

Frequently Asked Questions


What is Notebooks 2.0?

Kubeflow Notebooks 2.0 is the next evolution of Kubeflow Notebooks.
It makes running IDEs on Kubernetes for ML/AI significantly better for users and cluster admins alike.


Where can I find Notebooks 2.0?

We are developing Notebooks 2.0 in kubeflow/notebooks on the notebooks-v2 branch.

Here are direct links to each component's code:


How can I contribute to Notebooks 2.0?

We welcome code contributions and ideas! See our tracking board for what is currently being worked on, then look for unassigned tasks and ask to be assigned to them.

If you don't see any unassigned tasks, take a look at the current state of the code and make some suggestions (by creating issues on the kubeflow/notebooks repo).


When will Notebooks 2.0 be ready?

We hope to ship a widely available alpha version as an optional component of Kubeflow 1.10 (ETA early 2025).

However, we will release a testing version as soon as we have an end-to-end working platform (ETA late 2024). Subscribe to this issue for updates and to help us test.


Motivation

The main idea has always been to make the Notebook CRD not just a wrapper around PodSpec, with the goal of abstracting away the Kubernetes resources from end users, while also giving cluster admins the ability to define a set of "templates" that end users can choose from.

The main benefits of this approach are:

  • end-users can create Workspaces without needing to know anything about Kubernetes, because the UI literally becomes 3 dropdowns and 2 volume mounts:
    • Choose a WorkspaceKind (e.g. "jupyter-lab", "vs-code", "rstudio")
    • Choose an image from the approved list
    • Choose a pod-config (e.g. "small_cpu", "big_gpu")
    • Create/Mount a home volume (optional)
    • Create/Mount data volumes (optional)
  • cluster-admins can update the definitions of WorkspaceKinds without breaking existing Workspaces:
    • they simply add a new image/pod-config option, and redirect the old one to the new one
    • we can even provide a config to make the controller wait for a Workspace to restart before applying the redirect
      (we can also make the Spawner UI display a warning that the Workspace needs to be restarted to get the new config)
  • scheduling Pods with GPU resources often requires several Pod configs to be correctly aligned (e.g. tolerating a taint and setting a resource limit). Collecting all these configs into a single "pod-config" allows cluster-admins to provide drop-down options, rather than requiring users to understand the structure of their Kubernetes cluster.
  • resizing an existing notebook is effectively impossible right now, because all the nice "spawner configs" are only available at the time you create the Notebook. Once it's spawned, all you have is a PodSpec.
    • With WorkspaceKinds, we can make an "edit" button on the UI which allows you to pick from the current "pod-configs" and "images" available for the WorkspaceKind.
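
The redirect idea in the list above can be sketched in a few lines. This is only an illustration under stated assumptions, not the controller's actual logic: the `Option` class, its `redirect_to` field, and the `resolve_option` helper are hypothetical names, and the option ids reuse the examples from the list.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Option:
    """One selectable image or pod-config option (hypothetical shape)."""
    id: str
    # Assumption: an admin retires an option by pointing it at its replacement.
    redirect_to: Optional[str] = None


def resolve_option(options: Dict[str, Option], option_id: str) -> Option:
    """Follow redirects from option_id until an option without a redirect is found."""
    seen = set()
    current = option_id
    while True:
        if current in seen:
            raise ValueError(f"redirect cycle detected at {current!r}")
        seen.add(current)
        if current not in options:
            raise KeyError(f"unknown option {current!r}")
        option = options[current]
        if option.redirect_to is None:
            return option
        current = option.redirect_to


# Example data: "small_cpu" has been replaced by a newer option.
pod_configs = {
    "small_cpu": Option("small_cpu", redirect_to="small_cpu_v2"),
    "small_cpu_v2": Option("small_cpu_v2"),
    "big_gpu": Option("big_gpu"),
}
```

With this shape, a Workspace that still references `small_cpu` resolves to `small_cpu_v2`, and a misconfigured redirect loop is reported as an error rather than followed forever.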

Implementation

For more detailed information about the design, please see:

New Components

  • a new Workspace Controller:
    • Will manage and reconcile the new CRDs
    • Will provide webhooks (for validation of CRD patches)
    • Will be written in GoLang with Kubebuilder
  • a new Workspace Backend API:
    • Will be the interface between the frontend and Kubernetes
    • Will allow programmatic management of Notebooks, and allow easier replacement of frontend.
    • Will be a REST API written in Golang
  • a new Workspace Frontend:
    • Will let users overview the workspaces in a namespace
    • Will let users spawn, edit and connect to Workspaces
    • Will be written in JS (React)

New CRDs

The high-level overview is to split the Notebook CRD into these two CRDs:

  • Workspace (namespaced resource)
    • this is the resource that end-users create via the "Workspace Spawner UI" or kubectl
    • it is NOT a wrapper around PodSpec
  • WorkspaceKind (cluster resource)
    • this is the resource that cluster-admins create
    • it specifies the template for a Workspace (e.g. "JupyterLab", "VSCode", "RStudio")
    • initially, we would only support a "podTemplate" kind, which is very similar to the existing Notebook CRD,
      but in the future, we could support other types of templates (e.g. "helmTemplate")
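
To make the split concrete, here is a rough sketch of how a Workspace could reference a WorkspaceKind and be checked against it. The field names (`images`, `pod_configs`, `kind`, `image`, `pod_config`) and the `validate_workspace` helper are assumptions made for illustration; the real CRD schemas are defined in the design document, not here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WorkspaceKind:
    """Cluster-scoped template written by a cluster admin (hypothetical fields)."""
    name: str
    images: List[str] = field(default_factory=list)
    pod_configs: List[str] = field(default_factory=list)


@dataclass
class Workspace:
    """Namespaced resource written by an end user: it only selects options."""
    name: str
    namespace: str
    kind: str
    image: str
    pod_config: str


def validate_workspace(ws: Workspace, kinds: Dict[str, WorkspaceKind]) -> List[str]:
    """Return a list of error messages; an empty list means the Workspace is valid."""
    kind = kinds.get(ws.kind)
    if kind is None:
        return [f"unknown WorkspaceKind {ws.kind!r}"]
    errors = []
    if ws.image not in kind.images:
        errors.append(f"image {ws.image!r} is not offered by {kind.name!r}")
    if ws.pod_config not in kind.pod_configs:
        errors.append(f"pod-config {ws.pod_config!r} is not offered by {kind.name!r}")
    return errors


# Example data (all values are placeholders).
kinds = {
    "jupyter-lab": WorkspaceKind(
        name="jupyter-lab",
        images=["jupyter-scipy"],
        pod_configs=["small_cpu", "big_gpu"],
    )
}
good = Workspace("my-ws", "team-a", "jupyter-lab", "jupyter-scipy", "small_cpu")
bad = Workspace("my-ws", "team-a", "jupyter-lab", "custom-image", "huge_gpu")
```

A check of this kind is the sort of thing a validating webhook could run, since the Workspace carries only references to options and never a raw PodSpec.
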
@thesuperzapper
Member Author

@kubeflow/wg-notebooks-leads I would appreciate any feedback or ideas on this proposal.

@thesuperzapper
Member Author

@kimwnasptd we (that is, @juliusvonkohout, @tzstoyanov, and me) discussed this in the NB meeting today and are thinking that we can target this for Kubeflow 1.9.

I would love to get your general feedback on this proposal, and if you think it looks ok, then we can discuss some specifics and get a finalized design doc.

I think the general approach would be that we keep "kubeflow notebooks" as a "deprecated" component for 1.9, and add a new component called "kubeflow workspaces", which will have:

  1. A new controller for the Workspace and WorkspaceKind CRDs
  2. A new UI to manage spawning/editing/pausing/etc them
  3. A new "component" section on the docs website

@kimwnasptd
Member

@thesuperzapper @juliusvonkohout @tzstoyanov I'm good with implementing this as a new CRD and web app, since this will be the least intrusive implementation. Users will be able to pick and choose which one they want, until we are fully confident in using Workspaces by default.

What I'd like to confirm before committing to this effort is: what are the criteria for considering this as the default?
  • Have culling?
  • Have a web app that allows us to have the same functionality as the current one?

@juliusvonkohout
Member

@mishraprafful and kubeflow/kubeflow#6734

@kabartay

kabartay commented Sep 5, 2023

Thanks for these new changes. Looks good.

@thesuperzapper
Member Author

Calling anyone who wants to help move this feature forward!

Please reply if you are willing/able to help with any of the following areas:

  • Frontend (JS - Angular) (Python - Flask):
    • We will need to add 2 new crud-web-apps, to manage and view the new resources.
      • The Workspace app will behave similarly to the existing Notebooks one, and will mostly be about spawning and managing the state of Workspace instances in each profile namespace.
      • The WorkspaceKind app can start with the goal of only showing the current WorkspaceKinds registered in the cluster, and in the future can allow admins to change them graphically.
  • Backend (Golang - Kubebuilder) (Istio):
    • We will need to build a new controller for the Workspace and WorkspaceKind resources.
      • We should use Kubebuilder like we already do in the notebook-controller
      • The Workspace controller will also need to create the Istio VirtualServices/etc to enable access to the notebook web interfaces.
    • We will also probably need to create Webhooks to achieve some of what we are doing, which can also be created using Kubebuilder.
  • Docs (Markdown and Images):
    • Docs are just as important as the code! Otherwise, people won't know about the awesome new features!
    • We will need to add a new "component" section of the Kubeflow website for "Kubeflow Workspace".

Once we have a few people willing to help, we can organize a specific call to flesh out a doc with:

  • User Stories
  • Controller Architecture/Behaviour
  • UI Mockups
  • Alternatives / Problems

@mishraprafful and @guohaoyu110 mentioned that they might be interested in contributing to this. There are also some others who might be, like @punkerpunker @helloericsf @wjhhuizi @wadhah101.

@guohaoyu110

  • Backend (Golang - Kubebuilder) (Istio)

I am interested in this part: Backend (Golang - Kubebuilder) (Istio)

@thesuperzapper
Member Author

Also, @vvnpan @apo-ger may be interested in helping this effort, given they have raised large PRs on the old Notebooks.

@mishraprafful

I am interested in taking up any of the following areas:

  • Frontend (Python - Flask)
  • Backend (Golang - Kubebuilder) (Istio)
  • Docs (Markdown and Images)

@SachinVarghese

I'm happy to collaborate on creating the backend+docs (workspace controller) for this new proposal

@vikas-saxena02

I am happy to contribute to the documentation if you guys need help with that.

@thesuperzapper
Member Author

Hey all, now that Kubeflow 1.8 is out, we are going to start seriously working on Kubeflow Notebooks 2.0!

The first step will be to organize regular meetings with the people who want to contribute and/or give feedback.

I think the first few meetings will be to finalize the design and requirements. Then next year, we can split everything down into tasks and assign them to specific people, so the meeting just becomes a check-in.

Join the next community call (12 December 2023 @ 8:00am PT) so we can organize the first few meetings.

Once we have organized the meeting we'll send it out on the Kubeflow-discuss mailing list.

Also please join the Kubeflow slack and the #kubeflow-notebooks channel which we can use for discussing ideas.


Also just some notes so that I don't forget:

  • we need to discuss configurable service account bindings for the pod (rather than just giving default-editor to all notebooks); at the very least, we should allow the admin to set all notebooks in a specific profile to viewer only
  • we need to discuss if we want to have more granular assignments of access to specific notebooks for specific users (rather than just everyone in a profile can connect to all of the notebooks)

@thesuperzapper
Member Author

For those wanting to help with Kubeflow Notebooks 2.0, please see the previous comment:

Please try and get to the next kubeflow community call, so we can organize a meeting time.

Also please join the kubeflow slack (especially if you can't attend the community call), and make yourself known on the #kubeflow-notebooks channel, so we can find a good time that works for the "Notebooks 2.0 catchup" meetings.

@SachinVarghese
@Talador12
@apo-ger
@guohaoyu110
@helloericsf
@juliusvonkohout
@kimwnasptd
@mishraprafful
@punkerpunker
@tzstoyanov
@vikas-saxena02
@vvnpan
@wadhah101
@wjhhuizi

@vikas-saxena02

@thesuperzapper I am more than happy to be part of the initiative. I will try my best to attend the community call tonight.

@StefanoFioravanzo
Member

@thesuperzapper I have a high-level observation on the terminology: why did you choose the name "Workspace"? I think the proposed rename to "Workbench" was well received and agreed upon here:

The term "workspace" is generally used to refer to a project or a logical space that contains resources and user data. In Kubeflow, I believe the closest thing to a workspace we have is the namespace.

@umka1332

umka1332 commented Jan 7, 2024

I find this proposal very bad, since you are removing options that users had before. Now they will be limited to configurations that were created by admins instead of having the ability to create the thing that they need.
It's like adding complexity and removing flexibility at the same time.
It would be much better to have some presets from which users can choose, while still being able to specify all needed values themselves.

@thesuperzapper
Member Author

Hey all,

Please attend the Notebooks Working Group Meeting (Thursday, 18 January, 2024 @ 8:00am PT) if you want to help with Kubeflow Notebooks 2.0!

This will be the first meeting, and our goal will be to start an architecture doc and discuss ideas.


I think the overall steps to get Notebooks 2.0 shipped are:

  • Step 1: organize a "Kubeflow Notebooks 2.0" catchup meeting.
  • Step 2: finalize the design (by discussing requirements and implementation)
  • Step 3: split the design up into tasks, and allocate people
  • Step 4: continue regular catch-ups (to monitor progress)
  • Step 5: ship as part of Kubeflow 1.9

@SachinVarghese
@Talador12
@apo-ger
@guohaoyu110
@helloericsf
@juliusvonkohout
@kimwnasptd
@mishraprafful
@punkerpunker
@tzstoyanov
@vikas-saxena02
@vvnpan
@wadhah101
@wjhhuizi
etc.

@thesuperzapper
Member Author

thesuperzapper commented Jan 18, 2024

Hey all,

We have started writing up the design document for Kubeflow Notebooks 2.0.
Please review it and feel free to make comments/suggestions.

The next meeting is Thursday 25th January @ 8 AM PT / Thursday 25th January @ 4 PM PT,
where we will continue finalizing the document and start assigning tasks to contributors.

PS: we will use the Notebooks WG meeting. To get the invite, either join kubeflow-discuss (if you use Google Calendar) or manually add the "Kubeflow Community Calendar" to yours.

@juliusvonkohout
Member

@thesuperzapper Thursday 25th January @ 8 AM PT is the Manifests WG meeting. Are you sure you do not mean February 1?

@vikas-saxena02

vikas-saxena02 commented Jan 19, 2024 via email

@thesuperzapper
Member Author

@juliusvonkohout I did not see the Manifests WG meeting at that time!


Currently, the Notebooks WG Meeting only runs every two weeks; we need to make it weekly while we work on Kubeflow Notebooks 2.0.

I think we can add a new bi-weekly 4:00PM PT meeting starting next week, which would be aimed at US + APAC attendees (but impossible for EU attendees). It gives everyone a chance to attend and does not overlap with the existing Manifests WG meeting.

I have made a PR that will update the community calendar to reflect this:

@thesuperzapper
Member Author

@thesuperzapper Have you finalized whether to use GoLang or Python? Also, have you decided between using React or Angular?

@WYGIN, for the front end, we have a number of people more familiar with React who are planning to contribute, so unless that changes, we probably will go with React.

It's still not clear which is going to be better (Go or Python) for the back-end of the front-end (not the controller, which will be in Go + Kubebuilder).

What we choose will largely depend on who's available to work on it. However, because it's much easier to interact with Kubernetes via Go, and Go is a lot more scalable, I do have a slight preference for Go, if both options have contributors available.

@juliusvonkohout
Member

@thesuperzapper

I saw https://github.com/thesuperzapper/kubeflow-notebooks-v2-design/blob/667275bdbf62e6f8a3af73ea302724f8430cad22/crds/workspace-kind.yaml#L83

but I think, as discussed in the meetings and documents, something additional like maxSeconds: 2*86400 or so would be useful in a lot of enterprise environments, where we want to prevent Workspaces from being abused for long-running jobs and terminate them, independent of activity, after 48 hours or so.
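
A hard lifetime limit of this kind can be sketched as a simple age check that ignores activity entirely. The `maxSeconds` name comes from the suggestion above and is not a confirmed field; the `exceeds_max_lifetime` helper and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Suggested limit from the comment: 2 * 86400 seconds = 48 hours.
MAX_SECONDS = 2 * 86400


def exceeds_max_lifetime(started_at: datetime, now: datetime, max_seconds: int) -> bool:
    """Return True once the Workspace has been running for max_seconds or longer.

    Activity is deliberately not considered: a busy Workspace is still
    stopped when it reaches the limit.
    """
    return (now - started_at) >= timedelta(seconds=max_seconds)


# Example start time (placeholder value).
started = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
```

A check like this would sit alongside the activity-based culling rather than replace it, since the two answer different questions: "is anyone using it?" versus "has it run too long?".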

@thesuperzapper
Member Author

Just tagging @apo-ger based on the discussion in kubeflow/kubeflow#6927 (comment), we might consider adding a third activity probe based on "istio metrics" (e.g. the last time the Pod got an HTTP request).

Note, this will not be in the initial release of Notebooks 2.0 (unless you want to contribute it @apo-ger), but might be in a future one, once we agree on the implementation details.

@krishnakaushik195

Hi, this is Kaushik.
I have a small idea for future development:
how about adding an integrated LLM or a multimodal AI?

@krishnakaushik195

Hello team,
I want to contribute to the Flask + React and LLM work.
Can someone tell me how to start contributing?

@thesuperzapper
Member Author

@krishnakaushik195 we are currently developing Notebooks 2.0 in the notebooks-v2 branch of the kubeflow/notebooks repo.

Please be aware, it's NOT ready for usage yet (we are still actively developing it). If you want to help, you should attend the Notebooks Working Group (WG) meetings to know what we are actively working on:

Get invited to the Notebooks WG meeting (on Google Calendar) by joining the kubeflow-discuss Google Group. (NOTE: this will also invite you to many other community events, which you can decline or attend)

@krishnakaushik195

Yeah sir got it
Thank you

@andreyvelich
Member

/transfer notebooks


@thesuperzapper: The label(s) kind/feature cannot be applied, because the repository doesn't have them.


@shalberd

Hi all, is there a way in the design currently in the new kind: WorkSpaceKind or whatever to define common labels for pods centrally? podSpec.metadata.labels .... added for all pods ever spawned? See my request at kubeflow/community#800

@thesuperzapper
Member Author

Hi all, is there a way in the design currently in the new kind: WorkSpaceKind or whatever to define common labels for pods centrally? podSpec.metadata.labels .... added for all pods ever spawned? See my request at kubeflow/community#800

@shalberd yes, it's possible to define common labels/annotations for all pods of a WorkspaceKind in Notebooks 2.0.

This is achieved by setting the following fields:

apiVersion: kubeflow.org/v1beta1
kind: WorkspaceKind
metadata:
  name: jupyterlab
spec:
  ...

  podTemplate:

    ## metadata for Workspace Pods (MUTABLE)
    ##
    podMetadata:
      labels:
        my-workspace-kind-label: "my-value"
      annotations:
        my-workspace-kind-annotation: "my-value"

  ...
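
As a rough illustration of what centrally defined labels could mean when a Pod is built, the sketch below merges the kind-level labels from the YAML above with labels set by the controller. The `build_pod_labels` helper, the `workspace-name` label, and the rule that controller labels win on conflict are all assumptions, not behaviour confirmed in this thread.

```python
from typing import Dict


def build_pod_labels(kind_labels: Dict[str, str],
                     controller_labels: Dict[str, str]) -> Dict[str, str]:
    """Combine WorkspaceKind labels with controller-managed labels.

    Assumption: on a key conflict the controller-managed value wins,
    so an admin-defined label cannot overwrite a reserved one.
    """
    merged = dict(kind_labels)        # copy, so the inputs are not mutated
    merged.update(controller_labels)  # controller values take precedence
    return merged


# Labels taken from the podMetadata example above.
kind_labels = {"my-workspace-kind-label": "my-value"}
# Hypothetical label the controller might add to every Workspace Pod.
controller_labels = {"workspace-name": "my-workspace"}

labels = build_pod_labels(kind_labels, controller_labels)
```

The same merge would apply to annotations, which the YAML above defines in the same `podMetadata` block.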

Projects
Status: In Progress