General overview of Kernel Discovery & Execution in Jupyter (& extension)
- Kernel Specs
- Finding/listing Kernelspecs
- Launching/Starting Kernels
- Jupyter Extension
- Notes
- Raw vs Jupyter
Terminology used.
A kernel spec is a JSON object that contains the information used to start a kernel process.
Here's a sample kernel spec file:
{
"argv": ["python3", "-m", "ipykernel_launcher",
"-f", "{connection_file}"],
"display_name": "Python 3",
"language": "python"
}
The instructions necessary to run a kernel process are as simple as: run the python3 executable with the arguments -m ipykernel_launcher -f {connection_file}.
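As a rough illustration of what "running the argv" involves, the sketch below loads the sample kernel spec and fills in the {connection_file} placeholder with a real path. The helper name `build_command` and the connection file path are hypothetical; this is not the extension's actual code.

```python
import json

KERNEL_SPEC_JSON = """
{
  "argv": ["python3", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "Python 3",
  "language": "python"
}
"""


def build_command(kernel_spec, connection_file):
    """Return the argv with the {connection_file} placeholder filled in."""
    return [
        arg.replace("{connection_file}", connection_file)
        for arg in kernel_spec["argv"]
    ]


spec = json.loads(KERNEL_SPEC_JSON)
# Hypothetical path; a real client creates this file before launching the kernel.
command = build_command(spec, "/tmp/kernel-demo.json")
print(command)
```

The resulting list is what would be handed to the operating system to start the kernel process.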
The kernelspec specifies only one language. That doesn't mean a kernel cannot support multiple languages, but the Jupyter extension will only be able to use the single specified language.
The runtime and the kernel language need not be the same. E.g. you can build a kernel that supports PowerShell in .NET as well as in Python. Thus, you can have a kernel that requires Python but runs PowerShell in the notebook cells. Similarly, there are other kernels written in Python that support non-Python languages.
- Global Kernelspecs
- Python Environment specific Kernelspecs
- Remote Kernel specs
- Listing Python Environments as Kernels
Kernel specs are located in specific folders (see here https://jupyter-client.readthedocs.io/en/stable/kernels.html#kernel-specs). Let's call these global kernels. To find these kernels we don't need the Python runtime at all.
For instance, on a Mac, if you place the following kernel in the folder ~/Library/Jupyter/kernels/dummyName, it will show up in Jupyter even if the runtime doesn't exist.
Here's the sample kernelspec:
{
"argv": ["bogus", "-f", "{connection_file}"],
"display_name": "Hello World",
"language": "made_up"
}
Thus, registering kernels with Jupyter is super easy: Jupyter looks for kernels on the file system in very specific locations.
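Because discovery is only a file-system scan, it can be sketched without any kernel runtime installed. The function below is a minimal illustration, assuming each kernel lives in its own sub-folder containing a kernel.json file. The name `list_kernelspecs` is hypothetical, and the temporary directory stands in for a real location such as ~/Library/Jupyter/kernels.

```python
import json
import tempfile
from pathlib import Path


def list_kernelspecs(kernels_dir):
    """Map each kernel folder name to the parsed contents of its kernel.json."""
    specs = {}
    if not kernels_dir.is_dir():
        return specs
    for child in sorted(kernels_dir.iterdir()):
        spec_file = child / "kernel.json"
        if spec_file.is_file():
            specs[child.name] = json.loads(spec_file.read_text(encoding="utf-8"))
    return specs


# Demo: register the bogus kernel in a throwaway folder, then list it.
with tempfile.TemporaryDirectory() as tmp:
    kernel_dir = Path(tmp) / "dummyName"
    kernel_dir.mkdir()
    (kernel_dir / "kernel.json").write_text(
        json.dumps({
            "argv": ["bogus", "-f", "{connection_file}"],
            "display_name": "Hello World",
            "language": "made_up",
        }),
        encoding="utf-8",
    )
    found = list_kernelspecs(Path(tmp))

print(found["dummyName"]["display_name"])
```

Nothing here checks that the `bogus` executable exists, which matches the behaviour described above: listing a kernel and being able to start it are separate concerns.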
Along with the static locations for global kernel specs, it's possible to have kernel specs that are specific to each Python environment.
Assume you have a Python environment PythonA; you can have kernels in the following location: <PythonA Environment Location>/share/jupyter/kernels.
Similarly, if you had an environment named PythonB, then you'd have kernels in: <PythonB Environment Location>/share/jupyter/kernels.
By default, you'll always find a Python3 kernelspec under the above folder. This kernel spec merely launches the corresponding Python environment as a kernel.
Examples of such non-python kernels
- BeakerX has plenty (see here https://github.com/twosigma/beakerx)
- Java, SQL, Scala, etc
When connecting to a remote Jupyter server, we can start kernels on the remote server by using the kernel specs available on the remote machine. This is done via the Jupyter Lab API exposed by the @jupyterlab/services package.
The kernel is started using the APIs provided by the same package.
Note: This is very specific to the Jupyter extension
Assume a user has Python but not Jupyter installed, and would like to run some code against the Python kernel. In this situation it isn't necessary to register a Python kernel just to start one. Instead we create the kernel specification but don't persist it to disk. When it comes to starting the kernel, we start the Python process the same way as we start a regular Python kernel that has a kernelspec JSON on disk. As mentioned earlier, all we need to start a kernel is the CLI to start a process, and we know how to start Python kernels, hence the JSON file doesn't need to exist on disk.
Exception: There is one exception to this. In some rare cases we do persist the JSON file to disk, install Jupyter on the user's machine, and get the Jupyter application to start the kernel (just as we get remote Jupyter servers to start a kernel).
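A minimal sketch of the non-persisted approach, assuming the same ipykernel_launcher argv shown in the sample kernel spec at the top of this page. The function name `kernelspec_for_interpreter` is hypothetical and the extension may build this object differently; the point is only that nothing is written to disk.

```python
import sys


def kernelspec_for_interpreter(python_path, display_name):
    """Build a kernel spec for an interpreter without writing kernel.json."""
    return {
        "argv": [python_path, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }


# Use the interpreter running this script as the example environment.
spec = kernelspec_for_interpreter(sys.executable, "Python (current interpreter)")
print(spec["argv"])
```

Starting such a kernel still requires the ipykernel package in that environment, since the argv runs ipykernel_launcher.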
- Each kernel opens 5 ports
- The CLI used to start a kernel is found in the argv of the kernelspec
- Generally starting the process using the argv is sufficient.
- Python kernels, such as those belonging to a Conda environment, must be started by first activating the Conda environment.
  - Assume we have a kernel named Python3 under a Conda environment with the argv as ['python', '-m', 'ipykernel_launcher', ....
    - This should be started by first activating the Conda environment and then starting that kernel within the context of that Conda environment.
    - I.e. activate the Conda environment, then scrape the environment variables for the process. Next, start the kernel using those environment variables.
    - This is required so that the Python kernel inherits the same environment variables available when you launch python after activating the Conda environment.
    - If this isn't done, then the majority of the time Python packages will not work (they cannot be loaded as paths have not been set up properly, etc.)
    - Summary - For Conda environments, we must first activate the Conda environment before launching the kernel.
- Some non-Python kernels, such as the Java kernel from beakerx, are located within the Python environment
  - The argv for the Java kernel is ['java', '-cp', ...]
    - If we attempt to run this, it will not work, as the java executable is not in a globally accessible location.
    - When installing this kernel, the java executable is stored in the Conda environment.
    - To get to this path, the Conda environment needs to be activated.
    - Hence running non-Python kernels created in a Conda environment also requires activation of the Conda environment.
- Remote kernels are started using the Jupyter Lab API exposed by the @jupyterlab/services package.
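The "activate, scrape the environment variables, then launch" sequence from the notes above can be sketched roughly as follows. This is an illustration under several assumptions, not the extension's implementation: it assumes a POSIX `sh`, the helper names `capture_environment` and `launch_with_environment` are hypothetical, and the activation command is supplied by the caller because the exact command depends on how Conda is set up. The demo uses a plain `export` in place of a real activation.

```python
import json
import shlex
import subprocess
import sys


def capture_environment(activation_command):
    """Run an activation command in a shell and return the resulting env vars."""
    dump_env = "import json, os; print(json.dumps(dict(os.environ)))"
    script = (
        f"{activation_command} && "
        f"{shlex.quote(sys.executable)} -c {shlex.quote(dump_env)}"
    )
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=True
    )
    # The JSON dump is the last line, even if activation printed something first.
    return json.loads(result.stdout.splitlines()[-1])


def launch_with_environment(argv, env, **popen_kwargs):
    """Start a process (e.g. a kernel) using the scraped environment variables."""
    return subprocess.Popen(argv, env=env, **popen_kwargs)


# Demo: a stand-in "activation" that only sets one variable.
env = capture_environment("export DEMO_ACTIVATED_ENV=1")
proc = launch_with_environment(
    [sys.executable, "-c", "import os; print(os.environ['DEMO_ACTIVATED_ENV'])"],
    env,
    stdout=subprocess.PIPE,
    text=True,
)
output, _ = proc.communicate()
print(output.strip())
```

The child process sees the variable only because it was started with the scraped environment, which is the same reason a kernel started this way can find the packages and executables of its Conda environment.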
- Step 1 - List all global kernel specs
- Step 2 - Ask Python Extension for all interpreters
  - Python extension has the know-how to locate interpreters
- Step 3 - In Jupyter, iterate through each interpreter and look for kernel specs in <interpreter>/share/jupyter/kernels
  - List each of these kernels, but also keep track of their association
- Step 4 - List all Python interpreters as a kernel
  - This allows users to view Python environments as kernels and run code against them as though they were Jupyter kernels.
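Steps 1 to 4 can be sketched as a single function. This is a simplified illustration under stated assumptions: the list of interpreters is passed in by the caller (in the extension it comes from the Python extension), each interpreter is represented by its environment root folder, and `read_specs`, `discover_kernels`, and the dictionary keys are hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def read_specs(kernels_dir):
    """Return {kernel name: parsed kernel.json} for one kernels folder."""
    if not kernels_dir.is_dir():
        return {}
    return {
        spec_file.parent.name: json.loads(spec_file.read_text(encoding="utf-8"))
        for spec_file in sorted(kernels_dir.glob("*/kernel.json"))
    }


def discover_kernels(global_dirs, env_roots):
    """Combine global specs, per-environment specs, and environments themselves."""
    kernels = []
    # Step 1: global kernel specs, not tied to any interpreter.
    for kernels_dir in global_dirs:
        for name, spec in read_specs(Path(kernels_dir)).items():
            kernels.append({"kind": "global", "name": name, "spec": spec, "env": None})
    # Step 2: the interpreters are supplied by the caller.
    for env_root in env_roots:
        # Step 3: specs owned by this environment; remember the association.
        env_kernels = Path(env_root) / "share" / "jupyter" / "kernels"
        for name, spec in read_specs(env_kernels).items():
            kernels.append(
                {"kind": "env-spec", "name": name, "spec": spec, "env": str(env_root)}
            )
        # Step 4: the environment itself is also listed as a kernel.
        kernels.append(
            {"kind": "interpreter", "name": Path(env_root).name, "spec": None,
             "env": str(env_root)}
        )
    return kernels


# Demo layout: one global kernel and one environment that owns a Java kernel.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    global_dir = root / "global_kernels"
    env_root = root / "PythonA"
    for folder, language in [
        (global_dir / "dummyName", "made_up"),
        (env_root / "share" / "jupyter" / "kernels" / "java", "java"),
    ]:
        folder.mkdir(parents=True)
        (folder / "kernel.json").write_text(
            json.dumps({"argv": [], "display_name": folder.name, "language": language}),
            encoding="utf-8",
        )
    kernels = discover_kernels([global_dir], [env_root])

for kernel in kernels:
    print(kernel["kind"], kernel["name"])
```

Keeping the owning environment next to each spec is what later makes it possible to activate the right environment before launching a kernel that belongs to it.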
- If starting a global (non-Python) kernel spec, then start it using the CLI as defined in argv of the kernelspec
- If starting a Python kernel spec
  - Then launch the python executable using the argv defined in the kernelspec
    - (this is almost identical to how we start non-Python kernels)
- If starting a non-Python kernel spec owned by a Python environment (such as a Conda env)
  - Then launch the python executable that owns this kernel spec (e.g. Beakerx java kernel that belongs to a Conda environment)
  - Scrape the environment variables of this process (this code is in Python extension)
  - Launch the kernel using the CLI as defined in argv of the kernelspec, but with the env variables extracted earlier
  - The above process replicates the user:
    - Opening a terminal
    - Activating the Python env in the terminal
    - Running the non-Python kernel (at this point the terminal has the activated Python env variables loaded)
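The launch cases above can be summarised in a small decision function. It is only a sketch of one reading of this page: `plan_launch`, the `owning_env` and `is_conda` parameters, and the returned dictionary are hypothetical, and the function describes what would be launched rather than starting a process.

```python
from typing import Optional


def plan_launch(spec, owning_env: Optional[str] = None, is_conda: bool = False):
    """Describe how a kernel spec would be started, following the cases above."""
    # Environment variables are scraped from the owning environment when the
    # kernel is a non-Python kernel owned by that environment, or when the
    # environment is a Conda environment that must be activated first.
    needs_env = owning_env is not None and (
        spec.get("language") != "python" or is_conda
    )
    return {
        "argv": list(spec["argv"]),
        "scrape_env_from": owning_env if needs_env else None,
    }


global_spec = {"argv": ["bogus", "-f", "{connection_file}"], "language": "made_up"}
python_spec = {
    "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
    "language": "python",
}
# The classpath is elided here, as it is in the text above.
java_spec = {"argv": ["java", "-cp", "..."], "language": "java"}

# Hypothetical environment paths, used only as labels.
print(plan_launch(global_spec)["scrape_env_from"])
print(plan_launch(python_spec, "/envs/CondaA", is_conda=True)["scrape_env_from"])
print(plan_launch(java_spec, "/envs/CondaA")["scrape_env_from"])
```

In every case the argv comes from the kernelspec unchanged; the only thing that varies is whether the process inherits the default environment or one scraped from the owning Python environment.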