
General overview of Kernel Discovery & Execution in Jupyter (& extension)

Aaron Munger edited this page Jan 21, 2022 · 12 revisions

Terminology used

Kernelspec: a JSON object that contains the information used to start a kernel process.

Here's a sample kernel spec file:

{
 "argv": ["python3", "-m", "ipykernel_launcher",
          "-f", "{connection_file}"],
 "display_name": "Python 3",
 "language": "python"
}

The instructions necessary to run a kernel process are simple: run the python3 executable with the arguments -m ipykernel_launcher -f {connection_file}.
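
As a rough sketch of how a launcher might turn the sample spec into a command line: the only placeholder in argv is {connection_file}, which is replaced with the path of a real connection file. The function name build_command and the example path are made up for illustration.

```python
import json

# The sample kernelspec from above, embedded as a string for the demo.
SPEC_JSON = """
{
 "argv": ["python3", "-m", "ipykernel_launcher",
          "-f", "{connection_file}"],
 "display_name": "Python 3",
 "language": "python"
}
"""


def build_command(spec, connection_file):
    """Return the spec's argv with {connection_file} replaced by a real path."""
    return [arg.replace("{connection_file}", connection_file) for arg in spec["argv"]]


spec = json.loads(SPEC_JSON)
# Hypothetical connection file path, used only for illustration.
command = build_command(spec, "/tmp/kernel-1234.json")
print(command)
```

The resulting list can be handed directly to a process-spawning API; no shell is involved.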

The kernelspec specifies only one language. A kernel may well support several languages, but the Jupyter extension can only use the single language named in the kernelspec.

The runtime and the kernel language need not be the same. E.g. a kernel that supports PowerShell can be built in .NET as well as in Python. Thus, you can have a kernel that requires Python to run, but executes PowerShell in the notebook cells. Similarly, there are other kernels written in Python that support non-Python languages.

Finding/listing Kernelspecs

1. Global Kernelspecs

They are located in specific folders (see https://jupyter-client.readthedocs.io/en/stable/kernels.html#kernel-specs). Let's call these global kernels. Finding them does not require the Python runtime at all.

For instance, on a Mac, if you place the following kernelspec in the folder ~/Library/Jupyter/kernels/dummyName, it will show up in Jupyter even if the runtime doesn't exist. Here's the sample kernelspec:

{
 "argv": ["bogus", "-f", "{connection_file}"],
 "display_name": "Hello World",
 "language": "made_up"
}

Thus, registering a kernel with Jupyter is easy: Jupyter looks for kernels on the file system in a few very specific locations and does not check whether the runtime exists.
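
A minimal sketch of that lookup, assuming each kernel lives in its own subfolder containing a file named kernel.json (the file name comes from the jupyter-client documentation, not from this page). The function name list_kernelspecs is hypothetical, and a temporary folder stands in for ~/Library/Jupyter/kernels so the example does not touch a real Jupyter setup.

```python
import json
import tempfile
from pathlib import Path


def list_kernelspecs(kernels_dir):
    """Map kernel name (the folder name) to the parsed kernel.json inside it."""
    specs = {}
    root = Path(kernels_dir)
    if not root.is_dir():
        return specs
    for spec_file in sorted(root.glob("*/kernel.json")):
        specs[spec_file.parent.name] = json.loads(spec_file.read_text(encoding="utf-8"))
    return specs


# Temporary folder standing in for ~/Library/Jupyter/kernels.
with tempfile.TemporaryDirectory() as tmp:
    dummy = Path(tmp) / "dummyName"
    dummy.mkdir()
    (dummy / "kernel.json").write_text(json.dumps({
        "argv": ["bogus", "-f", "{connection_file}"],
        "display_name": "Hello World",
        "language": "made_up",
    }), encoding="utf-8")
    found = list_kernelspecs(tmp)

print(found["dummyName"]["display_name"])  # prints: Hello World
```

Note that nothing here checks whether the "bogus" executable exists, which is why the dummy kernel is still listed.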

2. Python Environment specific Kernelspecs

Along with the static locations for global kernel specs, it's possible to have kernel specs that are specific to each Python environment. Assume you have a Python environment PythonA; you can have kernels in the following location: <PythonA Environment Location>/share/jupyter/kernels. Similarly, if you had an environment named PythonB, you'd have kernels in: <PythonB Environment Location>/share/jupyter/kernels.

By default, you'll always find a Python3 kernelspec under the above folder. This kernel spec merely launches the corresponding Python environment as a kernel.

Examples of such non-python kernels

3. Remote Kernel specs

When connecting to a remote Jupyter server, we can start kernels on the remote server by using the kernel specs available on the remote machine. This is done via the Jupyter Lab API exposed by the @jupyterlab/services package.

The kernel is started using the APIs provided by the same package.
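
The page only names the @jupyterlab/services package. As a sketch under the assumption that this package talks to the Jupyter Server REST API (GET /api/kernelspecs to list kernel specs, POST /api/kernels to start a kernel), the equivalent HTTP requests could be built like this. The helper names, server URL and token are hypothetical, and the requests are only constructed, not sent.

```python
import json
import urllib.request


def kernelspecs_request(base_url, token):
    """Build (but do not send) the request that lists kernel specs on a server."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/kernelspecs",
        headers={"Authorization": f"token {token}"},
        method="GET",
    )


def start_kernel_request(base_url, token, kernel_name):
    """Build (but do not send) the request that starts a kernel by spec name."""
    body = json.dumps({"name": kernel_name}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/kernels",
        data=body,
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical server URL and token, for illustration only.
list_req = kernelspecs_request("http://localhost:8888/", "my-token")
start_req = start_kernel_request("http://localhost:8888", "my-token", "python3")
print(list_req.get_method(), list_req.full_url)
print(start_req.get_method(), start_req.full_url)
```

Passing either request to urllib.request.urlopen would perform the call against a running server.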

4. Listing Python Environments as Kernels

Note: This is very specific to the Jupyter extension

Assume a user has Python but not Jupyter installed, and would like to run some code against a Python kernel. In this situation it isn't necessary to register a Python kernel just to start one. We create the kernel specification in memory, but it isn't persisted on disk. When it comes to starting the kernel, we start the Python process the same way as we start a regular Python kernel that has a kernelspec JSON on disk. (As mentioned earlier, all we need to start a kernel is the CLI to start a process, and we know how to start Python kernels, hence the JSON file doesn't need to exist on disk.)

Exception: In some rare cases we do persist the JSON file to disk, install Jupyter on the user's machine, and get the Jupyter application to start the kernel (just as we get remote Jupyter servers to start a kernel).
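
The common, non-persisted case could look roughly like the sketch below: a kernelspec built in memory for a given interpreter, mirroring the argv of the sample Python kernelspec shown earlier. The function name, the interpreter path and the display name are hypothetical, and the sketch assumes ipykernel is installed in that environment.

```python
def kernelspec_for_interpreter(python_path, display_name):
    """Build an in-memory kernelspec for a Python interpreter.

    Nothing is written to disk; the dict only carries what is needed
    to start the process later.
    """
    return {
        "argv": [python_path, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }


# Hypothetical interpreter path, for illustration only.
spec = kernelspec_for_interpreter("/usr/bin/python3", "Python 3 (/usr/bin/python3)")
print(spec["argv"])
```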

Launching/Starting Kernels

  • Each kernel opens 5 ports
  • The CLI used to start a kernel is found in the argv of the kernelspec
  • Generally starting the process using the argv is sufficient.
  • Python kernels, such as those belonging to a Conda environment, must be started by first activating the Conda environment.
    • Assume we have a kernel named Python3 under a Conda environment with the argv as ['python', '-m', 'ipykernel_launcher', ....
    • This should be started by first activating the conda environment & then starting that kernel within the context of that conda environment.
    • I.e. activate conda environment, then scrape the environment variables for the process. Next start the kernel using those environment variables.
    • This is required so that the Python kernel will inherit the same environment variables available when you launch python after activating the conda environment.
    • If this isn't done, Python packages will fail to work most of the time (they cannot be loaded because paths have not been set up properly, etc.)
    • Summary - For Conda environments, we must first activate the conda environment before launching the kernel.
  • Some non-Python kernels, such as the Java kernel from beakerx, are located within the Python environment
    • The argv for the java kernel is ['java', '-cp', ...]
    • If we attempt to run this, it will not work, as the java executable is not in a globally accessible location.
    • When installing this kernel, the java executable is stored in the Conda Environment.
    • To get to this path, the conda environment needs to be activated.
    • Hence running non-Python kernels created in a Conda environment also requires activation of the Conda environment.
  • Remote kernels are started using the Jupyter Lab API exposed by the @jupyterlab/services package.
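
The five ports are passed to the kernel through the connection file named in argv. The sketch below writes such a file; the field names (shell_port, iopub_port, stdin_port, control_port, hb_port, and so on) follow the jupyter-client documentation on connection files rather than this page, and the helper names are hypothetical.

```python
import json
import os
import socket
import tempfile
import uuid

PORT_NAMES = ["shell_port", "iopub_port", "stdin_port", "control_port", "hb_port"]


def pick_free_ports(count):
    """Ask the OS for `count` distinct free TCP ports on localhost.

    All sockets stay open until every port is chosen so the OS cannot hand
    out the same port twice. A port may still be taken by another process
    between this call and the kernel binding it.
    """
    sockets = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind(("127.0.0.1", 0))
            sockets.append(s)
        return [s.getsockname()[1] for s in sockets]
    finally:
        for s in sockets:
            s.close()


def write_connection_file(directory):
    """Write a connection file with five ports and return (path, contents)."""
    info = dict(zip(PORT_NAMES, pick_free_ports(len(PORT_NAMES))))
    info.update({
        "ip": "127.0.0.1",
        "transport": "tcp",
        "key": uuid.uuid4().hex,  # secret used to sign messages
        "signature_scheme": "hmac-sha256",
    })
    path = os.path.join(directory, f"kernel-{uuid.uuid4().hex}.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(info, f, indent=2)
    return path, info


with tempfile.TemporaryDirectory() as tmp:
    path, info = write_connection_file(tmp)
    with open(path, encoding="utf-8") as f:
        written = json.load(f)

print(sorted(name for name in written if name.endswith("_port")))
```

The path returned here is what would replace {connection_file} in the argv of the kernelspec.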

Jupyter Extension

How does the Jupyter extension search for kernels

  • Step 1 - List all global kernel specs
  • Step 2 - Ask Python Extension for all interpreters
    • Python extension has the know-how to locate interpreters
  • Step 3 - In Jupyter, iterate through each interpreter and look for kernel specs in <interpreter>/share/jupyter/kernels
    • List each of these kernels, but also keep track of which interpreter each one is associated with
  • Step 4 - List all Python interpreters as a kernel
    • This allows users to view Python environments as kernels and run code against them as though they were Jupyter kernels.
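
The four steps above can be sketched as follows. This is an illustration, not the extension's actual code: the function names and the shape of the returned records are made up, the list of environment locations stands in for what the Python extension would return in step 2, and kernel specs are assumed to be stored as <kernel folder>/kernel.json.

```python
import json
import tempfile
from pathlib import Path


def specs_in(kernels_dir):
    """Return (name, spec) pairs for every <kernels_dir>/<name>/kernel.json."""
    root = Path(kernels_dir)
    if not root.is_dir():
        return []
    return [
        (f.parent.name, json.loads(f.read_text(encoding="utf-8")))
        for f in sorted(root.glob("*/kernel.json"))
    ]


def discover_kernels(global_dirs, env_prefixes):
    kernels = []
    # Step 1: global kernel specs, not tied to any interpreter.
    for folder in global_dirs:
        for name, spec in specs_in(folder):
            kernels.append({"kind": "global", "name": name,
                            "spec": spec, "interpreter": None})
    # Step 2 happens elsewhere: env_prefixes is the list of environments found.
    for prefix in env_prefixes:
        # Step 3: kernel specs owned by this environment, with the association kept.
        for name, spec in specs_in(Path(prefix) / "share" / "jupyter" / "kernels"):
            kernels.append({"kind": "environment", "name": name,
                            "spec": spec, "interpreter": str(prefix)})
        # Step 4: the interpreter itself is listed as a kernel.
        kernels.append({"kind": "interpreter", "name": Path(prefix).name,
                        "spec": None, "interpreter": str(prefix)})
    return kernels


with tempfile.TemporaryDirectory() as tmp:
    global_dir = Path(tmp) / "global"
    env_prefix = Path(tmp) / "PythonA"
    for folder, display in [
        (global_dir / "dummyName", "Hello World"),
        (env_prefix / "share" / "jupyter" / "kernels" / "python3", "Python 3"),
    ]:
        folder.mkdir(parents=True)
        (folder / "kernel.json").write_text(
            json.dumps({"argv": [], "display_name": display, "language": "x"}),
            encoding="utf-8",
        )
    kernels = discover_kernels([global_dir], [env_prefix])

for k in kernels:
    print(k["kind"], k["name"])
```

Keeping the owning interpreter on each record is what later lets the launcher decide whether an environment must be activated first.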

How does the Jupyter extension start a kernel

  • If starting a global (non-python) kernel spec, then start it using the CLI as defined in argv of the kernelspec
  • If starting a Python kernel spec
    • Then launch the python executable using the argv defined in the kernelspec
    • (this is almost identical to how we start non-python kernels)
  • If starting a non-Python kernel spec owned by a Python environment (such as a Conda env)
    • Then launch the python executable that owns this kernel spec (e.g. Beakerx java kernel that belongs to a Conda environment)
    • Scrape the environment variables of this process (this code is in Python extension)
    • Launch the kernel using the CLI as defined in argv of the kernelspec, but with the env variables extracted earlier
    • The above process replicates a user:
      • Opening a terminal
      • Activating the Python env in the terminal
      • Running the non-python kernel (at this point terminal has the activated Python env variables loaded)
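
The "scrape the environment, then launch" idea can be sketched as below. This is not the Python extension's implementation: the helper names are hypothetical, and the command that runs Python inside an activated environment is left to the caller (for a Conda environment one assumption would be a prefix such as conda run -n <env name> python). The demo uses the current interpreter plus one extra variable to stand in for an activated environment.

```python
import json
import os
import subprocess
import sys

# Small program run inside the target environment to dump its variables.
DUMP_ENV = "import json, os; print(json.dumps(dict(os.environ)))"


def scrape_env(python_cmd, base_env=None):
    """Run Python via `python_cmd` and return the environment that process saw."""
    result = subprocess.run(
        list(python_cmd) + ["-c", DUMP_ENV],
        env=base_env, capture_output=True, text=True, check=True,
    )
    # Activation scripts may print extra lines; the JSON dump is the last one.
    return json.loads(result.stdout.strip().splitlines()[-1])


def launch_with_env(argv, env):
    """Start a process from argv with the scraped environment variables."""
    return subprocess.Popen(argv, env=env, stdout=subprocess.PIPE, text=True)


# Demo: DEMO_ACTIVATED stands in for variables set by activating an environment.
activated = scrape_env([sys.executable], base_env={**os.environ, "DEMO_ACTIVATED": "1"})
child = launch_with_env(
    [sys.executable, "-c", "import os; print(os.environ['DEMO_ACTIVATED'])"],
    activated,
)
output, _ = child.communicate()
print(output.strip())
```

In the real flow, the argv passed to the launch step would be the kernelspec's argv (for example the java command of the beakerx kernel), so the executable is resolved through the activated environment's PATH.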