diff --git a/doc/changes/changes_2.1.0.md b/doc/changes/changes_2.1.0.md index e0a8ac7e..b5ed4ffc 100644 --- a/doc/changes/changes_2.1.0.md +++ b/doc/changes/changes_2.1.0.md @@ -16,6 +16,7 @@ Version: 2.1.0 * 277 Added the SaaS database parameters to the configuration page. * 279 Made the notebooks tests running in SaaS as well as in the Docker-DB. +* 19 Added SLC notebook ## Security diff --git a/doc/developer_guide/testing.md b/doc/developer_guide/testing.md index e3769924..11773957 100644 --- a/doc/developer_guide/testing.md +++ b/doc/developer_guide/testing.md @@ -1,31 +1,62 @@ -### Tests +# Tests -XAL comes with a number of tests in directory `test`. -Besides, unit and integrations tests in the respective directories -there are tests in directory `codebuild`, see [Executing AWS CodeBuild](ci.md#executing-aws-codebuild). +XAIL comes with a number of tests in the directory `test`. Besides, unit and integration tests in the respective directories, there are tests in the directory `codebuild`, see [Executing AWS CodeBuild](ci.md#executing-aws-codebuild). -# Speeding up Docker-based Tests +## Speeding up Docker-based Tests -Creating a docker image is quite time-consuming, currently around 7 minutes. In order to use an existing -docker image in the tests in `integration/test_create_dss_docker_image.py` -simply add CLI option `--dss-docker-image` when calling `pytest`: +Creating a docker image is quite time-consuming, currently around 7 minutes. + +To get test results faster, you can use an existing Docker image. You can create such an image using the [CLI command](commands.md#release-commands) `create-docker-image` or run your tests once with an additional CLI option `--keep-dss-docker-image` to keep the image rather than removing it after the test session. + +Sample usage of the command `create-docker-image`: +```shell +poetry run exasol/ds/sandbox/main.py \ + create-docker-image \ + --version 9.9.9 \ + --log-level info +``` + +To use an existing docker image in the tests in `integration/test_create_dss_docker_image.py`, simply add the CLI option `--dss-docker-image` when calling `pytest`: ```shell poetry run pytest --dss-docker-image exasol/ai-lab:2.1.0 ``` -#### Executing tests involving AWS resources +## Tests for Jupyter Notebooks + +The AI-Lab also contains end-to-end tests for Jupyter notebooks. Executing these tests can take several hours, currently ~3h. + +The notebook tests are based on a common parameterized [test-runner](../../test/notebook_test_runner/test_notebooks_in_dss_docker_image.py). The test-runner contains a single parameterized test case on the outer level. Each time the test is executed, the test is parameterized with a Python file from the directory [test/notebooks](../../test/notebooks/) containing the particular test cases for one of the Jupyter notebooks. + +The outer test case then uses a session-scoped fixture for creating an ordinary AI-Lab Docker image. Another session-scoped fixture adds some packages for executing the notebook tests, resulting in a second Docker image. Finally, the test-runner launches a Docker container from the second image and runs the inner test cases for the current notebook inside the Docker container. + +In total, the following Docker entities are involved: +* Docker image 1 of the AI-Lab +* Docker image 2 for running the inner notebook tests +* Docker container running Docker image 2 -In AWS web interface, IAM create an access key for CLI usage and save or download the *access key id* and the *secret access key*. 
+### Speeding up Notebook Tests -In file `~/.aws/config` add lines +You can speed up the notebook tests using the [same strategy](#speeding-up-docker-based-tests) as for tests involving the basic Docker image for the AI-Lab. + +The CLI option to keep the image is `--keep-docker-image-notebook-test`, the option for using an existing Docker image for executing the notebook tests is `--docker-image-notebook-test`. + +```shell +poetry run pytest --docker-image-notebook-test +``` + +## Executing Tests Involving AWS Resources + +In the AWS web interface, IAM create an access key for CLI usage and save or download the *access key id* and the *secret access key*. + +In the file `~/.aws/config`, add lines: ``` [profile dss_aws_tests] region = eu-central-1 ``` -In file `~/.aws/credentials` add +In the file `~/.aws/credentials`, add: ``` [dss_aws_tests] @@ -33,19 +64,19 @@ aws_access_key_id=... aws_secret_access_key=... ``` -In case your are using MFA authentication please allocate a temporary token. +In case you are using MFA authentication, please allocate a temporary token. -After that you can set an environment variable and execute the tests involving AWS resources: +After that, you can set an environment variable and execute the tests involving AWS resources: ```shell export AWS_PROFILE=dss_aws_tests_mfa poetry run pytest test/test_deploy_codebuild.py ``` -#### Executing tests involving Ansible +## Executing Tests Involving Ansible -For making pytest display Ansible log messages, please use +To make pytest display Ansible log messages, please use: ```shell poetry run pytest -s -o log_cli=true -o log_cli_level=INFO -``` \ No newline at end of file +``` diff --git a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/advanced.ipynb b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/advanced.ipynb new file mode 100644 index 00000000..0a57571d --- /dev/null +++ b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/advanced.ipynb @@ -0,0 +1,347 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "660ae5cd-3a39-486e-8edd-8f5bd2d23c7a", + "metadata": {}, + "source": [ + "# Advanced topics" + ] + }, + { + "cell_type": "markdown", + "id": "c1dc4885-42ca-47d6-800c-302b6cfc91d1", + "metadata": {}, + "source": [ + "This notebooks explains some details and background regarding the tool `exaslct`. This is especially useful when:\n", + "- you encounter a problem when running one of the other notebooks in this tutorial.\n", + "- you need to clean up disk space.\n", + "- you want to do more modifications to the script-languages-container than just adding a Python package." + ] + }, + { + "cell_type": "markdown", + "id": "64d867e5-b4c8-4c9c-b96a-e132d174ae1d", + "metadata": {}, + "source": [ + "## Setup\n", + "### Open Secure Configuration Storage" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "13b4908e-c1c6-4f15-879a-88f1e08a9e9b", + "metadata": {}, + "outputs": [], + "source": [ + "%run ../utils/access_store_ui.ipynb\n", + "display(get_access_store_ui('../'))" + ] + }, + { + "cell_type": "markdown", + "id": "1e70ffc8-b45e-40dc-85e5-ea3c986f7bbe", + "metadata": {}, + "source": [ + "### Instantiate SLCT Manager\n", + "\n", + "The \"Script-Languages-Container-Tools\" Manager (SLCT Manager) simplifies using the API of `exaslct`.\n", + "The following cell will therefore create an instance class `SlctManager` from the notebook-connector." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4040cd19-b5d8-4130-8f4a-cd99209c1607", + "metadata": {}, + "outputs": [], + "source": [ + "from exasol.nb_connector import slct_manager\n", + "slctmanager = slct_manager.SlctManager(ai_lab_config)" + ] + }, + { + "cell_type": "markdown", + "id": "dce45b1c-e018-4086-8c78-6bb12eae3ea6", + "metadata": {}, + "source": [ + "### Import some utility functions\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9cd529a7-bece-4e3e-8fb7-7c892e5e4306", + "metadata": {}, + "outputs": [], + "source": [ + "%run ./utils/file_system_ui.ipynb" + ] + }, + { + "cell_type": "markdown", + "id": "c2e0ba33-30b9-49e5-af42-ac01a4f204f6", + "metadata": {}, + "source": [ + "### Preparation \n", + "Before you start, run the export command (again), just to be sure to have all the artifacts (local files and docker images), which are necessary for this tutorial." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dd74b8e6-dfc8-44a4-a282-ebabb2d2c9a6", + "metadata": {}, + "outputs": [], + "source": [ + "slctmanager.export()" + ] + }, + { + "cell_type": "markdown", + "id": "29842f06-4d12-4a53-b9f6-23f16537a07a", + "metadata": {}, + "source": [ + "## What to do if something doesn't work?\n", + "\n", + "During the build, export or upload it can happen that external package repositories are not available or something is wrong on your machine running the build. For these cases, `exaslct` provides extensive log files that can help analyzing such problems.\n", + "\n", + "#### Exaslsct Log\n", + "The main log file for `exaslct` is stored as file `main.log` in the build output of the job. With the following command you can find the main logs for all previous executions." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fb2b363a-a6f3-4d4c-a1c3-16b64d6fa181", + "metadata": {}, + "outputs": [], + "source": [ + "main_logs = list(slctmanager.working_path.output_path.glob('**/main.log'))\n", + "show_files(main_logs)" + ] + }, + { + "cell_type": "markdown", + "id": "ce4be726-86ad-487c-9d76-705c360b825e", + "metadata": {}, + "source": [ + "The following command shows the log file of the last execution." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "38906878-66cc-43ed-b0df-f89ac30561d5", + "metadata": {}, + "outputs": [], + "source": [ + "tail_file(main_logs[0], 20)" + ] + }, + { + "cell_type": "markdown", + "id": "3ca13b0c-f8ee-4652-b053-053a1fedf4ed", + "metadata": {}, + "source": [ + "#### Build Output Directory\n", + "\n", + "More detailed information about the build or other operations can be found in directory `.build_output/jobs/*/outputs`. Here each run of `exaslct` creates its own subdirectory under `.build_output/jobs`. Directory `outputs` contains the outputs and log files (if any) produced by each of the executed tasks of `exaslct`. Especially, the Docker tasks such as build, pull and push store the logs returned by the Docker API. This can be helpful for analyzing problems during the build." 
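For illustration, a minimal sketch that lists the per-run job directories themselves — assuming `slctmanager.working_path.output_path` points at the `.build_output` directory and the job folders sit in its `jobs` subdirectory:

```python
# Sketch: list the per-run job directories created by exaslct.
# Assumes the jobs directory lives directly under the build output path.
jobs_dir = slctmanager.working_path.output_path / "jobs"
if jobs_dir.is_dir():
    for job_dir in sorted(jobs_dir.iterdir()):
        print(job_dir.name)
```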
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "101f6c7c-6e5e-466d-88c4-1e0b5e943502", + "metadata": {}, + "outputs": [], + "source": [ + "all_logs = list(slctmanager.working_path.output_path.glob('**/*.log'))\n", + "show_files(all_logs)" + ] + }, + { + "cell_type": "markdown", + "id": "dabaf9a7-12ef-4c0c-ba89-9b3b3dba981e", + "metadata": {}, + "source": [ + "\n", + "## Flavor Definition\n", + "The following diagram shows a high level overview of the build steps for a script languages container.\n", + "\n", + "Building an SLC usually starts with selecting one of the default build templates provided by Exsol.\n", + "These templates are called _flavors_. In this tutorial we customize the template Python flavor by adding new pip packages.\n", + "\n", + "\n", + "![image.png](slc_main_build_steps.svg)\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "The easiest way to customize a flavor is to add dependencies in the build step `flavor_customization`. Other build steps should only be changed with caution and is only required in special cases, e.g. when dependencies are defined in other build steps because the script client depends on these dependencies.\n", + "Check the directory structure of the selected flavor:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "29cd1109-aafe-47bf-9a70-db0bc3d76a9f", + "metadata": {}, + "outputs": [], + "source": [ + "show_directory_content(slctmanager.slc_dir.flavor_dir, 3)" + ] + }, + { + "cell_type": "markdown", + "id": "5eeacff5-e5e7-4b34-bb46-9359ae867baa", + "metadata": {}, + "source": [ + "For example, if you need additional apt packages, you can add those to the `template-Exasol-all-python-3.10/flavor_customization/packages/apt_get_packages` file." + ] + }, + { + "cell_type": "markdown", + "id": "63ccad71-a482-42cb-a7f0-0776036a99db", + "metadata": {}, + "source": [ + "## Build Cache\n", + "\n", + "`exaslct` internally uses a build cache in order to accelerate the build by re-using docker images, which were built during a previous execution.\n", + "Customizing a flavor always creates a separate entry in the cache with a unique hashcode and old containers don't get lost. If you revert your changes the system automatically uses the existing cached container. Below you can see the content of the cache directory for the containers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b3bfab04-008b-4577-84ab-01928e5b45aa", + "metadata": {}, + "outputs": [], + "source": [ + "show_directory_content(slctmanager.working_path.output_path / \"cache\" / \"exports\")" + ] + }, + { + "cell_type": "markdown", + "id": "0630c821-6012-4ba9-8cac-ea2717a69349", + "metadata": {}, + "source": [ + "`exaslct` also creates a docker image for each particular build step (see [Flavor Definition](#flavor_definition)), you can find the images with the following code:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "925be491-48cf-47ab-a1ff-fec17785e46f", + "metadata": {}, + "outputs": [], + "source": [ + "slctmanager.slc_docker_images" + ] + }, + { + "cell_type": "markdown", + "id": "b946027b-dfce-472c-bfd4-1f828b29011e", + "metadata": {}, + "source": [ + "Image `exasol/script-language-container:template-Exasol-all-python-3.10-release_...` is the final release image for the script-language-container, the other (intermediate) images are used for the caching mechnism." 
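As a hedged illustration — assuming `slctmanager.slc_docker_images` yields plain image-name strings — you could separate the final release image from the intermediate cache images like this:

```python
# Sketch: split the image list into the final release image(s) and intermediate cache images.
# Assumes image names are plain strings and the release image carries "release" in its tag.
images = list(slctmanager.slc_docker_images)
release_images = [name for name in images if "release" in name]
intermediate_images = [name for name in images if "release" not in name]
print("Release image(s):", release_images)
print("Intermediate images:", len(intermediate_images))
```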
+ ] + }, + { + "cell_type": "markdown", + "id": "e18cbe2a-18f0-4d1f-8a61-95e5951b7287", + "metadata": {}, + "source": [ + "## Cleanup\n", + "\n", + "This sections shows how you can use the `SLCT Manager` to clean up the artificats created by `exaslct`:\n", + "- The docker images for the script-languages-container as well as the cache images\n", + "- The exported containers (tar gz files)\n", + "- The cached container files\n" + ] + }, + { + "cell_type": "markdown", + "id": "542531a2-1772-4b1b-9517-59bfdce6f8cd", + "metadata": {}, + "source": [ + "### Clean up the docker images\n", + "\n", + "The following command cleans up the docker images:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0f1e8489-f8d5-4192-a77d-207e8daca238", + "metadata": {}, + "outputs": [], + "source": [ + "slctmanager.clean_all_images()" + ] + }, + { + "cell_type": "markdown", + "id": "bae173c6-4066-49ea-a732-e1f70b4c41cb", + "metadata": {}, + "source": [ + "### Clean up the local exported containers\n", + "\n", + "The following command cleans up the exported container files:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2317494b-bea7-45ed-a4b8-530af218542a", + "metadata": {}, + "outputs": [], + "source": [ + "slctmanager.working_path.cleanup_export_path()" + ] + }, + { + "cell_type": "markdown", + "id": "9d106732-a83e-48c8-b48d-084415a41b38", + "metadata": {}, + "source": [ + "### Clean up the output path\n", + "\n", + "Clean up the log files and caches:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8a95e3ae-aef2-4ec3-88e0-d64ddb880a40", + "metadata": {}, + "outputs": [], + "source": [ + "slctmanager.working_path.cleanup_output_path()" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/configure_slc_repository.ipynb b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/configure_slc_repository.ipynb new file mode 100644 index 00000000..0fd2b0fa --- /dev/null +++ b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/configure_slc_repository.ipynb @@ -0,0 +1,233 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "70724412-6577-4e69-86a1-b0d94e32eb96", + "metadata": {}, + "source": [ + "# Configure Tutorial Script Languages Container\n", + "\n", + "## Prerequisites\n", + "\n", + "Prior to using this notebook the following steps need to be completed:\n", + "1. 
[Configure the AI-Lab](../main_config.ipynb).\n", + "\n", + "### Open Secure Configuration Storage\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e70ad0a9-7042-4fe8-814b-5c586b9bee6d", + "metadata": {}, + "outputs": [], + "source": [ + "%run ../utils/access_store_ui.ipynb\n", + "display(get_access_store_ui('../'))" + ] + }, + { + "cell_type": "markdown", + "id": "6cdb53e1-3d9e-40af-a340-4ef224727fa0", + "metadata": {}, + "source": [ + "## Specific Configuration for this Tutorial\n", + "\n", + "For this tutorial we need the build definition of a Script-Languages container.\n", + "\n", + "You have two options:\n", + " - default: Clone the Exasol Script-Languages-Release Github repository.\n", + " - for advanced users: Use a custom path to an existing clone of the Exasol Script-Languages-Release Github repository.\n" + ] + }, + { + "cell_type": "markdown", + "id": "9f02bc2f-411f-4479-b2e2-1afef96e4bd8", + "metadata": {}, + "source": [ + "### Load UI functions\n", + "Let's import some additional UI functions in order to use them in this notebook." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "224345da-f14e-4a45-bf8a-07e3ba0870dc", + "metadata": {}, + "outputs": [], + "source": [ + "%run ./utils/slc_ui.ipynb" + ] + }, + { + "cell_type": "markdown", + "id": "b81b0103-1014-4bc9-8724-0c0bafa83a23", + "metadata": {}, + "source": [ + "### Check that we don't have a SaaS backend configured" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "95d1e0e1-e334-4154-91cd-765d78868eaa", + "metadata": {}, + "outputs": [], + "source": [ + "from exasol.nb_connector.ai_lab_config import StorageBackend\n", + "if ai_lab_config.get(AILabConfig.storage_backend, ) == StorageBackend.saas.name:\n", + " popup_message(f\"This tutorial will not work correctly with a SaaS as backend. You can export the Script-Languages-Container to a local file, but the upload to the database will fail.\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "395d6f4f-6e06-45e3-a9a7-54dc94e73332", + "metadata": {}, + "source": [ + "### Instantiate SLCT Manager\n", + "\n", + "The \"Script-Languages-Container-Tools\" Manager (SLCT Manager) simplifies using the API of `exaslct`.\n", + "The following cell will therefore create an instance class `SlctManager` from the notebook-connector." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9f42c7c6-0297-4cb4-bedf-d3a2a58e69fc", + "metadata": {}, + "outputs": [], + "source": [ + "from exasol.nb_connector.slct_manager import SlctManager\n", + "slct_manager = SlctManager(ai_lab_config)" + ] + }, + { + "cell_type": "markdown", + "id": "178b59e9-8f46-43b1-b4f8-9cd6d6bf5076", + "metadata": {}, + "source": [ + "### Configure the Script-Languages directory\n", + "#### Choose the source" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0f8caa77-35a0-403d-af96-e5f8056ba489", + "metadata": {}, + "outputs": [], + "source": [ + "display(get_slc_source_selection_ui(ai_lab_config))" + ] + }, + { + "cell_type": "markdown", + "id": "fc0fe1a5-8861-4f18-8bcf-92600f86f33f", + "metadata": {}, + "source": [ + "### Use existing script-languages-repository\n", + "If you chose to use an existing script-languages-repository, then simply select the path." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "56e16540-5f4b-4d5c-8c83-d34c1217eb00", + "metadata": {}, + "outputs": [], + "source": [ + "display(get_existing_slc_ui(ai_lab_config))" + ] + }, + { + "cell_type": "markdown", + "id": "1db0f115-9382-4e53-be08-53586b831d0b", + "metadata": {}, + "source": [ + "### Clone the Script-Languages-Release repository\n", + "If you chose to clone the Exasol script-languages-repository, then first select the root path where the repository should be stored." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7d84889f-7c3a-4f69-9bf1-c193481867af", + "metadata": {}, + "outputs": [], + "source": [ + "display(get_slc_target_dir_ui(ai_lab_config))" + ] + }, + { + "cell_type": "markdown", + "id": "a02ee33f-1958-4f9c-a796-01c6bc2a8787", + "metadata": {}, + "source": [ + "#### Clone the repository" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "31132726-9827-4c9b-8fb8-0884d2f36e67", + "metadata": {}, + "outputs": [], + "source": [ + "if clone_slc_repo(ai_lab_config):\n", + " slct_manager.clone_slc_repo()\n", + "print(\"Ready\")" + ] + }, + { + "cell_type": "markdown", + "id": "53997bff-b4be-406f-b54f-ae9f39939ede", + "metadata": {}, + "source": [ + "#### Verify that the required flavor for the tutorial is present" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bb6b26b9-cb24-4b66-85c9-e15a0c92ff2c", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "if not slct_manager.check_slc_repo_complete():\n", + " popup_message(f\"The script-languages repository does not fullfill requirements.\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "21840146-28bd-413f-a9ce-cd22b17939b2", + "metadata": {}, + "source": [ + "## Finish\n", + "Now you can continue with [Using the script-languages-container tool](./using_the_script_languages_container_tool.ipynb) " + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/customize.ipynb b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/customize.ipynb new file mode 100644 index 00000000..b43c3f30 --- /dev/null +++ b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/customize.ipynb @@ -0,0 +1,300 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a467374a-b083-4970-9810-57eb528141fe", + "metadata": {}, + "source": [ + "# Customize a flavor\n", + "\n", + "Sometimes you need very specific dependencies or versions of dependencies in the Exasol UDFs. In such case you can customize a Script-Language Container.\n", + "You find additional information in the [Exasol official documentation](https://docs.exasol.com/db/latest/database_concepts/udf_scripts/adding_new_packages_script_languages.htm#)." 
+ ] + }, + { + "cell_type": "markdown", + "id": "b61aa708-9383-4e5b-b072-b50492604f9c", + "metadata": {}, + "source": [ + "## Setup\n", + "### Open Secure Configuration Storage" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "afd25eb3-a320-4375-8f5c-07ade762f28f", + "metadata": {}, + "outputs": [], + "source": [ + "%run ../utils/access_store_ui.ipynb\n", + "display(get_access_store_ui('../'))" + ] + }, + { + "cell_type": "markdown", + "id": "8ef77c92-a796-4b02-9747-f711dca9a9b6", + "metadata": {}, + "source": [ + "### Instantiate SLCT Manager\n", + "\n", + "The \"Script-Languages-Container-Tools\" Manager (SLCT Manager) simplifies using the API of `exaslct`.\n", + "The following cell will therefore create an instance class `SlctManager` from the notebook-connector." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "20c8c4cc-bc8a-45e8-b9df-b71021fd9476", + "metadata": {}, + "outputs": [], + "source": [ + "from exasol.nb_connector import slct_manager\n", + "slctmanager = slct_manager.SlctManager(ai_lab_config)" + ] + }, + { + "cell_type": "markdown", + "id": "264f7b58-093a-41db-8f09-c848896a1318", + "metadata": {}, + "source": [ + "### Import some utility functions" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "79e9843d-d436-4f97-b20e-56d2a42afdaa", + "metadata": {}, + "outputs": [], + "source": [ + "%run ./utils/file_system_ui.ipynb\n", + "%run ./utils/slc_ui.ipynb" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "b632c4b5-3bb9-46c1-8c68-009102432a71", + "metadata": {}, + "source": [ + "## Customize\n", + "\n", + "First you need to define an alias for the new SLC. The alias will be used to reference the container later from the UDFs.\n", + "\n", + "Note: In this tutorial the alias also will be used as part of the export file (tar.gz) and the uploaded container to the BucketFS. This allows you to create, upload and use different script-language-containers: one per alias." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a88f47d9-7337-45fb-b2d4-08b74bc95f5f", + "metadata": {}, + "outputs": [], + "source": [ + "display(get_alias_ui(ai_lab_config, \"ai_lab_default_custom\"))" + ] + }, + { + "cell_type": "markdown", + "id": "410312a2-0976-423a-bdf5-2100a0085528", + "metadata": {}, + "source": [ + "### Flavor Customization Build Step\n", + "\n", + "`exasclt` consists of multiple build steps. By a build step here we mean a file structure which serves as an input for a certain stage of the building process of the script-languages-container. See [Advanced Topics](./advanced.ipynb) for more details.\n", + "\n", + "Build step **flavor_customization** is defined by a Dockerfile and several package lists. We recommend to add new packages to the package lists and only modify the Dockerfile if you need very specific changes, like adding additional resources." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ed0ad229-06b6-4083-8daa-a23e176b59d4", + "metadata": {}, + "outputs": [], + "source": [ + "show_directory_content(slctmanager.slc_dir.flavor_dir / \"flavor_customization\")" + ] + }, + { + "cell_type": "markdown", + "id": "8b0f6884-8fda-44d1-92a4-ae23040b109c", + "metadata": {}, + "source": [ + "The Dockerfile consists of two parts. The first part installs the packages from the package lists and should only be changed with care. The second part is free for your changes. Read the description in the Dockerfile carefully to find out what you can and shouldn't do." 
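To read that description without leaving the notebook, here is a small sketch that displays the customization Dockerfile with the `show_docker_file` helper — assuming it is located at `flavor_customization/Dockerfile` inside the flavor directory:

```python
# Sketch: render the flavor_customization Dockerfile (the exact path is an assumption).
dockerfile = slctmanager.slc_dir.flavor_dir / "flavor_customization" / "Dockerfile"
show_docker_file(dockerfile)
```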
+ ] + }, + { + "cell_type": "markdown", + "id": "65cc2597-cb90-494c-af6e-e8a7c51f4130", + "metadata": {}, + "source": [ + "#### Package Lists\n", + "The package lists have a unified format. Each line consists of the package name and the package version separated by the pipe character `|`, e.g `xgboost|1.3.3`. You can comment out a whole line by adding a hash character `#` the beginning of the line. You can also add a trailing comment to a package definition by adding `#` after the package definition. We usually recommend to install a specific package version to avoid surprises about which version actually gets installed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e9720186-a4b2-43f3-81d5-3f96adc48e60", + "metadata": {}, + "outputs": [], + "source": [ + "show_files([slctmanager.slc_dir.custom_pip_file])" + ] + }, + { + "cell_type": "markdown", + "id": "be4ac7a0-b7de-4168-b177-9b728f4d9a30", + "metadata": {}, + "source": [ + "We are now going to append Python package \"xgboost\" to one of the package lists by adding `xgboost|2.0.3` and `scikit-learn|1.5.0` to file `flavor_customization/packages/python3_pip_packages`. \n", + "Notes:\n", + " - running the following command multiple times will iteratively append the packages\n", + " - you can also click on the link and modify the file directly" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6b1ebb00-3ba2-4bbc-b923-f9a7d66d5a1f", + "metadata": {}, + "outputs": [], + "source": [ + "xgboost_pkg = slct_manager.PipPackageDefinition(pkg=\"xgboost\", version=\"2.0.3\")\n", + "scikit_learn_pkg = slct_manager.PipPackageDefinition(pkg=\"scikit-learn\", version=\"1.5.0\")\n", + "slctmanager.append_custom_packages([xgboost_pkg, scikit_learn_pkg])\n", + "show_files([slctmanager.slc_dir.custom_pip_file])" + ] + }, + { + "cell_type": "markdown", + "id": "a841d351-2bc3-49c1-89e3-d84c265fbf69", + "metadata": {}, + "source": [ + "#### Rebuilding the customized Flavor\n", + "\n", + "After changing the flavor you need to rebuild it. You can do it by running `export` again. Exaslct automatically recognizes that the flavor has changed and builds a new version of the container. Don't get confused by the warnings: `exaslct` first tries to find the cached docker images (see [Advanced Topics](./advanced.ipynb)), but as the content has changed, the cached image is not available, and the docker service returns a 404 error message." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "06e8ee1f-7121-4e78-8a7f-35951bf685df", + "metadata": {}, + "outputs": [], + "source": [ + "slctmanager.export()" + ] + }, + { + "cell_type": "markdown", + "id": "c58b9172-e607-437a-9568-4063a88d446f", + "metadata": {}, + "source": [ + "Lets check the resulting tar gz file:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "aabe7634-6d47-40c8-a363-4f53bf8702f3", + "metadata": {}, + "outputs": [], + "source": [ + "show_directory_content(slctmanager.working_path.export_path)" + ] + }, + { + "cell_type": "markdown", + "id": "fc792fc1-5591-4449-879e-8c2866e2a21c", + "metadata": {}, + "source": [ + "#### Upload the Container to the Database\n", + "To use the new container you need to upload it to the BucketFS. If the build machine has access to the BucketFS you do it with the `exaslct` upload command, as shown below. Otherwise you need to install the script-languages-container manually: \n", + "1. Transfer the container tar gz file of the previous step to a machine that has access to the BucketFS. 
\n", + "2. From that machine upload it via curl, as described in our [documentation](https://docs.exasol.com/db/latest/database_concepts/udf_scripts/adding_new_packages_script_languages.htm).\n", + "\n", + "With the following command you upload the new script language container.\n", + "You could run the same on the command line with the `exaslct` tool:\n", + "```\n", + "cd \n", + "./exaslct upload --flavor-path flavors/ --database-host --bucketfs-port --bucketfs-username --bucketfs-password --bucketfs-name --bucket-name \n", + "```\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4e4f1c13-2d80-4a90-9dad-6883ec92dcfe", + "metadata": {}, + "outputs": [], + "source": [ + "slctmanager.upload()" + ] + }, + { + "cell_type": "markdown", + "id": "679730d4-820c-4547-8098-9885818cb4e8", + "metadata": {}, + "source": [ + "This command also stores the activation statement in the ai-lab-config. You can verify it with:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dc2863fc-791c-4035-807b-62abcc705753", + "metadata": {}, + "outputs": [], + "source": [ + "slctmanager.activation_key" + ] + }, + { + "cell_type": "markdown", + "id": "8797d3e3-a43e-4250-a5f0-50fb04aa1514", + "metadata": {}, + "source": [ + "The syntax of the activation statement is: `alias=url`. The activation key will be used in the `ALTER_SESSION` or `ALTER_SYSTEM` commands to \"register\" the script-language-container for usage in the UDFs.\n", + "\n", + "You can generate the SQL commands for the activation with the following command line:\n", + "```\n", + "cd \n", + "./exaslct generate-language-activation --flavor-path flavors/ --bucketfs-name --bucket-name --container-name --path-in-bucket \n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "15ffe9e0-3598-44bc-b66d-9f4b9104da75", + "metadata": {}, + "source": [ + "You can now continue [testing the uploaded container](./test_slc.ipynb)." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/export_as_is.ipynb b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/export_as_is.ipynb new file mode 100644 index 00000000..d62ccff1 --- /dev/null +++ b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/export_as_is.ipynb @@ -0,0 +1,182 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "14dad93e-8ac8-45ce-bec1-51b3dfa15e44", + "metadata": {}, + "source": [ + "# Export the flavor as is\n", + "\n", + "Exasol [User Defined Functions](https://docs.exasol.com/db/latest/database_concepts/udf_scripts.htm) (UDFs) enable embedding user code into SQL statements. Each Python UDF runs in a so-called [Script-Languages-Container](https://docs.exasol.com/db/latest/database_concepts/udf_scripts/adding_new_packages_script_languages.htm) (SLC). Exasol provides default SLCs with some preinstalled PYthon packages but also allows users to create their own SLCs, e.g. 
by adding additional dependencies.\n", + "\n", + "This tutorial shows how to build a Script-Languages-Container (SLC) from a base flavor (without any modification) and write the result to a tar gz file. The base flavor `template-Exasol-all-python-3.10` is part of the Script-Languages Release repository.\n", + "\n", + "`exaslct` uses the flavor description to build a Docker image which is called the `release` image. `exaslct` can export the content of this Docker image then to a tar gz file.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "86b1ced5-cd62-43b7-a34a-3af6d247f965", + "metadata": {}, + "source": [ + "## Setup\n", + "### Open Secure Configuration Storage" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e3eaedcf-b977-4855-8caf-d638acbbbf60", + "metadata": {}, + "outputs": [], + "source": [ + "%run ../utils/access_store_ui.ipynb\n", + "display(get_access_store_ui('../'))" + ] + }, + { + "cell_type": "markdown", + "id": "be1d76f3-51c5-4b92-812f-bac74ac0e6c3", + "metadata": {}, + "source": [ + "### Instantiate SLCT Manager\n", + "\n", + "The \"Script-Languages-Container-Tools\" Manager (SLCT Manager) simplifies using the API of `exaslct`.\n", + "The following cell will therefore create an instance class `SlctManager` from the notebook-connector." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fe86cddb-fa94-4a74-8c7e-8d27f32d7228", + "metadata": {}, + "outputs": [], + "source": [ + "from exasol.nb_connector import slct_manager\n", + "slctmanager = slct_manager.SlctManager(ai_lab_config)" + ] + }, + { + "cell_type": "markdown", + "id": "31029b4f-df19-464b-91c7-2981dee02d12", + "metadata": {}, + "source": [ + "### Import Some utility functions\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c91a7e32-4b54-45d6-a93d-579f103e96b0", + "metadata": {}, + "outputs": [], + "source": [ + "%run ./utils/file_system_ui.ipynb" + ] + }, + { + "cell_type": "markdown", + "id": "6c1cc4a6-4176-4e52-9591-42966a9e2ab0", + "metadata": {}, + "source": [ + "## Export\n" + ] + }, + { + "cell_type": "markdown", + "id": "40db107e-3253-416d-bbc4-5586105f8ea6", + "metadata": {}, + "source": [ + "Currently used flavor is:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ab400c06-1fb9-40dd-a8bc-cda4c56905d1", + "metadata": {}, + "outputs": [], + "source": [ + "slct_manager.REQUIRED_FLAVOR" + ] + }, + { + "cell_type": "markdown", + "id": "33f6b43e-9801-49a8-ac70-ce018be5bfba", + "metadata": {}, + "source": [ + "### Export the flavor\n", + "\n", + "Now execute the `export` step. 
The command builds the docker image and exports the Docker image to the export directory.\n", + "\n", + "You could run the same on the command line with the `exaslct` tool:\n", + "```\n", + "cd \n", + "./exaslct export --flavor-path flavors/\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "46b1b556-2ea2-476d-b67f-aef76db970ea", + "metadata": {}, + "outputs": [], + "source": [ + "slctmanager.export()" + ] + }, + { + "cell_type": "markdown", + "id": "fe0efd52-fb2c-4a8c-8712-0cefdc92c044", + "metadata": {}, + "source": [ + "#### Check the result\n", + "The following command will show the resulting files of the export command:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b941a044-25ed-4a51-9a86-52d4b011defa", + "metadata": {}, + "outputs": [], + "source": [ + "show_directory_content(slctmanager.working_path.export_path)" + ] + }, + { + "cell_type": "markdown", + "id": "826ac429-f65a-4f9a-98e7-a63f79a0a5fe", + "metadata": {}, + "source": [ + "Hint: If you want to download the tar gz file, you can do this in the Jupyter Project View on the left side.\n", + "\n", + "The resulting tar gz can then be uploaded to BucketFS. You will learn how to do this in the next lesson:\n", + "[Customize the flavor, export and upload the script-language-container](./customize.ipynb)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/slc_main_build_steps.svg b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/slc_main_build_steps.svg new file mode 100644 index 00000000..77d88309 --- /dev/null +++ b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/slc_main_build_steps.svg @@ -0,0 +1,3 @@ + + +
+ [Diagram slc_main_build_steps.svg: SLC build steps — udfclient_deps, language_deps, build_run, flavor_base_deps, flavor_customization, release]
\ No newline at end of file diff --git a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/test_slc.ipynb b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/test_slc.ipynb new file mode 100644 index 00000000..d8770d85 --- /dev/null +++ b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/test_slc.ipynb @@ -0,0 +1,324 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a959fe32-bf73-47dc-9e22-40cc3ea96db8", + "metadata": {}, + "source": [ + "# Test the new Script-Languages-Container\n", + "\n", + "This notebooks shows how to:\n", + "- activate the new script-languages-container in the Exasol database\n", + "- create UDFs for the new script-languages-container\n", + "- run those UDFs" + ] + }, + { + "cell_type": "markdown", + "id": "379f0b9c-3b7f-4c3e-aa0f-6a9fe86cd02b", + "metadata": {}, + "source": [ + "## Setup\n", + "### Open Secure Configuration Storage" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3543a7c3-43f7-4f54-a91d-89a9c6229c49", + "metadata": {}, + "outputs": [], + "source": [ + "%run ../utils/access_store_ui.ipynb\n", + "display(get_access_store_ui('../'))" + ] + }, + { + "cell_type": "markdown", + "id": "9a4f44ef-5c9f-4345-b674-f400adeec03e", + "metadata": {}, + "source": [ + "### Instantiate SLCT Manager\n", + "\n", + "The \"Script-Languages-Container-Tools\" Manager (SLCT Manager) simplifies using the API of `exaslct`.\n", + "The following cell will therefore create an instance class `SlctManager` from the notebook-connector." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b4afd420-ff1d-4c67-af2f-083af15d8623", + "metadata": {}, + "outputs": [], + "source": [ + "from exasol.nb_connector import slct_manager\n", + "slctmanager = slct_manager.SlctManager(ai_lab_config)" + ] + }, + { + "cell_type": "markdown", + "id": "4631cf8b-025a-4571-a7b7-b0d1ac80dc45", + "metadata": {}, + "source": [ + "## Use the new Script-Languages-Container\n", + "\n", + "### Connect to the database and activate the container\n", + "Once you have a connection to the database you can run either the ALTER SESSION statement or ALTER SYSTEM statement. The latter statement will activate the container permanently and globally.\n", + "The `notebook` connector package provides a utility method, for creating an `pyexasol` connection and applying the `ALTER SESSION` command for all registered languages:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "071aad1e-1cbf-407f-bae5-b8b000667867", + "metadata": {}, + "outputs": [], + "source": [ + "from exasol.nb_connector.language_container_activation import open_pyexasol_connection_with_lang_definitions\n", + "\n", + "conn = open_pyexasol_connection_with_lang_definitions(ai_lab_config, schema=ai_lab_config.db_schema, compression=True)" + ] + }, + { + "cell_type": "markdown", + "id": "688d5eab-005b-4780-a239-4d0f3e0da191", + "metadata": {}, + "source": [ + "### Check if your customization did work\n", + "\n", + "You first create a helper UDF which allows you to run arbitrary shell commands inside of a UDF instance. With that you can easily inspect the container." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bb303818-8b5b-4d12-b117-f398a249264f", + "metadata": {}, + "outputs": [], + "source": [ + "import textwrap\n", + "\n", + "conn.execute(textwrap.dedent(f\"\"\"\n", + "CREATE OR REPLACE {slctmanager.language_alias} SCALAR SCRIPT execute_shell_command_py3(command VARCHAR(2000000), split_output boolean)\n", + "EMITS (lines VARCHAR(2000000)) AS\n", + "import subprocess\n", + "\n", + "def run(ctx):\n", + " try:\n", + " p = subprocess.Popen(ctx.command,\n", + " stdout = subprocess.PIPE,\n", + " stderr = subprocess.STDOUT,\n", + " close_fds = True,\n", + " shell = True)\n", + " out, err = p.communicate()\n", + " if isinstance(out,bytes):\n", + " out=out.decode('utf8')\n", + " if ctx.split_output:\n", + " for line in out.strip().split('\\\\n'):\n", + " ctx.emit(line)\n", + " else:\n", + " ctx.emit(out)\n", + " finally:\n", + " if p is not None:\n", + " try: p.kill()\n", + " except: pass\n", + "/\n", + "\"\"\"))" + ] + }, + { + "cell_type": "markdown", + "id": "4c727768-193a-4be0-bd9e-f2d5c6ebd484", + "metadata": {}, + "source": [ + "Check with \"pip list\" if the \"xgboost\" package is installed\n", + "We use our helper UDF to run `python3 -m pip list` directly in the container and get the list of currently available python3 packages." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7b09e840-e5c6-4490-a6cf-0f640c2a3a83", + "metadata": {}, + "outputs": [], + "source": [ + "rs=conn.execute(\"\"\"select execute_shell_command_py3('python3 -m pip list', true)\"\"\")\n", + "for r in rs: \n", + " print(r[0])" + ] + }, + { + "cell_type": "markdown", + "id": "d321eb44-ddc0-43be-bfeb-556c6a524c4a", + "metadata": {}, + "source": [ + "Running `pip list` inside the container displays the available packages. In case of unexpected results, please have a look at the information stored by `exaslct` during build-time inside the container.\n", + "\n", + "#### Embedded Build Information of the Container\n", + "Here we see an overview about the build information which `exaslct` embedded into the container. `exaslct` stores all packages lists (as defined in the flavor and what actually got installed), the final Dockerfiles and the image info. The image info describes how the underlying Docker images of the container got built. The build information is stored in the `/build_info` directory in the container.\n", + "\n", + "This command will show an overview of the build information:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "771caaaa-10af-4179-ad30-d76ea49de069", + "metadata": {}, + "outputs": [], + "source": [ + "rs=conn.execute(\"\"\"select execute_shell_command_py3('find /build_info', true)\"\"\")\n", + "for r in rs: \n", + " print(r[0])" + ] + }, + { + "cell_type": "markdown", + "id": "a29627f2-c01d-45f4-ab0b-b71736d432dd", + "metadata": {}, + "source": [ + "Now you can examine the python3 pip packages file, which was created directly after building the container image by `exaslct`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7ebe46fc-80dc-43c3-a803-833d9786dab7", + "metadata": {}, + "outputs": [], + "source": [ + "rs=conn.execute(\"\"\"select execute_shell_command_py3('cat /build_info/actual_installed_packages/release/python3_pip_packages', true)\"\"\")\n", + "for r in rs: \n", + " print(r[0])" + ] + }, + { + "cell_type": "markdown", + "id": "4d52adf9-5630-4612-90a1-23ed21299771", + "metadata": {}, + "source": [ + "All your packages from the flavor-customization build step should be included. If you want to double check this, you can run:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c9d76165-e775-46c0-ab87-e0f14aa36f51", + "metadata": {}, + "outputs": [], + "source": [ + "rs=conn.execute(\"\"\"select execute_shell_command_py3('cat /build_info/packages/flavor_customization/python3_pip_packages', true)\"\"\")\n", + "for r in rs:\n", + " if r[0] is None:\n", + " print()\n", + " else:\n", + " print(r[0])" + ] + }, + { + "cell_type": "markdown", + "id": "02b5ea6b-a79e-4223-9eb9-a9c01ddacf04", + "metadata": {}, + "source": [ + "### Testing the new package\n", + "\n", + "After you made sure that the required packages are installed, you need to try importing and using them. Importing is usually a good first test if a package got successfully installed, because often you might already get errors at this step. However, sometimes you only will recognize errors when using the package. We recommend to have a test suite for each new package to check if it works properly before you start your UDF development. It is usually easier to debug problems if you have very narrow tests." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "613335a7-c716-4c2a-aee4-c15269a9d74a", + "metadata": {}, + "outputs": [], + "source": [ + "conn.execute(textwrap.dedent(f\"\"\"\n", + "CREATE OR REPLACE {slctmanager.language_alias} SET SCRIPT test_xgboost(i integer)\n", + "EMITS (o VARCHAR(2000000)) AS\n", + "\n", + "def run(ctx):\n", + " import xgboost\n", + " import sklearn \n", + " \n", + " ctx.emit(\"success\")\n", + "/\n", + "\"\"\"))\n", + "\n", + "rs = conn.execute(\"select test_xgboost(1)\")\n", + "rs.fetchall()" + ] + }, + { + "cell_type": "markdown", + "id": "43bdfaa2", + "metadata": {}, + "source": [ + "Finally, import and use the new packages. The following UDF uses the `xgboost` and `sklearn` modules to solve a small machine learning problem." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c476f9a2-38c5-4b1e-8fa6-8a2f07e928a6", + "metadata": {}, + "outputs": [], + "source": [ + "conn.execute(textwrap.dedent(f\"\"\"\n", + "CREATE OR REPLACE {slctmanager.language_alias} SET SCRIPT test_xgboost(i integer)\n", + "EMITS (o1 DOUbLE, o2 DOUbLE, o3 DOUbLE) AS\n", + "\n", + "def run(ctx):\n", + " import pandas as pd\n", + " import xgboost as xgb\n", + " from sklearn import datasets\n", + " from sklearn.model_selection import train_test_split\n", + " \n", + " iris = datasets.load_iris()\n", + " X = iris.data\n", + " y = iris.target\n", + " \n", + " X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n", + " \n", + " dtrain = xgb.DMatrix(X_train, label=y_train)\n", + " dtest = xgb.DMatrix(X_test, label=y_test)\n", + " param = {{\n", + " 'max_depth': 3, # the maximum depth of each tree\n", + " 'eta': 0.3, # the training step for each iteration\n", + " 'silent': 1, # logging mode - quiet\n", + " 'objective': 'multi:softprob', # error evaluation for multiclass training\n", + " 'num_class': 3 # the number of classes that exist in this datset\n", + " }}\n", + " num_round = 20 # the number of training iterations\n", + " bst = xgb.train(param, dtrain, num_round)\n", + " preds = bst.predict(dtest)\n", + " \n", + " ctx.emit(pd.DataFrame(preds))\n", + "/\n", + "\"\"\"))\n", + "\n", + "conn.export_to_pandas(\"select test_xgboost(1)\")" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/using_the_script_languages_container_tool.ipynb b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/using_the_script_languages_container_tool.ipynb new file mode 100644 index 00000000..a4f1475e --- /dev/null +++ b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/using_the_script_languages_container_tool.ipynb @@ -0,0 +1,60 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "06585243-2e50-4f9b-8d6f-b79aade6f8f5", + "metadata": {}, + "source": [ + "# Using the Script-Languages Container Tool\n", + "A [Script-Language Container](https://github.com/exasol/script-languages-release)(SLC) for the Exasol database consists of a Linux container with a complete Linux distribution and all required libraries, such as a script client. The script client is responsible for the communication with the database and for executing the script code. It allows to also include user specific libraries which can then be used from within the UDFs.\n", + "\n", + "**Note: This tutorial currently does not support a SaaS backend.**" + ] + }, + { + "cell_type": "markdown", + "id": "39333102-7652-4699-ba0e-d184aeec1753", + "metadata": {}, + "source": [ + "Before we start we need to configure the script-languages directory and flavor. 
See [Configure SLC repository](./configure_slc_repository.ipynb)" + ] + }, + { + "cell_type": "markdown", + "id": "c88668d2-83e5-4812-a05c-936912068f03", + "metadata": {}, + "source": [ + "\n", + "This tutorial uses the `template-Exasol-all-python-3.10` flavor as the base. In short words, a flavor is a recipe for building a script-languages-container. \n", + "Under the hood, the `exaslct` tool is used, see [Script-Language Container Tool](https://github.com/exasol/script-languages-container-tool) for details. The `exaslct` tool can be used to build, test and upload script-languages-container for an Exasol database.\n", + "You will learn how to:\n", + "- [Export a script-languages-container from the flavor](./export_as_is.ipynb)\n", + "- [Customize the flavor, export and upload the script-language-container](./customize.ipynb)\n", + "- [Test the uploaded script-languages-container](./test_slc.ipynb)\n", + "\n", + "Additionaly, there is the [Advanced Topics](./advanced.ipynb) tutorial which provides more detailed information." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/utils/file_system_ui.ipynb b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/utils/file_system_ui.ipynb new file mode 100644 index 00000000..d9a4bb9a --- /dev/null +++ b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/utils/file_system_ui.ipynb @@ -0,0 +1,65 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": null, + "id": "758126d0-4421-43da-b681-21ae4e37b571", + "metadata": {}, + "outputs": [], + "source": [ + "from pathlib import Path\n", + "from typing import List\n", + "from IPython.display import Code, display, FileLink, FileLinks\n", + "import os\n", + "\n", + "import os\n", + "import contextlib\n", + "from pathlib import Path\n", + "\n", + "def show_directory_content(p: Path, max_depth: int = 1):\n", + " for path in p.iterdir():\n", + " if path.is_file():\n", + " display(FileLink(str(os.path.relpath(path))))\n", + " if path.is_dir() and max_depth > 1:\n", + " show_directory_content(path, max_depth - 1)\n", + "\n", + "def show_files(paths: List[Path]):\n", + " for path in paths:\n", + " if path.is_file():\n", + " display(FileLink(str(os.path.relpath(path))))\n", + "\n", + "def tail_file(path: Path, length: int):\n", + " with open(path) as f:\n", + " lines = f.readlines()\n", + " print(\"\".join(lines[-length:]))\n", + "\n", + "def show_docker_file(path: Path):\n", + " display(Code(filename=str(path), language=\"Docker\"))\n", + "\n", + "def show_pip_file(path: Path):\n", + " display(Code(filename=str(path), language=\"toml\"))" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + 
"nbformat": 4, + "nbformat_minor": 5 +} diff --git a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/utils/slc_ui.ipynb b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/utils/slc_ui.ipynb new file mode 100644 index 00000000..764ba10f --- /dev/null +++ b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/script_languages_container/utils/slc_ui.ipynb @@ -0,0 +1,215 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "2761e16d-8723-4f44-b202-d772ce751323", + "metadata": {}, + "source": [ + "# Script-Languages-Container UI\n", + "\n", + "This notebook is not supposed to be used on its own.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f044043c-8fa1-442e-8d2c-b4221accff9f", + "metadata": {}, + "outputs": [], + "source": [ + "from exasol.nb_connector.utils import upward_file_search\n", + "\n", + "# This NB may be running from various locations in the NB hierarchy.\n", + "# Need to search for the styles NB from the current directory upwards.\n", + "\n", + "%run {upward_file_search('utils/ui_styles.ipynb')}\n", + "%run {upward_file_search('utils/popup_message_ui.ipynb')}\n", + "%run {upward_file_search('utils/generic_config_ui.ipynb')}" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a2e90ebe-37d6-43e6-9677-dace6c06d942", + "metadata": {}, + "outputs": [], + "source": [ + "from exasol.nb_connector.ai_lab_config import AILabConfig\n", + "\n", + "slc_source_slc_repo_option_value = \"Clone script languages release repository\"\n", + "\n", + "from ipyfilechooser import FileChooser\n", + "from pathlib import Path\n", + "from exasol.nb_connector.secret_store import Secrets\n", + "\n", + "\n", + "NOTEBOOKS_DIRECTORY = str(Path().parent.parent)\n", + "\n", + "def clone_slc_repo(conf: Secrets) -> bool:\n", + " return conf.get(AILabConfig.slc_source) == slc_source_slc_repo_option_value\n", + "\n", + "def get_slc_source_selection_ui(conf: Secrets) -> widgets.Widget:\n", + " \"\"\"\n", + " Creates a UI form for choosing the flavor source\n", + " \"\"\"\n", + "\n", + " ui_look = get_config_styles()\n", + " ui_look.input_layout.max_width = \"500px\"\n", + " \n", + " slc_sources = [slc_source_slc_repo_option_value, \"Use the existing clone\"]\n", + " default_value = conf.get(AILabConfig.slc_source, slc_source_slc_repo_option_value)\n", + "\n", + " flavor_source_selector = widgets.RadioButtons(options=slc_sources, value=default_value, \n", + " layout=ui_look.input_layout, style=ui_look.input_style)\n", + " select_btn = widgets.Button(description='Select', style=ui_look.button_style, layout=ui_look.button_layout)\n", + " header_lbl = widgets.Label(value='Flavor choice', style=ui_look.header_style, layout=ui_look.header_layout)\n", + "\n", + "\n", + " def select_flavor_source(btn):\n", + " conf.save(AILabConfig.slc_source, flavor_source_selector.value)\n", + " btn.icon = 'check'\n", + "\n", + " def on_value_change(change):\n", + " select_btn.icon = 'pen'\n", + "\n", + " select_btn.on_click(select_flavor_source)\n", + " flavor_source_selector.observe(on_value_change, names=['value'])\n", + "\n", + " group_items = [header_lbl, widgets.Box([flavor_source_selector], layout=ui_look.row_layout)]\n", + " items = [widgets.Box(group_items, layout=ui_look.group_layout), select_btn]\n", + " ui = widgets.Box(items, layout=ui_look.outer_layout)\n", + " return ui\n", + "\n", + "def get_slc_target_dir_ui(conf: Secrets) -> widgets.Widget:\n", + " \"\"\"\n", + " Creates a UI 
form for editing the Script-Languages-Container repository configuration.\n", + " \"\"\"\n", + " default_target_dir = NOTEBOOKS_DIRECTORY\n", + " target_dir_chooser_widget = FileChooser(path=default_target_dir, select_default=True)\n", + " target_dir_chooser_widget.show_only_dirs = True\n", + " target_dir_chooser_widget.sandbox_path = NOTEBOOKS_DIRECTORY\n", + " \n", + " inputs = [\n", + " [\n", + " ('Target Base Directory', target_dir_chooser_widget, AILabConfig.slc_target_dir),\n", + " ]\n", + " ]\n", + "\n", + " ui_look = get_config_styles()\n", + " ui_look.row_layout.max_width = \"500px\"\n", + " ui_look.group_layout.max_width = \"500px\"\n", + " save_btn = widgets.Button(description='Save', style=ui_look.button_style, layout=ui_look.button_layout)\n", + " header_lbl = widgets.Label(value='Target Directory', style=ui_look.header_style, layout=ui_look.header_layout)\n", + "\n", + " def save_configuration(btn):\n", + " target_dir = Path(target_dir_chooser_widget.selected) / \"script_languages_release\"\n", + " conf.save(AILabConfig.slc_target_dir, str(target_dir))\n", + " btn.icon = 'check'\n", + "\n", + " def on_value_change(change):\n", + " save_btn.icon = 'pen'\n", + "\n", + " save_btn.on_click(save_configuration)\n", + "\n", + " # Apply the styles and layouts to the input fields\n", + " target_dir_chooser_widget.observe(on_value_change, names=['value'])\n", + "\n", + " group_items = [header_lbl, widgets.Box([target_dir_chooser_widget], layout=ui_look.row_layout)]\n", + " items = [widgets.Box(group_items, layout=ui_look.group_layout), save_btn]\n", + " ui = widgets.Box(items, layout=ui_look.outer_layout)\n", + " \n", + "\n", + " if clone_slc_repo(conf):\n", + " return ui\n", + "\n", + "def get_existing_slc_ui(conf: Secrets) -> widgets.Widget:\n", + " \"\"\"\n", + " Creates a UI form for choosing the existing script-languages repository.\n", + " \"\"\"\n", + " if clone_slc_repo(conf):\n", + " return\n", + " default_src_dir = conf.get(AILabConfig.slc_target_dir, '')\n", + " select_default = True if default_src_dir else False\n", + " src_dir_chooser_widget = FileChooser(path=default_src_dir, select_default=select_default)\n", + " src_dir_chooser_widget.show_only_dirs = True\n", + " src_dir_chooser_widget.sandbox_path = NOTEBOOKS_DIRECTORY\n", + " \n", + " inputs = [\n", + " [\n", + " ('Flavor source directory', src_dir_chooser_widget, AILabConfig.slc_target_dir),\n", + " ]\n", + " ]\n", + "\n", + " ui_look = get_config_styles()\n", + " ui_look.row_layout.max_width = \"500px\"\n", + " ui_look.group_layout.max_width = \"500px\"\n", + " save_btn = widgets.Button(description='Save', style=ui_look.button_style, layout=ui_look.button_layout)\n", + " header_lbl = widgets.Label(value='Existing script-languages directory', style=ui_look.header_style, layout=ui_look.header_layout)\n", + "\n", + "\n", + " def save_configuration(btn):\n", + " target_dir = Path(src_dir_chooser_widget.selected)\n", + " if not (target_dir / \"flavors\").is_dir():\n", + " popup_message(\"Invalid directory. 
You need to choose a valid script-languages repository.\")\n", + " return\n", + " conf.save(AILabConfig.slc_target_dir, str(src_dir_chooser_widget.selected))\n", + " btn.icon = 'check'\n", + "\n", + " def on_value_change(change):\n", + " save_btn.icon = 'pen'\n", + "\n", + " save_btn.on_click(save_configuration)\n", + "\n", + " # Apply the styles and layouts to the input fields\n", + " src_dir_chooser_widget.observe(on_value_change, names=['value'])\n", + "\n", + " group_items = [header_lbl, widgets.Box([src_dir_chooser_widget], layout=ui_look.row_layout)]\n", + " items = [widgets.Box(group_items, layout=ui_look.group_layout), save_btn]\n", + " ui = widgets.Box(items, layout=ui_look.outer_layout)\n", + "\n", + " return ui\n", + "\n", + "\n", + "\n", + "def get_alias_ui(conf: Secrets, default_value) -> widgets.Widget:\n", + " \"\"\"\n", + " Creates a UI form for choosing the alias name.\n", + " \"\"\"\n", + " inputs = [\n", + " ('Language Alias', widgets.Text(value=conf.get(AILabConfig.slc_alias, default_value)), AILabConfig.slc_alias),\n", + " ]\n", + "\n", + " return get_generic_config_ui(conf, [inputs], ['Language Alias'])\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f2f4a93b-6c4a-42e8-badd-7e9968ee8663", + "metadata": {}, + "outputs": [], + "source": [] + } ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/start.ipynb b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/start.ipynb index 3483c517..e9696091 100644 --- a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/start.ipynb +++ b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/start.ipynb @@ -40,6 +40,7 @@ "1. [SageMaker extension](sagemaker/sme_introduction.ipynb)\n", "1. [Transformers extension](transformers/te_introduction.ipynb)\n", "1. [Cloud store](cloud/01_import_data.ipynb)\n", + "1. [Script Languages Container](script_languages_container/using_the_script_languages_container_tool.ipynb)\n", "\n" ] } @@ -60,7 +61,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.10" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook_requirements.txt b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook_requirements.txt index c420b318..d2cdebb9 100644 --- a/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook_requirements.txt +++ b/exasol/ds/sandbox/runtime/ansible/roles/jupyter/files/notebook_requirements.txt @@ -7,3 +7,4 @@ stopwatch.py==2.0.1 boto3==1.26.163 exasol-notebook-connector @ git+https://github.com/exasol/notebook-connector.git@main pickleshare==0.7.5 # See https://github.com/exasol/ai-lab/issues/291 for details. 
+ipyfilechooser==0.6.0 \ No newline at end of file diff --git a/test/notebooks/nbtest_script_languages_container.py b/test/notebooks/nbtest_script_languages_container.py new file mode 100644 index 00000000..57ed24b1 --- /dev/null +++ b/test/notebooks/nbtest_script_languages_container.py @@ -0,0 +1,80 @@ +import os +from pathlib import Path + +import pytest + +from notebook_test_utils import (access_to_temp_secret_store, + access_to_temp_saas_secret_store, + run_notebook, + uploading_hack) +from exasol.nb_connector.ai_lab_config import AILabConfig as CKey, StorageBackend +from exasol.nb_connector.secret_store import Secrets + + +def _slc_repo_dir() -> Path: + return Path.cwd() / "script_languages_release" + + +def _store_slc_config(store_path: Path, store_password: str, clone_repo: bool): + + slc_source = "Clone script languages release repository" if clone_repo else "Use the existing clone" + conf = Secrets(store_path, store_password) + conf.connection() + conf.save(CKey.slc_source, slc_source) + conf.save(CKey.slc_target_dir, str(_slc_repo_dir())) + +@pytest.fixture() +def cleanup_slc_repo_dir(): + import shutil + yield + p = Path.cwd() / "script_languages_container" / "script_languages_release" + shutil.rmtree(p) + + +@pytest.mark.parametrize('access_to_temp_secret_store', [StorageBackend.onprem], indirect=True) +def test_script_languages_container_cloning_slc_repo(access_to_temp_secret_store, + cleanup_slc_repo_dir) -> None: + current_dir = Path.cwd() + store_path, store_password = access_to_temp_secret_store + store_file = str(store_path) + try: + run_notebook('main_config.ipynb', store_file, store_password) + os.chdir('./script_languages_container') + _store_slc_config(store_path, store_password, True) + run_notebook('configure_slc_repository.ipynb', store_file, store_password) + run_notebook('export_as_is.ipynb', store_file, store_password) + run_notebook('customize.ipynb', store_file, store_password) + run_notebook('test_slc.ipynb', store_file, store_password) + run_notebook('advanced.ipynb', store_file, store_password) + run_notebook('using_the_script_languages_container_tool.ipynb', store_file, store_password) + finally: + os.chdir(current_dir) + + +def _clone_slc_repo(): + from git import Repo + repo = Repo.clone_from("https://github.com/exasol/script-languages-release", _slc_repo_dir()) + repo.submodule_update(recursive=True) + + +@pytest.mark.parametrize('access_to_temp_secret_store', [StorageBackend.onprem], indirect=True) +def test_script_languages_container_with_existing_slc_repo(access_to_temp_secret_store, + cleanup_slc_repo_dir) -> None: + current_dir = Path.cwd() + store_path, store_password = access_to_temp_secret_store + store_file = str(store_path) + try: + run_notebook('main_config.ipynb', store_file, store_password) + os.chdir('./script_languages_container') + slc_repo_path = _slc_repo_dir() + assert not slc_repo_path.is_dir() + _clone_slc_repo() + _store_slc_config(store_path, store_password, False) + run_notebook('configure_slc_repository.ipynb', store_file, store_password) + run_notebook('export_as_is.ipynb', store_file, store_password) + run_notebook('customize.ipynb', store_file, store_password) + run_notebook('test_slc.ipynb', store_file, store_password) + run_notebook('advanced.ipynb', store_file, store_password) + run_notebook('using_the_script_languages_container_tool.ipynb', store_file, store_password) + finally: + os.chdir(current_dir)
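
For orientation, the following is a minimal, purely illustrative sketch of how the widgets defined in the new `slc_ui.ipynb` are intended to be combined in a configuration notebook. The secret-store file name, password, and default alias are placeholders, and the UI functions are assumed to be in scope via `%run`, as in the other notebooks of this tutorial.

```python
# Hypothetical notebook cell; store file, password, and default alias are placeholders.
from IPython.display import display
from exasol.nb_connector.secret_store import Secrets

conf = Secrets("sandbox.sqlite", "my_password")  # placeholder secret store
conf.connection()

# Let the user decide whether to clone script-languages-release or reuse an existing clone.
display(get_slc_source_selection_ui(conf))

# Depending on that choice, exactly one of the two directory forms is shown.
if clone_slc_repo(conf):
    display(get_slc_target_dir_ui(conf))
else:
    display(get_existing_slc_ui(conf))

# Finally, choose the language alias under which the container will be used.
display(get_alias_ui(conf, "my_slc_alias"))
```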