[DO NOT MERGE] create a new 'rapids' conda environment instead of installing packages in the 'base' environment #713

Closed
wants to merge 15 commits into from
Changes from 14 commits
4 changes: 3 additions & 1 deletion .github/workflows/test-notebooks.yml
@@ -88,7 +88,9 @@ jobs:
rapids-logger "nvidia-smi"
nvidia-smi
- name: Test notebooks
run: /home/rapids/test_notebooks.py -i /home/rapids/notebooks -o /home/rapids/notebooks_output
run: |
. /opt/conda/etc/profile.d/conda.sh; conda activate rapids
Member

With the environment hacking, is this necessary?

Member Author

Ah you're right, it shouldn't be! I'll try reverting it.

/home/rapids/test_notebooks.py -i /home/rapids/notebooks -o /home/rapids/notebooks_output
- name: Install awscli
if: '!cancelled()'
run: |
48 changes: 43 additions & 5 deletions Dockerfile
@@ -57,14 +57,52 @@ conda config --show-sources
conda list --show-channel-urls

# Install RAPIDS
mamba install -y -n base \
mamba create -y -n rapids \
"rapids=${RAPIDS_VER}.*" \
"python=${PYTHON_VER}.*" \
"cuda-version=${CUDA_VER%.*}.*" \
ipython

conda clean -afy
EOF

# remove the 'conda activate base' in .bashrc that comes from miniforge-cuda,
# to avoid that happening in the entrypoint script
RUN sed -i.bak '/conda activate base/d' ~/.bashrc \
&& rm -f ./*.bak

# manually activate the 'rapids' environment, to cause any filesystem changes made by activation scripts
RUN . /opt/conda/etc/profile.d/conda.sh; conda activate rapids

# Set more environment variables, to mimic what would happen with 'conda activate rapids'
#
# This is a workaround to allow RAPIDS libraries to be accessible to processes that bypass the entrypoint,
# even though they aren't (and can't be) installed in the 'base' environment.
# See the discussion in https://github.com/rapidsai/docker/issues/712.
#
# This list was generated by building this image without this ENV layer and running:
#
# env > ./old.txt
# . /opt/conda/etc/profile.d/conda.sh; conda activate rapids
# env > ./new.txt
# diff -u ./old.txt ./new.txt
#
ENV \
CONDA_DEFAULT_ENV=rapids \
CONDA_PREFIX=/opt/conda/envs/rapids \
CONDA_PREFIX_1=/opt/conda/envs/rapids \
CONDA_PREFIX_2=/opt/conda \
CONDA_PROMPT_MODIFIER="(rapids)" \
CPL_ZIP_ENCODING=UTF-8 \
GDAL_DATA=/opt/conda/envs/rapids/share/gdal \
GDAL_DRIVER_PATH=/opt/conda/envs/rapids/lib/gdalplugins \
GSETTINGS_SCHEMA_DIR=/opt/conda/envs/rapids/share/glib-2.0/schemas \
GSETTINGS_SCHEMA_DIR_CONDA_BACKUP= \
PATH=/opt/conda/envs/rapids/bin:/opt/conda/condabin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
PROJ_DATA=/opt/conda/envs/rapids/share/proj \
PROJ_NETWORK=ON \
XML_CATALOG_FILES="file:///opt/conda/envs/rapids/etc/xml/catalog file:///etc/xml/catalog"
Member Author

All of this is to make it feel like `conda activate rapids` was run even though it wasn't, in situations where you can't rely on the entrypoint script being run (described by @jacobtomlinson in #712 (comment)).

It's a hack and could hopefully be reverted completely in RAPIDS 24.12 through some combination of the following:

If reviewers agree with this approach, I'll write up a separate issue to track that work of reverting all of this.


NOTE: I'm intentionally not doing this in the raft-ann-bench images... those are expected to be used with explicit entrypoints, as far as I can tell, and I've modified all those entrypoints with the appropriate conda activation commands.
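
The before/after comparison described in the Dockerfile comment can be sketched in shell roughly as follows. This is a minimal illustration and not the script used for this PR: the `env_delta` helper and the sample snapshot contents are hypothetical, and it uses `comm` on sorted snapshots instead of `diff -u` so that only new or changed `NAME=value` lines are printed.

```shell
#!/bin/sh
set -eu

# env_delta OLD NEW: print NAME=value lines that are new or changed in NEW.
# Hypothetical helper; assumes one variable per line (no multi-line values).
env_delta() {
    old_sorted="$(mktemp)"
    new_sorted="$(mktemp)"
    sort "$1" > "$old_sorted"
    sort "$2" > "$new_sorted"
    # comm -13 keeps only the lines that are unique to the second file.
    comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Stand-in snapshots. In the real image these would come from `env > old.txt`
# before activation and `env > new.txt` after `conda activate rapids`.
workdir="$(mktemp -d)"
printf '%s\n' 'HOME=/home/rapids' 'PATH=/opt/conda/bin:/usr/bin' \
    > "${workdir}/old.txt"
printf '%s\n' 'HOME=/home/rapids' 'CONDA_DEFAULT_ENV=rapids' \
    'PATH=/opt/conda/envs/rapids/bin:/opt/conda/bin:/usr/bin' \
    > "${workdir}/new.txt"

env_delta "${workdir}/old.txt" "${workdir}/new.txt"
```

Each printed line is a candidate for the `ENV` layer. Variables that activation removes would not show up in this output and would need a separate check.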


COPY entrypoint.sh /home/rapids/entrypoint.sh

ENTRYPOINT ["/home/rapids/entrypoint.sh"]
@@ -90,20 +128,20 @@ COPY --from=dependencies --chown=rapids /test_notebooks_dependencies.yaml test_n
COPY --from=dependencies --chown=rapids /notebooks /home/rapids/notebooks

RUN <<EOF
mamba env update -n base -f test_notebooks_dependencies.yaml
mamba env update -n rapids -f test_notebooks_dependencies.yaml
conda clean -afy
EOF

RUN <<EOF
mamba install -y -n base \
mamba install -y -n rapids \
"jupyterlab=4" \
dask-labextension \
jupyterlab-nvdashboard
conda clean -afy
EOF

# Disable the JupyterLab announcements
RUN /opt/conda/bin/jupyter labextension disable "@jupyterlab/apputils-extension:announcements"
RUN /opt/conda/envs/rapids/bin/jupyter labextension disable "@jupyterlab/apputils-extension:announcements"

ENV DASK_LABEXTENSION__FACTORY__MODULE="dask_cuda"
ENV DASK_LABEXTENSION__FACTORY__CLASS="LocalCUDACluster"
@@ -140,7 +178,7 @@ LABEL com.nvidia.workbench.package-manager.apt.binary="/usr/bin/apt"
LABEL com.nvidia.workbench.package-manager.apt.installed-packages=""
LABEL com.nvidia.workbench.package-manager.conda3.binary="/opt/conda/bin/conda"
LABEL com.nvidia.workbench.package-manager.conda3.installed-packages="rapids cudf cuml cugraph rmm pylibraft cuspatial cuxfilter cucim xgboost jupyterlab"
LABEL com.nvidia.workbench.package-manager.pip.binary="/opt/conda/bin/pip"
LABEL com.nvidia.workbench.package-manager.pip.binary="/opt/conda/envs/rapids/bin/pip"
LABEL com.nvidia.workbench.package-manager.pip.installed-packages="jupyterlab-nvdashboard"
LABEL com.nvidia.workbench.programming-languages="python3"
LABEL com.nvidia.workbench.schema-version="v2"
6 changes: 4 additions & 2 deletions context/entrypoint.sh
@@ -10,19 +10,21 @@ EOF

if [ -e "/home/rapids/environment.yml" ]; then
echo "environment.yml found. Installing packages."
timeout ${CONDA_TIMEOUT:-600} mamba env update -n base -f /home/rapids/environment.yml || exit $?
timeout ${CONDA_TIMEOUT:-600} mamba env update -n rapids -f /home/rapids/environment.yml || exit $?
fi

if [ "$EXTRA_CONDA_PACKAGES" ]; then
echo "EXTRA_CONDA_PACKAGES environment variable found. Installing packages."
timeout ${CONDA_TIMEOUT:-600} mamba install -n base -y $EXTRA_CONDA_PACKAGES || exit $?
timeout ${CONDA_TIMEOUT:-600} mamba install -n rapids -y $EXTRA_CONDA_PACKAGES || exit $?
fi

if [ "$EXTRA_PIP_PACKAGES" ]; then
echo "EXTRA_PIP_PACKAGES environment variable found. Installing packages.".
timeout ${PIP_TIMEOUT:-600} pip install $EXTRA_PIP_PACKAGES || exit $?
fi

. /opt/conda/etc/profile.d/conda.sh; conda activate rapids

# Run whatever the user wants.
if [ "${UNQUOTE}" = "true" ]; then
exec $@
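
The install steps in `entrypoint.sh` pair a default-valued timeout with `|| exit $?`. The sketch below shows how that pattern behaves, under a few assumptions: `run_with_timeout` is a hypothetical wrapper that is not part of the PR, `true` and `sleep` stand in for the `mamba` commands, and the `timeout` binary is assumed to be the GNU coreutils one.

```shell
#!/bin/sh
# Hypothetical wrapper mirroring the entrypoint pattern:
#   timeout ${CONDA_TIMEOUT:-600} <command> || exit $?
run_with_timeout() {
    # ${CONDA_TIMEOUT:-600} expands to 600 when CONDA_TIMEOUT is unset or empty.
    timeout "${CONDA_TIMEOUT:-600}" "$@"
}

# A fast command finishes inside the default limit and returns its own status.
unset CONDA_TIMEOUT
fast_status=0
run_with_timeout true || fast_status=$?

# With a 1 second limit, a 5 second sleep is terminated by timeout,
# so the wrapper returns a non-zero status (124 with GNU coreutils).
CONDA_TIMEOUT=1
slow_status=0
run_with_timeout sleep 5 || slow_status=$?

echo "fast command status: ${fast_status}"
echo "slow command status: ${slow_status}"
```

In the entrypoint, `|| exit $?` passes that non-zero status on as the script's exit code, so a failed or timed-out install stops the container instead of continuing to the user's command.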
4 changes: 3 additions & 1 deletion context/raft-ann-bench/get_datasets.sh
@@ -3,7 +3,9 @@

set -eo pipefail

export CONDA_PREFIX=/opt/conda
. /opt/conda/etc/profile.d/conda.sh; conda activate rapids

export CONDA_PREFIX=/opt/conda/envs/rapids

python -m raft_ann_bench.get_dataset --dataset deep-image-96-angular --normalize --dataset-path /home/rapids/preloaded_datasets
python -m raft_ann_bench.get_dataset --dataset fashion-mnist-784-euclidean --dataset-path /home/rapids/preloaded_datasets
4 changes: 3 additions & 1 deletion context/raft-ann-bench/run_benchmark.sh
@@ -14,7 +14,9 @@ function hasArg {
(( ${NUMARGS} != 0 )) && (echo " ${ARGS} " | grep -q " $1 ")
}

export CONDA_PREFIX=/opt/conda
. /opt/conda/etc/profile.d/conda.sh; conda activate rapids

export CONDA_PREFIX=/opt/conda/envs/rapids
export DATASET_ARG=$1
export GET_DATASET_ARGS=$2
export RUN_ARGS=$3
@@ -14,7 +14,9 @@ function hasArg {
(( ${NUMARGS} != 0 )) && (echo " ${ARGS} " | grep -q " $1 ")
}

export CONDA_PREFIX=/opt/conda
. /opt/conda/etc/profile.d/conda.sh; conda activate rapids

export CONDA_PREFIX=/opt/conda/envs/rapids
export DATASET_ARG=$1
export GET_DATASET_ARGS=$2
export RUN_ARGS=$3
2 changes: 1 addition & 1 deletion dockerhub-readme.md
@@ -115,7 +115,7 @@ The `rapidsai/notebooks` container has notebooks for the RAPIDS libraries in `/h

### Extending RAPIDS Images

All RAPIDS images use `conda` as their package manager, and all RAPIDS packages are available in the `base` conda environment. These image run as the `rapids` user.
All RAPIDS images use `conda` as their package manager, and all RAPIDS packages are available in the `rapids` conda environment. These image run as the `rapids` user.

### Access Documentation within Notebooks

2 changes: 1 addition & 1 deletion raft-ann-bench/README.md
@@ -84,7 +84,7 @@ docker run --gpus all --rm -it \
This will drop you into a command line in the container, with RAFT and the `raft_ann_benchmarks` python package ready to use:

```
(base) root@00b068fbb862:/home/rapids#
(rapids) root@00b068fbb862:/home/rapids#
```

Additionally, the containers could be run in dettached form without any issue.
7 changes: 2 additions & 5 deletions raft-ann-bench/cpu/Dockerfile
@@ -17,7 +17,7 @@ SHELL ["/bin/bash", "-euo", "pipefail", "-c"]
RUN <<EOF
mkdir /data
chmod 777 /data
echo ". /opt/conda/etc/profile.d/conda.sh; conda activate base" >> /etc/bash.bashrc
echo ". /opt/conda/etc/profile.d/conda.sh; conda activate rapids" >> /etc/bash.bashrc
EOF

# we need perl temporarily for the remaining benchmark perl scripts
@@ -27,10 +27,7 @@ RUN apt-get install perl -y
# an older conda with newer packages still works well
# ref: https://github.com/rapidsai/ci-imgs/issues/185
RUN <<EOF
mamba update --all -y -n base
mamba install -y -n base "python=${PYTHON_VER}"
mamba update --all -y -n base
mamba install -y -n base \
mamba create -y -n rapids \
"raft-ann-bench-cpu=${RAPIDS_VER}.*" \
"python=${PYTHON_VER}"
conda clean -afy
5 changes: 2 additions & 3 deletions raft-ann-bench/gpu/Dockerfile
@@ -20,15 +20,14 @@ SHELL ["/bin/bash", "-euo", "pipefail", "-c"]
RUN <<EOF
mkdir /data
chmod 777 /data
echo ". /opt/conda/etc/profile.d/conda.sh; conda activate base" >> /etc/bash.bashrc
echo ". /opt/conda/etc/profile.d/conda.sh; conda activate rapids" >> /etc/bash.bashrc
EOF

# we need perl temporarily for the remaining benchmark perl scripts
RUN apt-get install perl -y

RUN <<EOF
mamba update --all -y -n base
mamba install -y -n base \
mamba create -y -n rapids \
"raft-ann-bench=${RAPIDS_VER}.*" \
"cuda-version=${CUDA_VER%.*}.*"
conda clean -afy
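
The `"cuda-version=${CUDA_VER%.*}.*"` spec used in the Dockerfiles above relies on shell suffix removal to drop the last dotted component of the version before appending a wildcard. A small sketch of that expansion follows. The version values are examples chosen for illustration, not the build arguments actually used by these images.

```shell
#!/bin/sh
# Example values only; the real ones are passed in as build arguments.
CUDA_VER="12.5.1"
RAPIDS_VER="24.10"

# "%.*" removes the shortest suffix matching ".*", i.e. the last dotted component.
cuda_major_minor="${CUDA_VER%.*}"

# Package specs in the same shape as the ones passed to `mamba create`.
cuda_spec="cuda-version=${cuda_major_minor}.*"
rapids_spec="rapids=${RAPIDS_VER}.*"

echo "${cuda_spec}"    # prints: cuda-version=12.5.*
echo "${rapids_spec}"  # prints: rapids=24.10.*
```

Pinning to major.minor with a trailing wildcard lets the solver pick any patch release of that CUDA version.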