Dockerfile to run Tensorflow on YARN need two part:
Base libraries which Tensorflow depends on
-
OS base image, for example
ubuntu:18.04
-
Tensorflow depended libraries and packages. For example
python
,scipy
. For GPU support, needcuda
,cudnn
, etc. -
Tensorflow package.
Libraries to access HDFS
-
JDK
-
Hadoop
Here's an example of a base image (w/o GPU support) to install Tensorflow:
FROM ubuntu:18.04
# Pick up some TF dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
curl \
libfreetype6-dev \
libpng-dev \
libzmq3-dev \
pkg-config \
python \
python-dev \
rsync \
software-properties-common \
unzip \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
RUN export DEBIAN_FRONTEND=noninteractive && apt-get update && apt-get install -yq krb5-user libpam-krb5 && apt-get clean
RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
python get-pip.py && \
rm get-pip.py
RUN pip --no-cache-dir install \
Pillow \
h5py \
ipykernel \
jupyter \
matplotlib \
numpy \
pandas \
scipy \
sklearn \
&& \
python -m ipykernel.kernelspec
RUN pip --no-cache-dir install \
http://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.13.1-cp27-none-linux_x86_64.whl
On top of above image, add files, install packages to access HDFS
RUN apt-get update && apt-get install -y openjdk-8-jdk wget
# Install hadoop
ENV HADOOP_VERSION="2.9.2"
RUN wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz
RUN tar zxf hadoop-${HADOOP_VERSION}.tar.gz
RUN ln -s hadoop-${HADOOP_VERSION} hadoop-current
RUN rm hadoop-${HADOOP_VERSION}.tar.gz
Build and push to your own docker registry: Use docker build ...
and docker push ...
to finish this step.
We provided following examples for you to build tensorflow docker images.
For Tensorflow 1.13.1 (Precompiled to CUDA 10.x)
- docker/tensorflow/base/ubuntu-18.04/Dockerfile.cpu.tf_1.13.1: Tensorflow 1.13.1 supports CPU only.
- docker/tensorflow/with-cifar10-models/ubuntu-18.04/Dockerfile.cpu.tf_1.13.1: Tensorflow 1.13.1 supports CPU only, and included models
- docker/tensorflow/base/ubuntu-18.04/Dockerfile.gpu.tf_1.13.1: Tensorflow 1.13.1 supports GPU, which is prebuilt to CUDA10.
- docker/tensorflow/with-cifar10-models/ubuntu-18.04/Dockerfile.gpu.tf_1.13.1: Tensorflow 1.13.1 supports GPU, which is prebuilt to CUDA10, with models.
Under docker/
directory, run build-all.sh
to build Docker images. It will build following images:
tf-1.13.1-gpu-base:0.0.1
for base Docker image which includes Hadoop, Tensorflow, GPU base libraries.tf-1.13.1-gpu-base:0.0.1
for base Docker image which includes Hadoop. Tensorflow.tf-1.13.1-gpu:0.0.1
which includes cifar10 modeltf-1.13.1-cpu:0.0.1
which inclues cifar10 model (cpu only).
(No liability) You can also use prebuilt images for convenience:
- hadoopsubmarine/tf-1.13.1-gpu:0.0.1
- hadoopsubmarine/tf-1.13.1-cpu:0.0.1