Environment Setup#
GraphStorm can be installed as a pip package. However, configuring a GraphStorm environment across different operating systems is non-trivial, so GraphStorm also provides a Docker-based runtime environment for easy deployment.
1. Setup GraphStorm Docker Environment#
Prerequisites#
Docker: Install Docker in your environment by following the Docker documentation, and also install the Nvidia Container Toolkit.
For example, in an AWS EC2 instance without Docker preinstalled, you can run the following commands to install Docker.
sudo apt-get update
sudo apt update
sudo apt install docker.io
If you use the GPU version of the AWS Deep Learning AMI, the Nvidia Container Toolkit is already preinstalled.
GPU: The current version of GraphStorm requires at least one Nvidia GPU installed in the instance.
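The prerequisites above can be checked with a short script before building anything. This is a minimal sketch, not part of GraphStorm: check_cmd is a hypothetical helper, and it only tests whether the docker and nvidia-smi commands are on PATH. It does not verify that the Nvidia Container Toolkit is installed.

```shell
# Report whether a command is available on PATH.
# Prints "<name>: found" or "<name>: missing"; returns 0 only when found.
# check_cmd is a hypothetical helper, not a GraphStorm tool.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# docker is the container runtime; nvidia-smi ships with the Nvidia driver.
for tool in docker nvidia-smi; do
    check_cmd "$tool" || true
done
```

A "missing" line means the corresponding prerequisite still needs to be installed.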
Build a GraphStorm Docker image from source code#
Please use the following command to build a Docker image from source:
git clone https://github.com/awslabs/graphstorm.git
cd /path-to-graphstorm/docker/
bash /path-to-graphstorm/docker/build_docker_oss4local.sh /path-to-graphstorm/ docker-name docker-tag
The build_docker_oss4local.sh script takes three arguments:
path-to-graphstorm (required) is the absolute path of the “graphstorm” folder where you cloned the GraphStorm source code. For example, the path could be /code/graphstorm.
docker-name (optional) is the name assigned to the Docker image to be built. The default is graphstorm.
docker-tag (optional) is the tag assigned to the Docker image to be built. The default is local.
Use the command below to check whether the new Docker image was created successfully.
docker image ls
If the build succeeds, there should be a new Docker image named <docker-name>:<docker-tag>, e.g., graphstorm:local.
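The same check can be scripted. The sketch below assumes the default image name and tag described above. image_ref is a hypothetical helper introduced only for illustration, and docker image inspect is used because it exits with a non-zero status when the image does not exist.

```shell
# Build the image reference from an optional name and tag, falling back to
# the defaults documented for build_docker_oss4local.sh (graphstorm, local).
# image_ref is a hypothetical helper, not part of GraphStorm.
image_ref() {
    echo "${1:-graphstorm}:${2:-local}"
}

ref="$(image_ref)"
if ! command -v docker >/dev/null 2>&1; then
    echo "docker is not installed; cannot check $ref"
elif docker image inspect "$ref" >/dev/null 2>&1; then
    echo "found image $ref"
else
    echo "image $ref not found, or the Docker daemon is not reachable"
fi
```

If you passed a custom name or tag to the build script, pass the same values to the helper, for example image_ref my-name my-tag.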
Create a GraphStorm Container#
First, you need to create a GraphStorm container based on the Docker image built in the previous step.
Run the following command:
nvidia-docker run --network=host -v /dev/shm:/dev/shm/ -d --name test graphstorm:local
This command creates a GraphStorm container named test and runs it as a daemon.
Then connect to the container by running the following command:
docker container exec -it test /bin/bash
If the command succeeds, the command prompt changes to the container’s prompt, for example:
root@ip-address:/#
2. Setup GraphStorm with pip Packages#
Prerequisites#
Linux OS: The current version of GraphStorm supports Linux as the operating system. We tested GraphStorm on both Ubuntu (22.04 or later) and Amazon Linux 2.
GPU: The current version of GraphStorm requires at least one Nvidia GPU installed in the instance.
Python3: The current version of GraphStorm requires Python 3 with a version later than 3.7.
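The Python requirement can be checked before installing. This is a minimal sketch that reads "later than 3.7" as strictly later, which is an assumption about the intended meaning. version_gt is a hypothetical helper that compares two major.minor version strings.

```shell
# Succeeds when "major.minor" version $1 is strictly later than $2.
# version_gt is a hypothetical helper, not part of GraphStorm.
version_gt() {
    major_a=${1%%.*}; minor_a=${1#*.}
    major_b=${2%%.*}; minor_b=${2#*.}
    [ "$major_a" -gt "$major_b" ] ||
        { [ "$major_a" -eq "$major_b" ] && [ "$minor_a" -gt "$minor_b" ]; }
}

if command -v python3 >/dev/null 2>&1; then
    # Ask the interpreter for its own major.minor version.
    found="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    if version_gt "$found" 3.7; then
        echo "Python $found is later than 3.7"
    else
        echo "Python $found is too old; a version later than 3.7 is required"
    fi
else
    echo "python3 was not found on PATH"
fi
```

The comparison is numeric per component, so 3.10 is correctly treated as later than 3.7.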
Install GraphStorm#
Users can use pip or pip3 to install GraphStorm.
pip install graphstorm
Install Dependencies#
GraphStorm requires a set of dependencies, which can be installed with the following pip or pip3 commands.
pip install boto3==1.26.126
pip install botocore==1.29.126
pip install h5py==3.8.0
pip install scipy
pip install tqdm==4.65.0
pip install pyarrow==12.0.0
pip install transformers==4.28.1
pip install pandas
pip install scikit-learn
pip install ogb==1.3.6
pip install psutil==5.9.5
pip install torch==1.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install dgl==1.0.3+cu117 -f https://data.dgl.ai/wheels/cu117/repo.html
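After the installation, a quick import test can confirm that the core packages are usable. The sketch below is an illustration only. check_module is a hypothetical helper, and the list of modules (torch, dgl, graphstorm) is an assumption about which imports are most useful to test, not an official verification step.

```shell
# Try to import a Python module and report the result.
# Prints "<module>: ok" or "<module>: missing"; returns 0 only on success.
# check_module is a hypothetical helper, not part of GraphStorm.
check_module() {
    if python3 -c "import $1" >/dev/null 2>&1; then
        echo "$1: ok"
    else
        echo "$1: missing"
        return 1
    fi
}

# Assumed set of core modules to test; extend the list as needed.
for module in torch dgl graphstorm; do
    check_module "$module" || true
done
```

A "missing" line can also mean that the module is installed but fails to import, so rerun the import by hand to see the actual error.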
Configure SSH No-password login#
Use the following commands to configure a local SSH no-password login that GraphStorm relies on.
ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ''
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Then use this command to test if the SSH no-password login works.
ssh 127.0.0.1
If everything is set up correctly, the above command opens a new Linux shell session. Exit this new shell with the exit command.
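The login test can also be run without any interactive prompt, which is useful in setup scripts. This is a minimal sketch: check_passwordless_ssh is a hypothetical helper, and it assumes the host key of 127.0.0.1 has already been accepted (for example by the manual ssh 127.0.0.1 test), because BatchMode=yes makes ssh fail rather than ask for confirmation or a password.

```shell
# Attempt a key-only login that never prompts for input.
# $1 is the host (default 127.0.0.1), $2 is the port (default 22).
# check_passwordless_ssh is a hypothetical helper, not part of GraphStorm.
check_passwordless_ssh() {
    host="${1:-127.0.0.1}"
    port="${2:-22}"
    # Run the no-op command "true" remotely; give up after 5 seconds.
    ssh -o BatchMode=yes -o ConnectTimeout=5 -p "$port" "$host" true
}

if command -v ssh >/dev/null 2>&1 && check_passwordless_ssh; then
    echo "passwordless SSH to 127.0.0.1 works"
else
    echo "passwordless SSH to 127.0.0.1 failed"
fi
```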
Clone GraphStorm Toolkits (Optional)#
GraphStorm provides a set of toolkits, including scripts, tools, and examples, which can facilitate the use of GraphStorm.
graphstorm/training_scripts/ and graphstorm/inference_scripts/ include example YAML configuration files that are used in the GraphStorm documentation and tutorials.
graphstorm/examples includes Python code for customized models and customized data preparation.
graphstorm/tools includes graph partition and related Python code.
graphstorm/sagemaker includes commands and code to run GraphStorm on Amazon SageMaker.
Users can clone GraphStorm source code to obtain these toolkits.
git clone https://github.com/awslabs/graphstorm.git
Warning
If you use this method to set up the GraphStorm environment, change the value of the --ssh-port argument in the launch commands in GraphStorm’s tutorials from 2222 to 22.