Caws is a Python library/executor/service for running workflows across multiple sites in a carbon- and energy-aware fashion. The ultimate goal is to make the environmental impact of computing jobs transparent to the user, and to provide the incentive and automation to reduce that footprint.
This repo can be used with the latest version of globus-compute-sdk, and it can schedule to any existing endpoint. However, to enable energy monitoring, the endpoints need to be deployed using the forked version of globus-compute-endpoint as well as the forked version of parsl. In the configuration of the endpoints, you will need to enable monitoring, enable energy monitoring, and point the monitoring database to a location accessible from your personal computer (i.e., wherever you are running the scheduler from).
First, you have to set up a Globus Compute endpoint with the correct forks of all the repositories. The general Globus Compute documentation is here, but below are modified instructions that show how to configure an endpoint for use with CAWS.
On a system where you want to run a compute endpoint, use the following commands to install globus-compute-endpoint and create an endpoint:
git clone [email protected]:AK2000/funcX.git
cd funcX
git checkout power_monitoring_new
cd compute_endpoint
pip install . # Install globus compute endpoint
globus-compute-endpoint configure <ENDPOINT_NAME>
Then you have to replace the config.yaml file in ~/.globus_compute/<ENDPOINT_NAME>/config.yaml with an appropriate config.py file to correctly configure the endpoint. In the configuration, be sure to use the GlobusComputeEngine and specify an energy_monitor. This is a system-specific class that tells the endpoint how it can read the total energy being used by a node.
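For illustration, the executor portion of such a config.py might look roughly like the sketch below. The import paths and the RaplNodeEnergyMonitor name are assumptions for illustration only; the actual energy monitor class available (and whether it is passed as a name or an instance) depends on your system and on the forked globus-compute-endpoint/parsl you installed.

from globus_compute_endpoint.endpoint.config import Config
from globus_compute_endpoint.engines import GlobusComputeEngine
from parsl.providers import LocalProvider

config = Config(
    executors=[
        GlobusComputeEngine(
            provider=LocalProvider(),
            # Placeholder name: substitute the system-specific energy monitor
            # provided by the forked parsl/globus-compute-endpoint.
            energy_monitor="RaplNodeEnergyMonitor",
        )
    ],
)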
Monitoring must also be enabled in the endpoint configuration, using the monitoring infrastructure derived from parsl:
from parsl.monitoring import MonitoringHub
...
config = Config(
    ...,
    monitoring_hub=MonitoringHub(
        hub_address="localhost",
        hub_port=55055,
        monitoring_debug=True,
        resource_monitoring_interval=1,
        logging_endpoint="postgresql://<user>:<password>@<address>/monitoring",
    ),
)
The logging_endpoint is a relational database that stores resource and task monitoring information and serves as a data source for the CAWS client. Currently, you must host this database yourself.
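Any database reachable from both the endpoint and the scheduling machine should work. As a quick sanity check before starting the endpoint, a small sketch like the following (using SQLAlchemy directly, not part of CAWS) can confirm that the URI you plan to use is reachable from your machine:

from sqlalchemy import create_engine, text

# Same URI that is passed as logging_endpoint in the endpoint config.
engine = create_engine("postgresql://<user>:<password>@<address>/monitoring")
with engine.connect() as conn:
    # A successful SELECT confirms the database is reachable from this machine.
    print(conn.execute(text("SELECT 1")).scalar())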
A sample configuration with both monitoring enabled and the correct executor configuration is in docs/sample_config.json.
After the Globus Compute Endpoint is properly configured, start the endpoint:
globus-compute-endpoint start <ENDPOINT_NAME>
Next, clone the CAWS repository. Make sure to also download the SeBS data submodule:
git clone [email protected]:AK2000/caws.git
cd caws
git submodule update --init --recursive
Finally, to install this repo with its dependencies and the experiments, run:
$ pip install .
Be sure to also set the ENDPOINT_MONITOR_DEFAULT environment variable to the monitoring database URI. Alternatively, you can pass it as an argument to every executor that you create.
export ENDPOINT_MONITOR_DEFAULT=<DATABASE_URI>
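As a rough sketch of that alternative, the database URI could be handed to the executor directly. The class and argument names below (Endpoint, CawsExecutor, monitoring_url) are illustrative assumptions rather than the confirmed CAWS API; check the package itself for the exact interface.

import caws

# Hypothetical names for illustration only; consult the caws package for the real API.
endpoint = caws.Endpoint(
    "my-endpoint",
    "<COMPUTE_ID>",                   # Globus Compute endpoint id
    monitoring_url="<DATABASE_URI>",  # would override ENDPOINT_MONITOR_DEFAULT
)

with caws.CawsExecutor([endpoint]) as executor:
    future = executor.submit(print, "hello from caws")
    future.result()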
To run the test suite use:
$ pytest --endpoint_id <COMPUTE_ID>
The experiments require some additional packages to be installed on the endpoint. To install those packages, run the following on the endpoint:
export ENV_DIR=<PATH_TO_ENV>
wget https://raw.githubusercontent.com/AK2000/caws/master/scripts/requirements.txt
conda install -p ${ENV_DIR} --yes --file requirements.txt
wget -q https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz
tar -xf ffmpeg-release-amd64-static.tar.xz
rm *.tar.xz
mv ffmpeg-* ffmpeg
rm ffmpeg/ffprobe
# make the binary executable
chmod 755 ffmpeg/ffmpeg
# move the binary onto the environment path
mv ffmpeg/ffmpeg ${ENV_DIR}/bin/