Gotopus is a minimalistic tool that runs arbitrary commands concurrently. You define your commands with their dependencies and Gotopus will take care of the rest, running them concurrently when possible.
- Concurrently run steps, speeding up running time
- Local or remote configs
- Easy to install
- Circular dependency detection
- Clean step definition with YAML
- Builtin and user environment variables
curl -sf https://gobinaries.com/lherman-cs/gotopus | sh
Usage: gotopus <url or filepath> ...
-max_workers uint
limits the number of workers that can run concurrently (default 0 or limitless)
# examples/basic.yaml
jobs:
  job1:
    steps:
      - run: sleep 1 && echo "job1"
  job2:
    needs:
      - job1
    steps:
      - run: echo "job2"
  job3:
    steps:
      - run: echo "job3"
To use basic.yaml above, run the following command:
gotopus basic.yaml
Or you can simply give a URL to this file:
gotopus https://raw.githubusercontent.com/lherman-cs/gotopus/master/examples/basic.yaml
By default, if you don't set max_workers to a number greater than 0, gotopus lazily creates a pool of workers with no upper limit: a new worker is spawned only when a job is ready and no existing worker is idle. For the example above, 2 workers are allocated instead of 3. The process looks like the following:
job1 gets scheduled
spawn worker #0
worker #0 executes job1
job3 gets scheduled
spawn worker #1
worker #1 executes job3
job3 finishes
job1 finishes
either worker #0 or #1 executes job2
job2 finishes
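The trace above can be reproduced with a small simulation. This is only a sketch of the idea, not Gotopus's actual implementation: the `simulate` function, the job durations, and the rule for picking an idle worker are all assumptions made for illustration, and the sketch assumes the dependency graph has no cycles.

```python
import heapq


def simulate(jobs, durations):
    """Simulate an unbounded, lazily grown worker pool.

    Returns (number_of_workers_spawned, [(job_name, worker_id), ...]).
    """
    # Dependencies each job is still waiting on.
    remaining = {name: set(spec.get("needs", [])) for name, spec in jobs.items()}
    idle = []      # ids of workers that are free to be reused
    running = []   # min-heap of (finish_time, job_name, worker_id)
    spawned = 0
    log = []
    now = 0.0

    def start_ready_jobs():
        nonlocal spawned
        for name in list(remaining):
            if remaining[name]:
                continue              # still blocked by a dependency
            del remaining[name]
            if idle:
                worker = idle.pop()   # reuse an idle worker
            else:
                worker = spawned      # otherwise spawn a new one lazily
                spawned += 1
            heapq.heappush(running, (now + durations[name], name, worker))
            log.append((name, worker))

    start_ready_jobs()
    while running:
        now, finished, worker = heapq.heappop(running)
        idle.append(worker)
        for deps in remaining.values():
            deps.discard(finished)
        start_ready_jobs()
    return spawned, log


# Mirrors examples/basic.yaml; the durations are assumed, not measured.
jobs = {"job1": {}, "job2": {"needs": ["job1"]}, "job3": {}}
durations = {"job1": 1.0, "job2": 0.1, "job3": 0.1}

workers, log = simulate(jobs, durations)
print(workers)  # prints 2
```

Because job2 cannot start until job1 finishes, at most two jobs are ever ready at the same time, so a third worker is never needed.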
Whenever a step runs, 3 kinds of environment variables are set. They are listed below in priority order, with user environment variables having the highest priority. When a conflict happens, the higher-priority variable is chosen:
- User: environment variables defined by the user in YAML for each step.
- Builtin: environment variables that come from gotopus, prefixed with GOTOPUS_:
  - GOTOPUS_JOB_ID
  - GOTOPUS_JOB_NAME
  - GOTOPUS_STEP_NAME
  - GOTOPUS_WORKER_ID
- System: all environment variables inherited from the system when you run gotopus.
The following example shows how to define and use environment variables:
# examples/env.yaml
jobs:
  job:
    steps:
      - name: Install dependencies
        run: echo "$GOTOPUS_STEP_NAME"
      - run: echo "$name"
        env:
          name: Lukas Herman
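The priority order described above behaves like a layered merge in which higher-priority layers overwrite lower ones. The sketch below illustrates that idea only; `build_step_env` and all sample values are hypothetical and are not taken from Gotopus's source.

```python
def build_step_env(system_env, builtin_env, user_env):
    """Merge the three layers; later updates win, so user > builtin > system."""
    env = dict(system_env)     # lowest priority: inherited from the system
    env.update(builtin_env)    # GOTOPUS_* variables override system ones
    env.update(user_env)       # the step's `env` block overrides everything
    return env


# Hypothetical values, for illustration only.
system_env = {"PATH": "/usr/bin", "name": "from-system"}
builtin_env = {"GOTOPUS_JOB_NAME": "job"}
user_env = {"name": "Lukas Herman"}

env = build_step_env(system_env, builtin_env, user_env)
print(env["name"])  # prints "Lukas Herman"
```

Here `name` is defined both by the system and by the step, and the step's value wins because the user layer is applied last.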
Let's imagine that there are 2 commands that we want to execute:
- First command: sleep 2 && echo "job 1"
- Second command: sleep 3 && echo "job 2"
Normally, this would take 5 seconds to finish, since you need to run them sequentially. But if you run them concurrently, it takes ~3 seconds, even when you only have 1 CPU core! This is possible because they run concurrently, NOT in parallel: while one command is sleeping, the other can make progress. For more information, there's this awesome video from Rob Pike that specifically talks about this, "Concurrency Is Not Parallelism".
time gotopus https://raw.githubusercontent.com/lherman-cs/gotopus/master/examples/concurrency.yaml
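The same effect can be shown outside Gotopus with a minimal sketch. It uses threads that only sleep, so the total time is close to the longest sleep rather than the sum. The `run_concurrently` helper is hypothetical, and the durations are scaled down from 2 and 3 seconds to keep the example quick.

```python
import threading
import time


def run_concurrently(durations):
    """Sleep for each duration in its own thread; return wall-clock seconds."""
    threads = [threading.Thread(target=time.sleep, args=(d,)) for d in durations]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()   # wait until every sleep has finished
    return time.perf_counter() - start


# Scaled-down stand-ins for "sleep 2" and "sleep 3".
elapsed = run_concurrently([0.2, 0.3])
print(f"finished in {elapsed:.2f}s")  # close to 0.3s, not 0.5s
```

A sleeping thread does not need the CPU, which is why the waits overlap even on a single core.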
The config format intentionally resembles GitHub Actions: I think the GitHub Actions format is pretty clean, and it is also one of my inspirations for creating Gotopus. Although Gotopus and GitHub Actions seem to have similar functionality, you can definitely use them together! For example, you can run the steps of a job concurrently with Gotopus (GitHub Actions does not currently support running steps concurrently: https://github.community/t5/GitHub-Actions/Steps-in-parallel/td-p/32635).