Add self hosted runner to get GPU testing on CSD3 #139
Conversation
Is there something wrong here? How is `architecture` defined? I'd expect something like:

```julia
using Test, CUDA, OceanBioME

architectures = CUDA.has_cuda() ? tuple(GPU()) : tuple(CPU())

for arch in architectures
    test_this_and_that(arch)
end
```

Then we only need a single
Yeah, agreed, this PR is definitely not the best way to do these things. Given the problems with getting GPU jobs completed, I've had another temporary idea for GPU testing and am running a Google Colab notebook to test in. I might close this PR as I've also made the useful changes in #138.
If you want to run tests like Oceananigans does, you could invest a bit of money in a local server with a GPU and run all your tests there via Buildkite.
That's the solution we're going for!
GitHub recently changed how they do self-hosted runners, so I can run one on CSD3. This is a very hacky way of getting the tests to submit a Slurm script, wait for it to finish, and then check whether it worked.
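For reference, a minimal sketch of what that "submit a Slurm job and wait" wrapper could look like, not the actual implementation in this PR. The script name `test_gpu.slurm`, the polling interval, and the success check via `sacct` are all illustrative assumptions:

```julia
using Test

# Hypothetical helper: submit a Slurm batch script, poll until the job
# leaves the queue, then report whether it finished successfully.
function submit_and_wait(script = "test_gpu.slurm"; poll = 60)
    # `sbatch --parsable` prints only the job id
    jobid = strip(read(`sbatch --parsable $script`, String))
    @info "Submitted Slurm job $jobid, waiting for it to finish"

    while true
        # squeue prints nothing for this job once it has left the queue
        state = strip(read(ignorestatus(`squeue -h -j $jobid -o %T`), String))
        isempty(state) && break
        sleep(poll)
    end

    # sacct reports the final state (COMPLETED, FAILED, TIMEOUT, ...)
    final = strip(read(`sacct -n -X -j $jobid -o State`, String))
    return final == "COMPLETED"
end

@test submit_and_wait()
```

The obvious downside, as noted above, is that the test run blocks for as long as the GPU job sits in the queue.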
I'm going to have a look to see if it is cleaner, and not too much harder, to set up Buildkite like Oceananigans uses.

I've realised that the Oceananigans Buildkite agents are individual computers not using Slurm, so that will have to be possible before we can use Buildkite.

We're also going to have the problem that the GPU nodes on CSD3 are always busy at the moment, so it takes a long time for the jobs to start.
To do: