Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache docker when using Github actions #2196

Merged
merged 9 commits into from
Sep 20, 2023
Merged

Cache docker when using Github actions #2196

merged 9 commits into from
Sep 20, 2023

Conversation

causten
Copy link
Collaborator

@causten causten commented Sep 16, 2023

This PR is to reduce the load on CI servers by building docker images once and then uploading to the dockerhub registry. Future PR to handle Jenkins.

@codecov
Copy link

codecov bot commented Sep 16, 2023

Codecov Report

Merging #2196 (88147b0) into develop (0bb8508) will not change coverage.
The diff coverage is n/a.

❗ Current head 88147b0 differs from pull request most recent head 1f6ac2b. Consider uploading reports for the commit 1f6ac2b to get more accurate results

@@           Coverage Diff            @@
##           develop    #2196   +/-   ##
========================================
  Coverage    91.49%   91.49%           
========================================
  Files          427      427           
  Lines        15953    15953           
========================================
  Hits         14596    14596           
  Misses        1357     1357           

@migraphx-bot
Copy link
Collaborator

migraphx-bot commented Sep 16, 2023

Test Batch Rate new
1f6ac2
Rate old
340012
Diff Compare
torchvision-resnet50 64 2,287.20 2,282.79 0.19%
torchvision-resnet50_fp16 64 5,350.95 5,367.78 -0.31%
torchvision-densenet121 32 1,822.81 1,835.07 -0.67%
torchvision-densenet121_fp16 32 3,393.31 3,391.29 0.06%
torchvision-inceptionv3 32 1,344.13 1,336.41 0.58%
torchvision-inceptionv3_fp16 32 2,597.91 2,588.08 0.38%
cadene-inceptionv4 16 677.34 678.73 -0.20%
cadene-resnext64x4 16 591.47 589.72 0.30%
slim-mobilenet 64 7,223.13 7,213.88 0.13%
slim-nasnetalarge 64 237.00 237.05 -0.02%
slim-resnet50v2 64 2,530.33 2,528.69 0.06%
bert-mrpc-onnx 8 721.79 721.09 0.10%
bert-mrpc-tf 1 390.55 390.79 -0.06%
pytorch-examples-wlang-gru 1 301.01 310.08 -2.93%
pytorch-examples-wlang-lstm 1 309.47 310.72 -0.40%
torchvision-resnet50_1 1 557.95 556.07 0.34%
torchvision-inceptionv3_1 1 305.80 307.56 -0.57%
cadene-dpn92_1 1 345.98 352.44 -1.83%
cadene-resnext101_1 1 221.06 220.50 0.25%
slim-vgg16_1 1 224.67 224.86 -0.08%
slim-mobilenet_1 1 1,470.99 1,483.04 -0.81%
slim-inceptionv4_1 1 219.87 221.72 -0.84%
onnx-taau-downsample 1 322.88 322.55 0.10%
dlrm-criteoterabyte 1 21.69 21.68 0.03%
dlrm-criteoterabyte_fp16 1 40.66 40.62 0.10%
agentmodel 1 5,946.36 5,836.42 1.88%
unet_fp16 2 55.16 55.08 0.15%

This build is OK for merge ✅

@migraphx-bot
Copy link
Collaborator

migraphx-bot commented Sep 16, 2023


    :white_check_mark:bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

    :white_check_mark:bert-mrpc-tf: PASSED: MIGraphX meets tolerance

    :white_check_mark:pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

    :white_check_mark:pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

    :white_check_mark:torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

🔴torchvision-inceptionv3_1: FAILED: MIGraphX is not within tolerance - check verbose output


🔴cadene-dpn92_1: FAILED: MIGraphX is not within tolerance - check verbose output


    :white_check_mark:cadene-resnext101_1: PASSED: MIGraphX meets tolerance

    :white_check_mark:slim-vgg16_1: PASSED: MIGraphX meets tolerance

    :white_check_mark:slim-mobilenet_1: PASSED: MIGraphX meets tolerance

🔴slim-inceptionv4_1: FAILED: MIGraphX is not within tolerance - check verbose output


    :white_check_mark:dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

    :white_check_mark:agentmodel: PASSED: MIGraphX meets tolerance

    :white_check_mark:unet: PASSED: MIGraphX meets tolerance

@causten
Copy link
Collaborator Author

causten commented Sep 16, 2023

Internal server built the docker image in 43 minutes, The free 2 core server available from Github took 2 hours

@causten
Copy link
Collaborator Author

causten commented Sep 16, 2023

With no changes to any of the docker affecting files and Tidy/Format/cppcheck already cached, time on the larger runners dropped from 170 to 6.35 minutes

@causten causten requested a review from pfultz2 September 17, 2023 02:08
@causten causten marked this pull request as ready for review September 17, 2023 02:12
@causten causten changed the title Initial code for docker caching Cache docker when using Github actions Sep 17, 2023
@TedThemistokleous TedThemistokleous added Continous Integration Pull request updates parts of continous integration pipeline high priority A PR with high priority for review and merging. labels Sep 18, 2023
.github/workflows/ci.yaml Outdated Show resolved Hide resolved
.github/workflows/ci.yaml Show resolved Hide resolved
.github/workflows/ci.yaml Show resolved Hide resolved
.github/workflows/ci.yaml Outdated Show resolved Hide resolved
.github/workflows/ci.yaml Outdated Show resolved Hide resolved
DOCKERIMAGE: ${{ needs.check_image.outputs.imagetag }}
run: |
echo $DOCKER_TOKEN | docker login -u $DOCKER_USER --password-stdin
docker build . --file hip-clang.docker --tag $DOCKERIMAGE;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use --cache-from so we can reuse caches from previous builds, instead of rebuilding all layers again.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a code change to do that. I'd like to first get this in "as is" and than can create a PR that doubles down on the caching.

run: |
echo $DOCKER_TOKEN | docker login -u $DOCKER_USER --password-stdin
docker build . --file hip-clang.docker --tag $DOCKERIMAGE;
docker push $DOCKERIMAGE;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also push rocm/migraphx-private:latest and rocm/migraphx-private:${BRANCH_NAME} so we can use these images in --cache-from.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup will do exactly that in next PR

- name: Create Image Tag
id: image_hash
run: |
echo "imagetag=rocm/migraphx-private:hip-clang-${{hashFiles('**/hip-clang.docker', '**/*requirements.txt', '**/install_prereqs.sh', '**/rbuild.ini')}}" >> $GITHUB_OUTPUT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should have one variable for image name and then another for the tag.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do that as part of the --cache-from update in another PR

id: image_hash
run: |
echo "imagetag=rocm/migraphx-private:hip-clang-${{hashFiles('**/hip-clang.docker', '**/*requirements.txt', '**/install_prereqs.sh', '**/rbuild.ini')}}" >> $GITHUB_OUTPUT
echo "imagetag_sles=rocm/migraphx-private:hip-clang-sles-${{hashFiles('**/tools/docker/sles.docker', '**/*requirements.txt', '**/install_prereqs.sh', '**/rbuild.ini')}}" >> $GITHUB_OUTPUT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use a different image name such as rocm/migraphx-sles-private.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, I created another repo rocm/migraphx-sles-private

@causten causten merged commit 482e8d6 into develop Sep 20, 2023
15 checks passed
@causten causten deleted the efficient_docker branch September 20, 2023 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Continous Integration Pull request updates parts of continous integration pipeline high priority A PR with high priority for review and merging.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants