Refactor oci-copy to be more efficient
Originally, this task would download all artifacts requested in the
input file, check them all, and then upload them all to the registry in
one invocation of "oras push".

This had two problems. First, if "oras push" failed partway through
and the user needed to retry their pipeline, the entire download
step had to be run again needlessly. Second, for extremely large
artifacts made up of many medium-sized files, an enormous PVC was
needed to hold all of them between download and push to the registry.

The change here addresses both problems.

First, files are downloaded, checked, pushed to the registry and then
deleted from local storage - one at a time. This obviates the need for a
large volume to store all files at once, since only enough storage is
needed to store one file, not all of them.
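
The one-at-a-time flow can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the task's actual script: "fetch_one", "verify_one" and "push_one" are hypothetical stand-ins for the curl, sha256sum and oras steps, stubbed locally so the loop runs without network or registry access, and "copy_one_at_a_time" is a name invented here.

```shell
#!/usr/bin/env bash
# Sketch only: the three helpers below are hypothetical stubs, not real tools.
set -eu

fetch_one()  { printf 'contents of %s\n' "$1" > "$2"; }   # stands in for curl
verify_one() { [ -s "$1" ]; }                             # stands in for sha256sum --check
push_one()   { PUSHED_COUNT=$((PUSHED_COUNT + 1)); }      # stands in for oras blob push

PUSHED_COUNT=0
PEAK_FILES=0

copy_one_at_a_time() {
  local workdir="$1"; shift
  local name path count
  for name in "$@"; do
    path="$workdir/$name"
    fetch_one "$name" "$path"
    verify_one "$path"
    # Record how many files sit on disk at the busiest moment.
    count=$(( $(find "$workdir" -type f | wc -l) ))
    if [ "$count" -gt "$PEAK_FILES" ]; then PEAK_FILES=$count; fi
    push_one "$path"
    rm -- "$path"   # free the space before the next file is fetched
  done
}

WORKDIR=$(mktemp -d)
copy_one_at_a_time "$WORKDIR" a.bin b.bin c.bin
LEFTOVER=$(( $(find "$WORKDIR" -type f | wc -l) ))
echo "pushed=$PUSHED_COUNT peak=$PEAK_FILES leftover=$LEFTOVER"
rmdir "$WORKDIR"
```

With these stubs the peak number of files on disk stays at one no matter how many names are passed in, which is the property that removes the need for a large volume.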

Second, before each file is downloaded, the registry is checked to see
whether its blob has already been pushed there. If it has, the download
step is skipped. This greatly improves the runtime for artifacts where
only one or two of many files have changed since the last taskrun.
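
The check-before-download idea might look like the sketch below. "blob_exists" wraps the same "oras blob fetch --descriptor" call that appears in the diff and treats its exit status as the answer; "copy_if_missing" and "download_and_push" are hypothetical names used only for illustration, and "auth.json" is assumed to exist already.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative, not taken from the task.
set -eu

# Succeeds when repository $1 already holds the blob sha256:$2.
blob_exists() {
  oras blob fetch --registry-config auth.json --descriptor "$1@sha256:$2" >/dev/null 2>&1
}

# Runs the expensive path only when the blob is missing from the registry.
copy_if_missing() {
  if blob_exists "$1" "$2"; then
    echo "skip $2"
  else
    # Hypothetical helper: curl, sha256sum --check, oras blob push, rm.
    download_and_push "$1" "$2"
    echo "copied $2"
  fi
}
```

A caller would define "download_and_push" and then invoke "copy_if_missing" once per file with the repository and the expected digest. Because the check runs inside an "if", a missing blob does not trip "set -e".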
ralphbean committed Jul 6, 2024
1 parent 12f31d3 commit 0f493b5
Showing 1 changed file with 58 additions and 19 deletions.
77 changes: 58 additions & 19 deletions task/oci-copy/0.1/oci-copy.yaml
@@ -99,36 +99,75 @@ spec:
 set -u
+echo "Selecting auth for $IMAGE"
+select-oci-auth $IMAGE > auth.json
+echo "Extracting artifact_type"
+ARTIFACT_TYPE=$(cat "$(pwd)/source/$OCI_COPY_FILE" | yq '.artifact_type')
+REPO=$(echo ${IMAGE} | awk -F ':' '{print $1}')
+echo "Found that ${REPO} is the repository for ${IMAGE}"
+cat >artifact-manifest.json <<EOL
+{
+"schemaVersion": 2,
+"mediaType": "application/vnd.oci.image.manifest.v1+json",
+"artifactType": "${ARTIFACT_TYPE}",
+"config": {
+"mediaType": "application/vnd.oci.empty.v1+json",
+"digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
+"size": 2,
+"data": "e30="
+},
+"layers": [],
+"annotations": {
+"org.opencontainers.image.created": "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
+}
+}
+EOL
 for varfile in /var/workdir/vars/*; do
 echo "Reading $varfile"
 source $varfile
-echo "Downloading $OCI_SOURCE to $OCI_FILENAME"
-curl "${CURL_ARGS[@]}" --fail --silent --show-error --location $OCI_SOURCE -o $OCI_FILENAME
+echo "Checking to see if blob $OCI_ARTIFACT_DIGEST exists"
+if [[ $(oras blob fetch --registry-config auth.json --descriptor "${REPO}@sha256:${OCI_ARTIFACT_DIGEST}") ]]; then
+echo "Blob for ${OCI_FILENAME} already exists in the registry at ${REPO}@sha256:${OCI_ARTIFACT_DIGEST}. Skipping download."
+else
+echo "Blob for ${OCI_FILENAME} does not yet exist in the registry at ${REPO}@sha256:${OCI_ARTIFACT_DIGEST}."
+echo "Downloading $OCI_SOURCE to $OCI_FILENAME"
+curl "${CURL_ARGS[@]}" --fail --silent --show-error --location $OCI_SOURCE -o $OCI_FILENAME
-echo "Confirming that digest of $OCI_FILENAME matches expected $OCI_ARTIFACT_DIGEST"
-echo "$OCI_ARTIFACT_DIGEST $OCI_FILENAME" | sha256sum --check
+echo "Confirming that digest of $OCI_FILENAME matches expected $OCI_ARTIFACT_DIGEST"
+echo "$OCI_ARTIFACT_DIGEST $OCI_FILENAME" | sha256sum --check
-echo "Appending to arguments for $OCI_FILENAME of type $OCI_ARTIFACT_TYPE"
-args+=("${OCI_FILENAME}:${OCI_ARTIFACT_TYPE}")
-done
+echo "Pushing blob of $OCI_FILENAME of type $OCI_ARTIFACT_TYPE"
+oras blob push --registry-config auth.json quay.io/redhat-user-workloads/rhel-ai-tenant/models/mixtral-8x7b-instruct-v0-1 --media-type ${OCI_ARTIFACT_TYPE} ${OCI_FILENAME}
-if [ -z "${args}" ]; then
-echo "No files found. Something is very wrong. Skipping upload."
-exit 1;
-fi
+echo "Removing local copy of $OCI_FILENAME to save space."
+rm ${OCI_FILENAME}
+fi
-echo "Extracting artifact_type"
-ARTIFACT_TYPE=$(cat "$(pwd)/source/$OCI_COPY_FILE" | yq '.artifact_type')
+echo "Grabbing descriptor of blob from the registry"
+oras blob fetch --registry-config auth.json --descriptor "${REPO}@sha256:${OCI_ARTIFACT_DIGEST}" > descriptor.json
-echo "Selecting auth for $IMAGE"
-select-oci-auth $IMAGE > auth.json
+echo "Setting mediaType to ${OCI_ARTIFACT_TYPE}"
+yq -oj -i '.mediaType = "'${OCI_ARTIFACT_TYPE}'"' descriptor.json
+echo "Inserting org.opencontainers.image.title = ${OCI_FILENAME} annotation"
+yq -oj -i '.annotations."org.opencontainers.image.title" = "'${OCI_FILENAME}'"' descriptor.json
+echo "Appending blob descriptor for ${OCI_FILENAME} to the overall artifact manifest for ${IMAGE}"
+yq -oj -i ".layers += $(cat descriptor.json)" artifact-manifest.json
+echo "Done with ${OCI_FILENAME}."
+done
-echo "Pushing contents to ${IMAGE}"
-oras push --no-tty --registry-config auth.json --artifact-type ${ARTIFACT_TYPE} "${IMAGE}" "${args[@]}"
+echo "Pushing complete artifact manifest to ${IMAGE}"
+oras manifest push --no-tty --registry-config auth.json "${IMAGE}" artifact-manifest.json
-IMAGE_INDEX_DIGEST=$(oras resolve --registry-config auth.json "${IMAGE}")
-echo -n "$IMAGE_INDEX_DIGEST" | tee "$(results.IMAGE_DIGEST.path)"
+RESULTING_DIGEST=$(oras resolve --registry-config auth.json "${IMAGE}")
+echo -n "$RESULTING_DIGEST" | tee "$(results.IMAGE_DIGEST.path)"
 echo -n "$IMAGE" | tee "$(results.IMAGE_URL.path)"
 volumeMounts:
 - mountPath: /var/lib/containers
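
The "config" entry in the manifest template above uses fixed values. The sketch below shows where "digest", "size" and "data" come from, assuming the config blob is the two-byte JSON document {} (the content that the "application/vnd.oci.empty.v1+json" media type denotes in the OCI image spec). The variable names are chosen here for illustration and do not appear in the task.

```shell
#!/usr/bin/env bash
# Sketch only: recomputes the empty-config descriptor fields from '{}'.
set -eu

EMPTY_CONFIG='{}'

# Digest of the exact bytes, with no trailing newline.
CONFIG_DIGEST="sha256:$(printf '%s' "$EMPTY_CONFIG" | sha256sum | awk '{print $1}')"
# Size in bytes; the arithmetic expansion strips any padding from wc.
CONFIG_SIZE=$(( $(printf '%s' "$EMPTY_CONFIG" | wc -c) ))
# Base64 of the same bytes, used for the inline "data" field.
CONFIG_DATA=$(printf '%s' "$EMPTY_CONFIG" | base64)

echo "digest=$CONFIG_DIGEST"
echo "size=$CONFIG_SIZE"
echo "data=$CONFIG_DATA"
```

Because these values depend only on the literal bytes {}, they can be hard-coded in the manifest rather than computed on every run.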