-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docker, BlobInfoCache: try to reuse blobs from set of all known compressed blobs when pushing across registries #1645
docker, BlobInfoCache: try to reuse blobs from set of all known compressed blobs when pushing across registries #1645
Conversation
53183a7
to
1dbb4a3
Compare
@mtrmac Following patch works for me and unlike #1461 current patch works without breaking/cleaning existing written However since CandidateLocations2 now returns all blobs with lower priority so i think some tests needs to be retrofitted, but I'll do that once you give a quick look to the patch and verify if its on the right track. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I think we really need to have an explicit concept of “
BICLocationReference
not set” (and not create those entries that for the v1 callers). Then we can avoid the O(BICLocationReference) loop and adding all the O(BICLocationReference) individual entries.- A downside of doing just that, which I didn’t originally realize, is that we would not have any timestamp for the location-less entries.
- We might just live with that, there are not likely to be that many digests of the same blob floating around. I think that’s good enough for now, with an explicit comment.
- Alternatively, it would help to replace the existing
knownLocationsBucket
organized transport→scope→digest→(location, timestamp), with a new version organized transport→digest→scope→(location, timestamp); that way we would only need to iterate through the scopes (=registries) that contain the digest, not all scopes, scaling hopefully much better.
- A downside of doing just that, which I didn’t originally realize, is that we would not have any timestamp for the location-less entries.
- It seems to me this could basically be a small addition to the
appendReplacementCandidates
function, adding one moreBICLocationReference
-unknown entry, and then teaching…Prioritize…
to handle that.
Right now my cache doesn’t have a single |
67235a1
to
0b5a3be
Compare
@mtrmac Did not completely understood this comment. Is this comment more like a future note or some change which we are expecting in this PR but since it looks like a different scope so i'm expecting that this is a note for future. |
This is my argument why we might not need to collect the timestamps to sort the location-less candidates accurately, at least in this version. |
0580587
to
600a8b9
Compare
All changes addressed except modifying heuristic check which i'll push soon. |
600a8b9
to
0fcbf47
Compare
822867a
to
26a55a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The BoltDB comments mostly apply to memory equally.
78bdd13
to
e2a1ceb
Compare
6c2de11
to
7c41d67
Compare
f2fb916
to
ca111f7
Compare
Yeah, I think it would be best not to worry about the new without- |
ca111f7
to
92cdb4f
Compare
@mtrmac Done, addressed comments as well as some old comments. |
92cdb4f
to
d99cfa9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks reasonable overall.
I think the one major missing part is expanding the tests in pkg/blobinfocache/internal/test
, so that the new code in storage backends code is tested. The logic is not quite trivial, so I’d prefer to have a good test coverage.
4d3fe76
to
a7cd83c
Compare
@mtrmac PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Hopefully the last round…
…oss registries It seems we try to reuse blobs only for the specified registry, however we can have valid known compressed digests across registry as well following pr attempts to use that by doing following steps. * `CandidateLocations2` now processes all known blobs and appends them to returned candidates at the lowest priority. As a result when `TryReusingBlob` tries to process these candidates and if the blobs filtered by the `Opaque` set by the `transport` fail to match then attempt is made against all known blobs (ones which do not belong to the current registry). * Increase the sample set of potential blob reuse to all known compressed digests , also involving the one which do not belong to current registry. * If a blob is found match it against the registry where we are attempting to push. If blob is already there consider it a `CACHE HIT!` and reply skipping blob, since its already there. How to verify this ? * Remove all images `buildah rmi --all` // needed so all new blobs can be tagged again in common bucket * Remove any previous `blob-info-cache` by ```console rm /home/<user>/.local/share/containers/cache/blob-info-cache-v1.boltdb ``` ```console $ skopeo copy docker://registry.fedoraproject.org/fedora-minimal docker://quay.io/fl/test:some-tag $ buildah pull registry.fedoraproject.org/fedora-minimal $ buildah tag registry.fedoraproject.org/fedora-minimal quay.io/fl/test $ buildah push quay.io/fl/test ``` ```console Getting image source signatures Copying blob a3497ca15bbf skipped: already exists Copying config f7e02de757 done Writing manifest to image destination Storing signatures ``` Signed-off-by: Aditya R <[email protected]>
a7cd83c
to
539f9ee
Compare
@mtrmac PTAL again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Following check is not needed and can be removed. Followup fix from containers#1645 (comment) Signed-off-by: Aditya R <[email protected]>
It seems we try to reuse blobs only for the specified registry, however we can have valid known compressed digests across registry as well following pr attempts to use that by doing following steps. * `CandidateLocations2` now processes all known blobs and appends them to returned candidates at the lowest priority. As a result when `TryReusingBlob` tries to process these candidates and if the blobs filtered by the `Opaque` set by the `transport` fail to match then attempt is made against all known blobs (ones which do not belong to the current registry). * Increase the sample set of potential blob reuse to all known compressed digests , also involving the one which do not belong to current registry. * If a blob is found match it against the registry where we are attempting to push. If blob is already there consider it a `CACHE HIT!` and reply skipping blob, since its already there. ---- ```console $ skopeo copy docker://registry.fedoraproject.org/fedora-minimal docker://quay.io/fl/test:some-tag $ buildah pull registry.fedoraproject.org/fedora-minimal $ buildah tag registry.fedoraproject.org/fedora-minimal quay.io/fl/test $ buildah push quay.io/fl/test ``` ```console Getting image source signatures Copying blob a3497ca15bbf skipped: already exists Copying config f7e02de757 done Writing manifest to image destination Storing signatures ``` Testing: containers/image#1645 Signed-off-by: Aditya R <[email protected]>
It seems we try to reuse blobs only for the specified registry, however we can have valid known compressed digests across registry as well following pr attempts to use that by doing following steps. * `CandidateLocations2` now processes all known blobs and appends them to returned candidates at the lowest priority. As a result when `TryReusingBlob` tries to process these candidates and if the blobs filtered by the `Opaque` set by the `transport` fail to match then attempt is made against all known blobs (ones which do not belong to the current registry). * Increase the sample set of potential blob reuse to all known compressed digests , also involving the one which do not belong to current registry. * If a blob is found match it against the registry where we are attempting to push. If blob is already there consider it a `CACHE HIT!` and reply skipping blob, since its already there. ---- ```console $ skopeo copy docker://registry.fedoraproject.org/fedora-minimal docker://quay.io/fl/test:some-tag $ buildah pull registry.fedoraproject.org/fedora-minimal $ buildah tag registry.fedoraproject.org/fedora-minimal quay.io/fl/test $ buildah push quay.io/fl/test ``` ```console Getting image source signatures Copying blob a3497ca15bbf skipped: already exists Copying config f7e02de757 done Writing manifest to image destination Storing signatures ``` Testing: containers/image#1645 Signed-off-by: Aditya R <[email protected]>
It seems we try to reuse blobs only for the specified registry, however we can have valid known compressed digests across registry as well following pr attempts to use that by doing following steps. * `CandidateLocations2` now processes all known blobs and appends them to returned candidates at the lowest priority. As a result when `TryReusingBlob` tries to process these candidates and if the blobs filtered by the `Opaque` set by the `transport` fail to match then attempt is made against all known blobs (ones which do not belong to the current registry). * Increase the sample set of potential blob reuse to all known compressed digests , also involving the one which do not belong to current registry. * If a blob is found match it against the registry where we are attempting to push. If blob is already there consider it a `CACHE HIT!` and reply skipping blob, since its already there. ---- ```console $ skopeo copy docker://registry.fedoraproject.org/fedora-minimal docker://quay.io/fl/test:some-tag $ buildah pull registry.fedoraproject.org/fedora-minimal $ buildah tag registry.fedoraproject.org/fedora-minimal quay.io/fl/test $ buildah push quay.io/fl/test ``` ```console Getting image source signatures Copying blob a3497ca15bbf skipped: already exists Copying config f7e02de757 done Writing manifest to image destination Storing signatures ``` Testing: containers/image#1645 Signed-off-by: Aditya R <[email protected]>
It seems we try to reuse blobs only for the specified registry, however we can have valid known compressed digests across registry as well following pr attempts to use that by doing following steps. * `CandidateLocations2` now processes all known blobs and appends them to returned candidates at the lowest priority. As a result when `TryReusingBlob` tries to process these candidates and if the blobs filtered by the `Opaque` set by the `transport` fail to match then attempt is made against all known blobs (ones which do not belong to the current registry). * Increase the sample set of potential blob reuse to all known compressed digests , also involving the one which do not belong to current registry. * If a blob is found match it against the registry where we are attempting to push. If blob is already there consider it a `CACHE HIT!` and reply skipping blob, since its already there. ---- ```console $ skopeo copy docker://registry.fedoraproject.org/fedora-minimal docker://quay.io/fl/test:some-tag $ buildah pull registry.fedoraproject.org/fedora-minimal $ buildah tag registry.fedoraproject.org/fedora-minimal quay.io/fl/test $ buildah push quay.io/fl/test ``` ```console Getting image source signatures Copying blob a3497ca15bbf skipped: already exists Copying config f7e02de757 done Writing manifest to image destination Storing signatures ``` Testing: containers/image#1645 Signed-off-by: Aditya R <[email protected]>
It seems we try to reuse blobs only for the specified registry, however we can have valid known compressed digests across registry as well following pr attempts to use that by doing following steps. * `CandidateLocations2` now processes all known blobs and appends them to returned candidates at the lowest priority. As a result when `TryReusingBlob` tries to process these candidates and if the blobs filtered by the `Opaque` set by the `transport` fail to match then attempt is made against all known blobs (ones which do not belong to the current registry). * Increase the sample set of potential blob reuse to all known compressed digests , also involving the one which do not belong to current registry. * If a blob is found match it against the registry where we are attempting to push. If blob is already there consider it a `CACHE HIT!` and reply skipping blob, since its already there. ---- ```console $ skopeo copy docker://registry.fedoraproject.org/fedora-minimal docker://quay.io/fl/test:some-tag $ buildah pull registry.fedoraproject.org/fedora-minimal $ buildah tag registry.fedoraproject.org/fedora-minimal quay.io/fl/test $ buildah push quay.io/fl/test ``` ```console Getting image source signatures Copying blob a3497ca15bbf skipped: already exists Copying config f7e02de757 done Writing manifest to image destination Storing signatures ``` Testing: containers/image#1645 Signed-off-by: Aditya R <[email protected]>
It seems we try to reuse blobs only for the specified registry, however we can have valid known compressed digests across registry as well following pr attempts to use that by doing following steps. * `CandidateLocations2` now processes all known blobs and appends them to returned candidates at the lowest priority. As a result when `TryReusingBlob` tries to process these candidates and if the blobs filtered by the `Opaque` set by the `transport` fail to match then attempt is made against all known blobs (ones which do not belong to the current registry). * Increase the sample set of potential blob reuse to all known compressed digests , also involving the one which do not belong to current registry. * If a blob is found match it against the registry where we are attempting to push. If blob is already there consider it a `CACHE HIT!` and reply skipping blob, since its already there. ---- ```console $ skopeo copy docker://registry.fedoraproject.org/fedora-minimal docker://quay.io/fl/test:some-tag $ buildah pull registry.fedoraproject.org/fedora-minimal $ buildah tag registry.fedoraproject.org/fedora-minimal quay.io/fl/test $ buildah push quay.io/fl/test ``` ```console Getting image source signatures Copying blob a3497ca15bbf skipped: already exists Copying config f7e02de757 done Writing manifest to image destination Storing signatures ``` Testing: containers/image#1645 Signed-off-by: Aditya R <[email protected]>
It seems we try to reuse blobs only for the specified registry, however
we can have valid known compressed digests across registry as well
following pr attempts to use that by doing following steps.
CandidateLocations2
now processes all known blobs and appends themto returned candidates at the lowest priority. As a result when
TryReusingBlob
tries to process these candidates and if the blobsfiltered by the
Opaque
set by thetransport
fail to match thenattempt is made against all known blobs (ones which do not belong to the
current registry).
Increase the sample set of potential blob reuse to all known
compressed digests , also involving the one which do not belong to
current registry.
If a blob is found match it against the registry where we are
attempting to push. If blob is already there consider it a
CACHE HIT!
and reply skipping blob, since its already there.How to verify this ?
Alternative to: #1461