Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API support for managing and tracking artifact manifests in image indexes #1833

Merged
merged 8 commits into from
Feb 5, 2024

Conversation

nalind
Copy link
Member

@nalind nalind commented Jan 30, 2024

This consists of multiple patches:

  • pkg/manifests.list.preferOCI(): pay attention to annotations in indexes
    Account for the presence of annotations in the OCI version of an index or list when deciding if the structure should be saved in OCI format. This requires removing an annotation from the test data so that the presence of "features" information for one of its instances will be the deciding factor.
    If we don't take those into account, setting annotations in the index doesn't force the index to be saved in OCI format, so even though none of the indicators that would force use of the Docker format are present, we effectively throw away those annotations by using Docker format anyway. I tried to skip a check on this field to avoid having to change the test data in pkg/manifests.list.preferOCI() learn about some new OCI fields #1816, but this problem forced the issue.
  • pkg/manifests.List: add SetSubject()/Subject()
    Add methods for reading and writing the index-level "subject" field. I'm not 100% sure we don't want to take an instanceDigest in these, in case a future version of the spec adds them as fields of the Descriptor type used for the items in the image index.
  • pkg/manifests.List: use pointers in SetArtifactType()/ArtifactType()
    ... so that nil can be used to mark that we want to change the field in the image index as a whole rather than on one of the manifests that we're tracking.
  • libimage/manifests.list.Add(): don't read platform info for artifacts
    If an instance being added to a list has a non-empty ArtifactType value or a config.MediaType value that isn't one of the known kinds of image configuration blobs, don't try to parse the configuration blob to figure out the image's OS/Architecture/Variant information.
  • libimage/manifests: manage artifact information, too
    Track information about artifacts for which we're managing the artifact manifests.
  • libimage/manifests.list.SaveToImage(): use ImageOptions
    When saving information about a list, use an ImageOptions structure to create the whole thing at once, and only save individual data piece by piece if we need to. Not strictly necessary, can be done later if it makes the review easier.
  • libimage/manifests.list: hold artifact manifests
    Add an AddArtifact() method which will craft an artifact manifest which references one or more files and then add that manifest to the index.
    When we need to build a reference to a list that includes our artifact manifests, save the manifests and symlinks to their "layer" blobs in a per-image directory, and add references to those locations to the list of images we can search for manifests and blobs.
  • libimage.ManifestList.Inspect(): show artifact types and file lists

@nalind nalind force-pushed the fun-with-artifacts branch 2 times, most recently from e9a45be to bb77529 Compare January 30, 2024 16:04
@rhatdan
Copy link
Member

rhatdan commented Jan 30, 2024

if err != nil {
return nil, fmt.Errorf("locating per-image directory for %s: %w", img.ID, err)
}
tmp, err := os.MkdirTemp(imgDirectory, "referenced-artifacts")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Magic string, should this be a constant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, maybe? Sure, changing it.

@nalind nalind force-pushed the fun-with-artifacts branch from bb77529 to 16bfd55 Compare January 30, 2024 21:56
@rhatdan
Copy link
Member

rhatdan commented Jan 30, 2024

LGTM
/approve
Great job.
@giuseppe @baude @mheon @umohnani8 @ashley-cui PTAL

Copy link
Contributor

openshift-ci bot commented Jan 30, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nalind, rhatdan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me overall.

(This is a not very careful review, notably particular I didn’t read the tests, and I’m not very familiar with the manifests subpackage.)

pkg/manifests/manifests.go Outdated Show resolved Hide resolved
libimage/manifests/manifests.go Show resolved Hide resolved
libimage/manifests/manifests.go Outdated Show resolved Hide resolved
libimage/manifests/manifests.go Outdated Show resolved Hide resolved
}

// Unless we were told what this is, use the default that ORAS uses.
artifactType := "application/vnd.unknown.artifact.v1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this were entirely new, I’d be fairly strongly against making a habit of not caring about MIME types.

Given pre-existing practice, oh well… I still don’t like it but I must admit there might be a good case for it.

libimage/manifests/manifests.go Outdated Show resolved Hide resolved
libimage/manifest_list.go Outdated Show resolved Hide resolved
// which refers to the named file. The name will be passed to filepath.Abs()
// before searching for an instance which references it.
func (l *list) InstanceByFile(file string) (digest.Digest, error) {
if parsedDigest, err := digest.Parse(file); err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it forbidden to use files named like that? (I am generally worried about strings that have different semantics purely based on contents, and) If so, it should be forbidden when adding the file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not explicitly, but if we misinterpret a filename as a digest, there's always the old "prepend a ./" thing people do when a file name starts with "-" option. I could easily be talked into adding a Stat() call, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure why digest-looking file names are rejected at all; is that because some heuristic string parsing anticipated to happen at the CLI level?

Either way, the immediate consequence I am worried about is that AddArtifact(…., []string{"sha256:aaa…"}), is AFAICS, currently accepted without protest, but a follow-up InstanceByFile("sha256:aaa…") refuses to look for the file. That seems surprising.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My expectation is that remembering (or copying and pasting) digests for repeated calls to buildah manifest is tiresome. I think that being able to describe which descriptor in the image index we want to modify by naming the thing that it refers might be easier.
That doesn't work as well if another index's contents were added using buildah manifest add --all, since that doesn't create a reference to the other index so much as it copies that index's descriptor list into the index being modified, and we don't get information about what those descriptors refer to that we could stash in the instances list to use this way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that being able to describe which descriptor in the image index we want to modify by naming the thing that it refers might be easier.

That makes sense to me.

I don’t immediately see how that relates to this code refusing to look for digest-named files. Is that because the CLI would try to parse the same CLI argument both ways? Even if that were the case, the way I think about it, either this is a CLI-only matter (the CLI can have that two-way heuristic, and the CLI can refuse to create digest-named files), or it is a mater explicitly supported by this lower layer, and then not only should InstanceByFile refuse to search, but AddArtifact should refuse to accept digest-named files.

Or do you mean that buildah manifest add --all would create digest-named files, so they ~need to be allowed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the related changes to the CLIs would hand the strings args they currently get to this method instead of only accepting digest values directly.

The manifest add --all bit was a digression, mainly about how this doesn't provide any help for that case, since we don't have an image name or file name to match an incoming value against, so those instances would still have to be specified by digest.

I'm not completely sold on rejecting file names that look like digests outright, since there's no ambiguity when we're first adding them, but it does seem like it would be a rare case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have an image name or file name to match an incoming value against

That .title annotation mentioned elsewhere in this PR might work.


If the ./… approach works for searching, I can live with the code as is… Aesthetically I’m not a fan, but, *shrug*

{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"artifactType": "application/vnd.example+type",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which artifact type is supported? If I run a simple sample like this:

	storeOpts, err := cstorage.DefaultStoreOptions()
	if err != nil {
		panic(err)
	}
	store, err := cstorage.GetStore(storeOpts)
	if err != nil {
		panic(err)
	}
	runtime, err := libimage.RuntimeFromStore(store, &libimage.RuntimeOptions{})
	if err != nil {
		panic(err)
	}
	_, err = runtime.Pull(context.Background(), "quay.io/saschagrunert/seccomp:v1", commonconfig.PullPolicyAlways, &libimage.PullOptions{})
	if err != nil {
		panic(err)
	}

Then I get the panic:

panic: parsing image configuration: unsupported image-specific operation on artifact with type "application/vnd.unknown.artifact.v1"

The same for quay.io/saschagrunert/seccomp:v2:

panic: parsing image configuration: unsupported image-specific operation on artifact with type "application/vnd.example+type"

Using commit 1ad13e9 and github.com/containers/image/v5 v5.29.3-0.20240131175401-a63f4a542670

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR doesn't do anything for pulling artifacts into local storage. Its main concern is adding the ability to craft artifact manifests for local files and add them to manifest lists, and broadening support for preserving information like artifactType fields.

@nalind nalind force-pushed the fun-with-artifacts branch 5 times, most recently from e12a405 to c778520 Compare February 2, 2024 18:50
// DeepCopyDescriptor copies a Descriptor, deeply copying its contents
func DeepCopyDescriptor(original *v1.Descriptor) *v1.Descriptor {
tmp := *original
if original.URLs != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(slices.Clone(nil) == nil, so this check is not strictly necessary. Also elsewhere.

OTOH that behavior is intentional per a comment in the implementation, but not documented.)

internal/deepcopy.go Show resolved Hide resolved
libimage/manifests/manifests.go Show resolved Hide resolved
libimage/manifests/manifests.go Outdated Show resolved Hide resolved
libimage/manifests/manifests.go Outdated Show resolved Hide resolved
// which refers to the named file. The name will be passed to filepath.Abs()
// before searching for an instance which references it.
func (l *list) InstanceByFile(file string) (digest.Digest, error) {
if parsedDigest, err := digest.Parse(file); err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that being able to describe which descriptor in the image index we want to modify by naming the thing that it refers might be easier.

That makes sense to me.

I don’t immediately see how that relates to this code refusing to look for digest-named files. Is that because the CLI would try to parse the same CLI argument both ways? Even if that were the case, the way I think about it, either this is a CLI-only matter (the CLI can have that two-way heuristic, and the CLI can refuse to create digest-named files), or it is a mater explicitly supported by this lower layer, and then not only should InstanceByFile refuse to search, but AddArtifact should refuse to accept digest-named files.

Or do you mean that buildah manifest add --all would create digest-named files, so they ~need to be allowed?

@nalind nalind force-pushed the fun-with-artifacts branch from c778520 to 33726ee Compare February 2, 2024 21:21
Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (still not a very careful review, and I didn’t read the tests)

}
}

// Only set an ArtifactType for the layer if one was specified.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code described by this comment is now gone.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted the comment.

Account for the presence of annotations in the OCI version of an index
or list when deciding if the structure should be saved in OCI format.
This requires removing an annotation from the test data so that the
presence of "features" information for one of its instances will be the
deciding factor.

If we don't take those into account, setting annotations in the index
doesn't force the index to be saved in OCI format, so even though none
of the indicators that would force use of the Docker format are present,
we effectively throw away those annotations by using Docker format
anyway.

Signed-off-by: Nalin Dahyabhai <[email protected]>
Add methods for reading and writing the index-level "subject" field.

Signed-off-by: Nalin Dahyabhai <[email protected]>
... so that `nil` can be used to mark that we want to change the field
in the image index as a whole rather than on one of the manifests that
we're tracking.

This is an API change.

Signed-off-by: Nalin Dahyabhai <[email protected]>
If an instance being added to a list has a non-empty ArtifactType value
or a config.MediaType value that isn't one of the known kinds of image
configuration blobs, don't try to parse the configuration blob to figure
out the image's OS/Architecture/Variant information.

Signed-off-by: Nalin Dahyabhai <[email protected]>
Track information about artifacts for which we're managing the artifact
manifests.

Signed-off-by: Nalin Dahyabhai <[email protected]>
When saving information about a list, use an ImageOptions structure to
create the whole thing at once, and only save individual data piece by
piece if we need to.

Signed-off-by: Nalin Dahyabhai <[email protected]>
Add an AddArtifact() method which will craft an artifact manifest which
references one or more files and then add that manifest to the index.

When we need to build a reference to a list that includes our artifact
manifests, save the manifests and symlinks to their "layer" blobs along
with the contents of inlined blobs in a per-image directory, and add
references to those locations to the list of images we can search for
manifests and blobs.

Signed-off-by: Nalin Dahyabhai <[email protected]>
When listing instances in an image index, show their artifact types and
the names of any files that they're tracking.

Signed-off-by: Nalin Dahyabhai <[email protected]>
@nalind nalind force-pushed the fun-with-artifacts branch from 33726ee to af3f5ad Compare February 5, 2024 14:49
@rhatdan
Copy link
Member

rhatdan commented Feb 5, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Feb 5, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit a7be1e2 into containers:main Feb 5, 2024
7 checks passed
@nalind nalind deleted the fun-with-artifacts branch February 5, 2024 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants