Image decryption fails with current operator version #1519

Closed
surajssd opened this issue Oct 12, 2023 · 8 comments
Labels
bug Something isn't working core Issues related to the core adaptor code


@surajssd
Member

Image decryption currently fails with the following error when I deploy CAA from today's main branch:

$ kubectl get events 
...
45m         Warning   Failed                      pod/busybox-cc-84767f675d-5hvd4          Failed to pull image "quay.io/surajd/busybox-encrypted:2023-10-Oct-12-14-31-46": rpc error: code = Unknown desc = failed to pull and unpack image "quay.io/surajd/busybox-encrypted:2023-10-Oct-12-14-31-46": failed to extract layer sha256:3d24ee258efc3bfe4066a1a9fb83febf6dc0b1548dfe896161533668281c9f4f: failed to get stream processor for application/vnd.oci.image.layer.v1.tar+gzip+encrypted: ctd-decoder resolves to executable in current directory (./ctd-decoder): unknown
...

The relevant part of the error is: failed to get stream processor for application/vnd.oci.image.layer.v1.tar+gzip+encrypted: ctd-decoder resolves to executable in current directory (./ctd-decoder): unknown.
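
The wording "resolves to executable in current directory" matches the error that Go's golang.org/x/sys/execabs package returns when a bare command name is found through an empty or "." entry in PATH. Assuming that is what happens here (nothing in this thread confirms it), a quick check of the PATH that containerd runs with could look like the sketch below. path_has_cwd_entry is a hypothetical helper, not part of any tool mentioned in this issue.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given PATH-style string contains an
# empty entry (leading, trailing, or doubled colon) or a literal "." entry.
# Either one lets a bare command name resolve relative to the current directory.
path_has_cwd_entry() {
    case ":$1:" in
        *::*|*:.:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the PATH of the current shell; substitute the PATH of the
# containerd service to check the environment that actually matters.
if path_has_cwd_entry "$PATH"; then
    echo "PATH contains an empty or '.' entry"
else
    echo "PATH has no empty or '.' entry"
fi
```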

On the kata-agent side I get a 404 error saying that the key is not available, even though I can fetch the key at runtime. When this error occurs, nothing is logged on the KBS side, although KBS logs every request that reaches it:

$ sudo journalctl -u kata-agent
...
Oct 12 14:57:54 podvm-busybox-cc-84767f675d-7qv6g-446ff999 kata-agent[961]: [2023-10-12T14:57:54Z ERROR attestation_agent::rpc::getresource::ttrpc] Call AA-KBC to get resource failed: KBS resource not found: KBS resource Not Found (Error 404): ErrorInformation {
Oct 12 14:57:54 podvm-busybox-cc-84767f675d-7qv6g-446ff999 kata-agent[961]:         error_type: "https://github.com/confidential-containers/kbs/errors/ReadSecretFailed",
Oct 12 14:57:54 podvm-busybox-cc-84767f675d-7qv6g-446ff999 kata-agent[961]:         detail: "Read secret failed: read resource from local fs",
Oct 12 14:57:54 podvm-busybox-cc-84767f675d-7qv6g-446ff999 kata-agent[961]:     }
...

I can get the key at runtime from within the podvm:

$ sudo ip netns exec podns curl http://127.0.0.1:8006/cdh/resource/default/image-decryption-keys/key.bin && echo
F~zA(<XȎyY}tew

It is the same key that is stored as a secret in KBS:

$ kubectl -n coco-tenant get secret image-decryption-key-4f85hb82db -ojsonpath='{.data.key\.bin}' | base64 -d && echo
F~zA(<XȎyY}tew
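
Because the key is raw binary, comparing the two outputs by eye in a terminal is unreliable. A minimal sketch of a byte-wise comparison follows, assuming both values have first been saved to files. The file names are placeholders and same_key is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the two files are byte-identical.
same_key() {
    cmp -s "$1" "$2"
}

# Throwaway files standing in for the two saved keys.
tmpdir=$(mktemp -d)
printf 'example-key-bytes' > "$tmpdir/from-cdh.bin"
printf 'example-key-bytes' > "$tmpdir/from-secret.bin"

if same_key "$tmpdir/from-cdh.bin" "$tmpdir/from-secret.bin"; then
    echo "keys match"
else
    echo "keys differ"
fi

rm -rf "$tmpdir"
```

To produce the real files, redirect the output of the curl and base64 -d commands above into files and drop the trailing "&& echo", which only adds a newline for display.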

The image is encrypted fine and has all the image encryption information:

$ skopeo inspect docker://quay.io/surajd/busybox-encrypted:2023-10-Oct-12-14-31-46 | jq -r '.LayersData[0].Annotations."org.opencontainers.image.enc.keys.provider.attestation-agent"' | base64 -d | jq
{
  "kid": "kbs:///default/image-decryption-keys/key.bin",
  "wrapped_data": "W7mhXS/gSFQNTegB6QaKPIQzAXBSYeRqBTF7rAZoQ/j7730mCg/ZdIZwnD0I5XsyQAOjj4/xylnG76TNDsJoJkaRByI3bVXQpR5a5/w71ChA1j8PuHo095n6qzbqydBhyjSm8rHafk3ksyrPOBj0ZnIw4W4QNOaM3GvUOLcIcrhr935Cx4JoOrGcaY/+tesHcg0H1lsPe/GnZTSN6zRjPkSPUYN4y8p2qrXBi9Ewpbo3FBV2FmPtvcOteXRpzwoEtokN0U7dRRhcfmpwj+Kxs/s=",
  "iv": "c0yZlQW0DQpOHWwA",
  "wrap_type": "A256GCM"
}
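
The "kid" in the annotation and the path in the curl command used earlier share the same suffix (default/image-decryption-keys/key.bin). A small sketch of that mapping, assuming the local resource endpoint from the curl command and assuming the path after "kbs:///" is used unchanged; kid_to_cdh_url is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: turn a "kbs:///<path>" key ID into the local resource
# URL used by the curl command in this issue. The base URL is copied from that
# command; the one-to-one path mapping is an assumption.
kid_to_cdh_url() {
    kid=$1
    case "$kid" in
        kbs:///*) ;;
        *) echo "unsupported kid: $kid" >&2; return 1 ;;
    esac
    # Strip the "kbs:///" prefix and append the remainder to the base URL.
    printf 'http://127.0.0.1:8006/cdh/resource/%s\n' "${kid#kbs:///}"
}

kid_to_cdh_url "kbs:///default/image-decryption-keys/key.bin"
```

Feeding the "kid" printed by the skopeo command into this helper gives a URL that can be fetched from inside the podvm network namespace to confirm that the annotation and the stored resource line up.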

I was able to get image decryption working by making the following changes to the operator and then provisioning a new node:

kubectl -n confidential-containers-system set image ds/cc-operator-daemon-install cc-runtime-install-pod=quay.io/confidential-containers/runtime-payload-ci@sha256:81a987e4b3144c2dbada9e71911b46d47fab82acf754b1dad416f044c770abe5 
kubectl -n confidential-containers-system set image ds/cc-operator-pre-install-daemon cc-runtime-pre-install-pod=quay.io/confidential-containers/reqs-payload@sha256:d4928a82a6a62119163fb1a0b7bb022f9e02a345405e534cebaee480179caa22
kubectl -n confidential-containers-system set image deploy/cc-operator-controller-manager manager=quay.io/confidential-containers/operator@sha256:4131275630cf95727f75e72eb301a459b3a5e596bdc3c8bb668812d7bbae74a1
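
The three commands above pin each image by digest (@sha256:...) rather than by a moving tag, which keeps the workaround reproducible. A small sketch for checking that an image reference is digest-pinned; is_digest_pinned is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the image reference ends in a sha256
# digest (64 lowercase hex characters), i.e. it is pinned to exact content.
is_digest_pinned() {
    printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

# Operator image reference taken from the command above.
image="quay.io/confidential-containers/operator@sha256:4131275630cf95727f75e72eb301a459b3a5e596bdc3c8bb668812d7bbae74a1"

if is_digest_pinned "$image"; then
    echo "pinned by digest"
else
    echo "not pinned by digest"
fi
```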
@surajssd surajssd added bug Something isn't working core Issues related to the core adaptor code labels Oct 12, 2023
@surajssd
Member Author

surajssd commented Oct 12, 2023

cc: @mkulke @stevenhorsman @bpradipt

@surajssd
Member Author

Also pinging @huoqifeng

@surajssd
Member Author

surajssd added a commit to surajssd/kubeconNA23-demo that referenced this issue Oct 12, 2023
This commit uses components that have worked with encrypted images. The
issue is tracked in the upstream:
confidential-containers/cloud-api-adaptor#1519.

Signed-off-by: Suraj Deshmukh <[email protected]>
@surajssd
Member Author

To work around the broken setup that is installed by default, I am following these steps:

# Install operator
# Clone operator repository
git clone https://github.com/confidential-containers/operator
pushd operator

pushd config/manager
kustomize edit set image controller=quay.io/confidential-containers/operator@sha256:4131275630cf95727f75e72eb301a459b3a5e596bdc3c8bb668812d7bbae74a1
popd

kubectl apply -k config/default

pushd config/samples/ccruntime/peer-pods
kustomize edit set image quay.io/confidential-containers/reqs-payload=quay.io/confidential-containers/reqs-payload:e45d4e84c3ce4ae116f3f4d6c123c4829606026f
kustomize edit set image quay.io/confidential-containers/runtime-payload=quay.io/confidential-containers/runtime-payload-ci:kata-containers-7ee7ca2b31915a6e4ad54dbe61b2c06dee24e598
popd

kubectl apply -k config/samples/ccruntime/peer-pods
popd

# Install CAA
kubectl apply -k install/overlays/${CLOUD_PROVIDER}

# Now the operator will start installing components
kubectl label nodes --all node.kubernetes.io/worker=

surajssd added a commit to surajssd/kubeconNA23-demo that referenced this issue Oct 12, 2023
This commit uses components that have worked with encrypted images. The
issue is tracked in the upstream:
confidential-containers/cloud-api-adaptor#1519.

Signed-off-by: Suraj Deshmukh <[email protected]>
@stevenhorsman
Member

stevenhorsman commented Oct 13, 2023

Hey Suraj, what do you mean by things failing with the "current operator"? The main version of the operator is broken (see the daily baseline: http://jenkins.katacontainers.io/view/Daily%20CCv0%20baseline/) and doesn't support the pull on host with nydus snapshotter, which was merged into kata-containers around 2.5 weeks ago, so I'm not too surprised that you need to pick up older versions of the payloads from before this to get it working. QiFeng has been testing a build from Fabiano's WIP PR confidential-containers/operator#263, which adds support for these things, IIUC. We're hoping to get it merged soon-ish, but have found a couple of other blockers that I'm working to resolve first. As is, though, I'm not sure anyone has tested it with CC-kbc, so it might not be in a working state.

@surajssd
Member Author

doesn't support the pull on host with nydus snapshotter

In our case with peerpods it is always pull on the podvm / guest, isn't it?

@stevenhorsman
Member

In our case with peerpods it is always pull on the podvm / guest, isn't it?

Yes, so we're in a bit of an undefined state where we don't have nydus snapshotter pull on guest support working with the operator, but there is a chance that the forked version of containerd isn't fully set up, so I'm not sure the latest operator can be relied on.

surajssd added a commit to surajssd/kubeconNA23-demo that referenced this issue Oct 30, 2023
This commit uses components that have worked with encrypted images. The
issue is tracked in the upstream:
confidential-containers/cloud-api-adaptor#1519.

Signed-off-by: Suraj Deshmukh <[email protected]>
@mkulke
Collaborator

mkulke commented Nov 13, 2023

nydus-snapshotter is working now; tested with main (d4496d0) and the /CommunityGalleries/cocopodvm-d0e4f35f-5530-4b9c-8596-112487cdea85/images/podvm_image0/versions/2023.11.13 podvm image on Azure, with both encrypted and unencrypted images.

@mkulke mkulke closed this as completed Nov 13, 2023