plugins: add sgx-epc plugin. #156

klihub · 2023-10-16T12:16:23Z

plugins: add sgx-epc plugin.

Add a plugin for limiting SGX encrypted page cache usage with
pod annotations. Note that the plugin requires a patched cgroup
v2 misc controller with support for something like
echo 'sgx_epc 65536' > /sys/fs/cgroup/$CGRP/misc.max
to work.

klihub · 2023-10-16T12:24:34Z

@mythi Here is the sgx-config. It is the original NRI PoC sample plugin. The code is updated for latest NRI main HEAD, retrofitted to this repo, and extended with the necessary bits for image building and a kustomizable Helm chart for easier deployment. PTAL.

askervin

This is as clean as it gets with the current stuff.

Maybe the name of the plugin could be just nri-sgx, if we can see this plugin to be extended to handle other SGX properties later on. Or nri-sgx-epc it's going to be EPC and nothing else.

The current project structure makes you write three daemonset specs for a single plugin. I guess we could make this somehow simpler, but did not get any good ideas when writing memory plugins. Yet another layer of templating on top of templating did not feel good idea at that point, but this repetition with slight differences is not very nice either.

cmd/plugins/sgx-config/nri-sgx-config-deployment.yaml.in

deployment/helm/sgx-config/values.yaml

[release-0.2]: cherry-pick #156: fix image tagging scheme in publishing workflow.

cmd/plugins/sgx-config/sgx-config.go

docs/memory/sgx-config.md

klihub · 2023-10-18T20:09:02Z

The current project structure makes you write three daemonset specs for a single plugin. I guess we could make this somehow simpler, but did not get any good ideas when writing memory plugins. Yet another layer of templating on top of templating did not feel good idea at that point, but this repetition with slight differences is not very nice either.

@askervin I think we should be able to get this down to one per plugin, the one in the helm chart templates, with relative ease. We have now extra copies to be able to wire up plugins differently for e2e tests which tend to use config files and not ConfigMaps. I think we should (be able to) change this once we reworked the configuration handling bits (IOW, manage to iron out the worst monkey puke from #164 and add the missing bits for e2e test there). That will get us rid of the fallback configuration file altogether.

At that point it would be nice to use helm and a local unpacked helm chart directory for e2e tests, too (and make sure that all the necessary {k,c}ustomizability and other templating is correctly in place to support the e2e test cases).

klihub · 2023-10-18T20:12:13Z

I think the SGX EPC NRI plugin should also have an UpdateContainer method (similar to the already implemented CreateContainer method).

The use case for this is that a user defines an EPC limit for a pod after it has already been created. Or a user updates a defined limit because it was too low/high.

So does that really work already without recreating the whole pod and container ? With in-place vertical pod scaling ?

mythi · 2023-10-19T05:51:01Z

I think the SGX EPC NRI plugin should also have an UpdateContainer method (similar to the already implemented CreateContainer method).
The use case for this is that a user defines an EPC limit for a pod after it has already been created. Or a user updates a defined limit because it was too low/high.

So does that really work already without recreating the whole pod and container ? With in-place vertical pod scaling ?

To me it looks this is not possible right now. I tried patching sgx.intel.com/epc and I get an error:

* spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`,`spec.initContainers[*].image`,`spec.activeDeadlineSeconds`,`spec.tolerations` (only additions to existing tolerations),`spec.terminationGracePeriodSeconds` (allow it to be set to 1 if it was previously negative)

ffuerste · 2023-10-19T05:54:38Z

I think the SGX EPC NRI plugin should also have an UpdateContainer method (similar to the already implemented CreateContainer method).
The use case for this is that a user defines an EPC limit for a pod after it has already been created. Or a user updates a defined limit because it was too low/high.

So does that really work already without recreating the whole pod and container ? With in-place vertical pod scaling ?

After thinking it completely through, I don't think so. If the annotations change on the pod level (e.g. in .spec.template.metadata.annotations for a Deployment) then Kubernetes is creating a new ReplicaSet. I forgot that when I was writing, sorry. So please ignore my comment. I also deleted it.

mythi · 2023-10-19T07:10:01Z

@ffuerste I think that would be a reasonable feature for the future if we can get rid of the annotations and in-place autoscaling can be made to work etc but for know I'd keep the focus on the CreateContainer flows

askervin

LGTM.

docs/memory/sgx-epc.md

docs/deployment/sgx-epc.md

Add a plugin for limiting SGX encrypted page cache usage with pod annotations. Note that the plugin requires a patched cgroup v2 misc controller with support for something like echo 'sgx_epc 65536' > /sys/fs/cgroup/$CGRP/misc.max to work. Signed-off-by: Krisztian Litkey <[email protected]>

marquiz

Thanks @klihub
LGTM

askervin

A typo... otherwise LGTM.

askervin · 2023-10-23T07:53:08Z

deployment/helm/sgx-epc/README.md

+
+## Installing the Chart
+
+Path to the chart: `nri-sg-epc`.


s/nri-sg-epc/nri-sgx-epc

mythi · 2023-10-23T07:56:40Z

docs/memory/sgx-epc.md

+    # for all containers in the pod
+    epc-limit.nri.io/pod: "32768"
+    # alternative notation for all containers in the pod
+    epc-limit.nri.io: "8192"


users are expected to set per-container sgx.intel.com/epc limits which we'd take to epc-limit.nri.io/container.c0 annotations. Are these about pod-level misc setting (e.g., a sum of all per-container requests within the pod)?

No, the '/pod' (and its alternative) notation applies to any container in the pod. It's simply a shorthand notation that can be used instead of adding for every container CTR in the pod a container-specific epc-limit.nri.io/container.$CTR: $limit.

OK, hmm. There's probably no use for that functionality.

I could think of a case where we'd allow a policy that takes the sum of epc resource requests and adds that to a sandbox level shared resource...

klihub force-pushed the devel/sgx-config-plugin branch from 2719f2e to 8b8e77e Compare October 16, 2023 12:17

klihub force-pushed the devel/sgx-config-plugin branch from 8b8e77e to 25a3dbc Compare October 16, 2023 12:26

mythi mentioned this pull request Oct 16, 2023

RFC: SGX EPC limits enforcement using NRI to configure misc controller intel/intel-device-plugins-for-kubernetes#1567

Open

10 tasks

askervin reviewed Oct 17, 2023

View reviewed changes

cmd/plugins/sgx-config/nri-sgx-config-deployment.yaml.in Outdated Show resolved Hide resolved

deployment/helm/sgx-config/values.yaml Outdated Show resolved Hide resolved

fmuyassarov added a commit that referenced this pull request Oct 17, 2023

Merge pull request #160 from klihub/release-0.2

320e5d9

[release-0.2]: cherry-pick #156: fix image tagging scheme in publishing workflow.

ffuerste reviewed Oct 18, 2023

View reviewed changes

cmd/plugins/sgx-config/sgx-config.go Outdated Show resolved Hide resolved

docs/memory/sgx-config.md Outdated Show resolved Hide resolved

klihub force-pushed the devel/sgx-config-plugin branch from 25a3dbc to d25d66e Compare October 18, 2023 20:09

klihub changed the title ~~plugins: add sgx-config plugin.~~ plugins: add sgx-epc plugin. Oct 18, 2023

klihub force-pushed the devel/sgx-config-plugin branch from d25d66e to 66c8f29 Compare October 20, 2023 14:25

klihub marked this pull request as ready for review October 20, 2023 14:25

klihub force-pushed the devel/sgx-config-plugin branch from 66c8f29 to 5470767 Compare October 20, 2023 14:39

askervin approved these changes Oct 23, 2023

View reviewed changes

marquiz requested changes Oct 23, 2023

View reviewed changes

docs/memory/sgx-epc.md Outdated Show resolved Hide resolved

docs/deployment/sgx-epc.md Show resolved Hide resolved

klihub force-pushed the devel/sgx-config-plugin branch from 5470767 to cf79983 Compare October 23, 2023 07:31

klihub requested a review from marquiz October 23, 2023 07:33

marquiz approved these changes Oct 23, 2023

View reviewed changes

marquiz merged commit e09ca9b into containers:main Oct 23, 2023
3 checks passed

askervin reviewed Oct 23, 2023

View reviewed changes

deployment/helm/sgx-epc/README.md

## Installing the Chart

Path to the chart: `nri-sg-epc`.

Copy link

Collaborator

askervin Oct 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/nri-sg-epc/nri-sgx-epc

mythi reviewed Oct 23, 2023

View reviewed changes

klihub deleted the devel/sgx-config-plugin branch October 23, 2023 12:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

plugins: add sgx-epc plugin. #156

plugins: add sgx-epc plugin. #156

klihub commented Oct 16, 2023 •

edited

Loading

klihub commented Oct 16, 2023

askervin left a comment

klihub commented Oct 18, 2023 •

edited

Loading

klihub commented Oct 18, 2023 •

edited

Loading

mythi commented Oct 19, 2023

ffuerste commented Oct 19, 2023

mythi commented Oct 19, 2023

askervin left a comment

marquiz left a comment

askervin left a comment

askervin Oct 23, 2023

mythi Oct 23, 2023

klihub Oct 23, 2023 •

edited

Loading

mythi Oct 24, 2023

plugins: add sgx-epc plugin. #156

plugins: add sgx-epc plugin. #156

Conversation

klihub commented Oct 16, 2023 • edited Loading

klihub commented Oct 16, 2023

askervin left a comment

Choose a reason for hiding this comment

klihub commented Oct 18, 2023 • edited Loading

klihub commented Oct 18, 2023 • edited Loading

mythi commented Oct 19, 2023

ffuerste commented Oct 19, 2023

mythi commented Oct 19, 2023

askervin left a comment

Choose a reason for hiding this comment

marquiz left a comment

Choose a reason for hiding this comment

askervin left a comment

Choose a reason for hiding this comment

askervin Oct 23, 2023

Choose a reason for hiding this comment

mythi Oct 23, 2023

Choose a reason for hiding this comment

klihub Oct 23, 2023 • edited Loading

Choose a reason for hiding this comment

mythi Oct 24, 2023

Choose a reason for hiding this comment

klihub commented Oct 16, 2023 •

edited

Loading

klihub commented Oct 18, 2023 •

edited

Loading

klihub commented Oct 18, 2023 •

edited

Loading

klihub Oct 23, 2023 •

edited

Loading