Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal encryption e2e tests #14092

Merged
merged 24 commits into from
Oct 6, 2023

Conversation

KauzClay
Copy link
Contributor

@KauzClay KauzClay commented Jun 12, 2023

Fixes #14367

Proposed Changes

  • Introduces a test that will watch logs and check for TLS connections
  • I imagine these tests can be expanded as the internal-encryption/trust config feature work continues. For example, I think logging the TLS config on a request will let us see the SAN?

Release Note

N/A

@knative-prow knative-prow bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 12, 2023
@knative-prow knative-prow bot added the area/API API objects and controllers label Jun 12, 2023
@knative-prow knative-prow bot added area/autoscale area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features labels Jun 12, 2023
@KauzClay KauzClay force-pushed the ck-internal-encryption-e2e-tests branch from a6ca6fa to ed90501 Compare June 12, 2023 16:40
@knative-prow knative-prow bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 12, 2023
@KauzClay KauzClay force-pushed the ck-internal-encryption-e2e-tests branch 5 times, most recently from c0c3824 to a99da3c Compare June 12, 2023 18:02
@knative-prow knative-prow bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 12, 2023
@KauzClay KauzClay force-pushed the ck-internal-encryption-e2e-tests branch from a99da3c to f83b492 Compare June 12, 2023 18:03
@codecov
Copy link

codecov bot commented Jun 12, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (3eb979a) 86.04% compared to head (3f255b2) 86.10%.

❗ Current head 3f255b2 differs from pull request most recent head bf69494. Consider uploading reports for the commit bf69494 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14092      +/-   ##
==========================================
+ Coverage   86.04%   86.10%   +0.06%     
==========================================
  Files         196      196              
  Lines       14880    14887       +7     
==========================================
+ Hits        12803    12818      +15     
+ Misses       1764     1759       -5     
+ Partials      313      310       -3     
Files Coverage Δ
pkg/activator/config/store.go 100.00% <100.00%> (ø)

... and 8 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@KauzClay KauzClay force-pushed the ck-internal-encryption-e2e-tests branch from f83b492 to 380eb72 Compare June 12, 2023 18:04
@KauzClay
Copy link
Contributor Author

@evankanderson this is an attempt at the idea you shared during a past Security WG meeting.

@knative-prow knative-prow bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 13, 2023
@KauzClay KauzClay force-pushed the ck-internal-encryption-e2e-tests branch 2 times, most recently from 73c1f11 to db3ca6f Compare June 13, 2023 14:34
Copy link
Member

@evankanderson evankanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial thoughts; overall, this looks pretty good, but I'm going to push to simplify a few parts and complexify a few others. 😁

Comment on lines 57 to 59
if isTLS {
pkgmetrics.Record(reporterCtx, tlsConnectionCountM.M(1))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than reporting a second metric, how about adding a tag to the existing metric, ala:

securityMode := // get security mode from context
reporterCtx, _ := tag.New(reporterCtx, tag.Upsert(SecurityMode, securityMode))

That will also provide a tag which you can use to check the mutual-TLS mode when it's implemented, and doesn't need to expose an extra metric.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah cool, I didn't know about tags. That makes sense, this will be easier to expand when the rest of the trust levels are implemented.

I'll make a first attempt at this where I pass the trustmode into the MetricHandler creation, instead of pulling it from Context.

As far as I understand, the activator doesn't have a watcher for the config-network configmap, so I don't think you can pull it from the context. And I'm less confident about changing that right now.

Copy link
Contributor Author

@KauzClay KauzClay Aug 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, now it does have that ability, so I can probably pull from context now: #13694

nevermind, that is about certs, not the configuration itself

Description: "The number of requests that the Activator handles with TLS",
Measure: tlsConnectionCountM,
Aggregation: view.Count(),
TagKeys: []tag.Key{metrics.PodKey, metrics.ContainerKey, metrics.ResponseCodeKey, metrics.ResponseCodeClassKey},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you take my tagging suggestion, You'd need to add the SecurityMode to requestCountM. I'd add it to responseTimeInMsecM as well (and maybe requestConcurrencyM), to allow comparing request latency for TLS and non-TLS if we decide to do that some day.

Comment on lines 17 to 72
source $(dirname $0)/e2e-common.sh

function setup_internal_encryption_env_variables() {
export TLS_TEST_NAMESPACE="tls"

local INGRESS_NAMESPACE=${GATEWAY_NAMESPACE_OVERRIDE}
if [[ -z "${GATEWAY_NAMESPACE_OVERRIDE}" ]]; then
INGRESS_NAMESPACE="istio-system"
fi
local INGRESS_SERVICE=${GATEWAY_OVERRIDE}
if [[ -z "${GATEWAY_OVERRIDE}" ]]; then
INGRESS_SERVICE="istio-ingressgateway"
fi
local IP=$(kubectl get svc -n ${INGRESS_NAMESPACE} ${INGRESS_SERVICE} -o jsonpath="{.status.loadBalancer.ingress[0].ip}")
export INTERNAL_ENCRYPTION_TEST_INGRESS_IP=${IP}
}

function setup_internal_encryption() {
toggle_feature dataplane-trust enabled config-network

sleep 5

# with the current implementation, Activator is always in the request path, and needs to be restarted after configuring dataplane-trust
kubectl -n ${SYSTEM_NAMESPACE} delete pod -l app=activator
}

function cleanup_internal_encryption() {
toggle_feature dataplane-trust disabled config-network

sleep 5

# with the current implementation, Activator is always in the request path, and needs to be restarted after configuring dataplane-trust
kubectl -n ${SYSTEM_NAMESPACE} delete pod -l app=activator
}

# Script entry point.
initialize "$@" --skip-istio-addon --min-nodes=4 --max-nodes=4 --enable-ha --cluster-version=1.25

# Run the tests
header "Running tests"

failed=0

# Currently only Contour and Kourier implement the alpha features.
alpha=""
if [[ "${INGRESS_CLASS}" == "contour.ingress.networking.knative.dev" || "${INGRESS_CLASS}" == "kourier.ingress.networking.knative.dev" ]]; then
alpha="--enable-alpha"
fi

INTERNAL_ENCRYPTION_TEST_OPTIONS="${INTERNAL_ENCRYPTION_TEST_OPTIONS:-${alpha} --enable-beta}"

# Auto TLS E2E tests mutate the cluster and must be ran separately
# because they need auto-tls and cert-manager specific configurations
subheader "Setup internal encryption"
setup_internal_encryption
add_trap "cleanup_internal_encryption" EXIT SIGKILL SIGTERM SIGQUIT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eeep... any way this could be less shell, or have a little block explaining why it's shell?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Maybe "see the README" is sufficient...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yeah, this part was one I wasn't so sure on. I was under the impression you needed a script like this if you needed to change a config on the test cluster (like setting dataplane-trust), so I just tried to base things on the auto-tls tests.

But I'll be honest, I don't completely understand how that script ties into the rest of the tooling so that the tests actually run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, I got a little clarification, and I think I can move the core behavior here into the e2e-tests.sh file, and get rid of the file itself.

test/e2e/internalencryption/README.md Outdated Show resolved Hide resolved

In order to test Internal Encryption, this test looks at `tls_connection_count` metrics from the Activator and QueueProxy.

The `metricsreader` test image [here](../../test_images/metricsreader/README.md) was created for this purpose. Given the PodIPs of the Activator and the Knative Service pod (i.e. the QueueProxy), it will make requests to each respective `/metrics` endpoint, pull out the `*_tls_connection_count` metric, and respond with the value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can metricsreader find the pods using label queries and/or DNS queries (if the services are headless services -- I don't know if / how that would perturb the rest of Knative, though.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I'm not sure I'm following. Are you suggesting that the metricsreader image itself gets the IPs, rather than gathering that info as part of the test code?

Right now, I have the test getting that info, since it seemed easier to test and make sure everything was up and ready. My initial intention was to make the metricsreader image do as little as possible.

@KauzClay
Copy link
Contributor Author

KauzClay commented Oct 5, 2023

/test kourier-stable

@KauzClay KauzClay force-pushed the ck-internal-encryption-e2e-tests branch from 3f255b2 to bf69494 Compare October 5, 2023 15:15
@nak3
Copy link
Contributor

nak3 commented Oct 6, 2023

/retest

@nak3
Copy link
Contributor

nak3 commented Oct 6, 2023

/test kourier-stable

@nak3
Copy link
Contributor

nak3 commented Oct 6, 2023

/lgtm
/approve

Thank you 🎉

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Oct 6, 2023
@knative-prow
Copy link

knative-prow bot commented Oct 6, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: KauzClay, nak3

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 6, 2023
@nak3
Copy link
Contributor

nak3 commented Oct 6, 2023

/retest

@knative-prow knative-prow bot merged commit c183543 into knative:main Oct 6, 2023
62 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers area/autoscale area/monitoring area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add e2e tests for internal-encryption enabled to Serving
5 participants