Implementation for minTTL #6808

Open
wants to merge 12 commits into
base: main
from
5 changes: 5 additions & 0 deletions build/charts/antrea/conf/antrea-agent.conf
@@ -306,6 +306,11 @@ kubeAPIServerOverride: {{ .Values.kubeAPIServerOverride | quote }}
# 10.96.0.10:53, [fd00:10:96::a]:53).
dnsServerOverride: {{ .Values.dnsServerOverride | quote }}

# The fqdnCacheMinTTL setting helps address the problem of applications caching DNS response IPs beyond the TTL value for the DNS record.
Contributor

s/The fqdnCacheMinTTL setting helps address/fqdnCacheMinTTL helps address

Contributor

@hkiiita not addressed correctly, the current sentence is not grammatically correct

# It is used to enforce FQDN policy rules, ensuring that resolved IPs are included in datapath rules for as long as the application is caching them.
# This value should ideally be set to the maximum caching duration across all applications.
fqdnCacheMinTTL: {{ .Values.minTTL }}
Contributor

Suggested change
- fqdnCacheMinTTL: {{ .Values.minTTL }}
+ fqdnCacheMinTTL: {{ .Values.fqdnCacheMinTTL }}

This is why the manifests are not generated correctly (`fqdnCacheMinTTL:` instead of `fqdnCacheMinTTL: 0`)

Member Author

Sorry, my bad. Will correct that.
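The effect of the mismatched key can be reproduced with Go's `text/template`, which Helm templates build on. This is a minimal sketch, not the chart's actual rendering path: `render` and the values map are hypothetical, and plain `text/template` prints `<no value>` for a missing map key, whereas the generated manifests in this PR show an empty value.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// render is a hypothetical helper: it executes a one-line template against
// a Helm-like values map exposed under the "Values" key.
func render(tmpl string, values map[string]interface{}) (string, error) {
	t, err := template.New("conf").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, map[string]interface{}{"Values": values}); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Mirrors the chart default from values.yaml in this PR.
	values := map[string]interface{}{"fqdnCacheMinTTL": 0}
	for _, tmpl := range []string{
		"fqdnCacheMinTTL: {{ .Values.minTTL }}",          // key missing from values
		"fqdnCacheMinTTL: {{ .Values.fqdnCacheMinTTL }}", // key defined in values
	} {
		out, err := render(tmpl, values)
		if err != nil {
			panic(err)
		}
		fmt.Println(out)
	}
}
```

Only the second template yields `fqdnCacheMinTTL: 0`, which is the value the agent configuration is expected to carry by default.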


# Comma-separated list of Cipher Suites. If omitted, the default Go Cipher Suites will be used.
# https://golang.org/pkg/crypto/tls/#pkg-constants
# Note that TLS1.3 Cipher Suites cannot be added to the list. But the apiserver will always
4 changes: 4 additions & 0 deletions build/charts/antrea/values.yaml
@@ -186,6 +186,10 @@ kubeAPIServerOverride: ""
# -- Address of DNS server, to override the kube-dns Service. It's used to
# resolve hostnames in a FQDN policy.
dnsServerOverride: ""
# -- The minTTL setting helps address the problem of applications caching DNS response IPs indefinitely.
Contributor

comment out of date

# The Cluster administrators should configure this value, ideally setting it to be equal to or greater than the maximum TTL
# value of the application's DNS cache.
fqdnCacheMinTTL: 0
# -- IPv4 CIDR range used for Services. Required when AntreaProxy is disabled.
serviceCIDR: ""
# -- IPv6 CIDR range used for Services. Required when AntreaProxy is disabled.
9 changes: 7 additions & 2 deletions build/yamls/antrea-aks.yml
@@ -4234,6 +4234,11 @@ data:
# 10.96.0.10:53, [fd00:10:96::a]:53).
dnsServerOverride: ""

# The fqdnCacheMinTTL setting helps address the problem of applications caching DNS response IPs beyond the TTL value for the DNS record.
# It is used to enforce FQDN policy rules, ensuring that resolved IPs are included in datapath rules for as long as the application is caching them.
# This value should ideally be set to the maximum caching duration across all applications.
fqdnCacheMinTTL:

# Comma-separated list of Cipher Suites. If omitted, the default Go Cipher Suites will be used.
# https://golang.org/pkg/crypto/tls/#pkg-constants
# Note that TLS1.3 Cipher Suites cannot be added to the list. But the apiserver will always
@@ -5383,7 +5388,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: e2d1d8af083c88667ac4c22c87dea63e595b2f4f770190c32afb00c480440fe3
checksum/config: 8b260e981a71f970ab28471bcf056893615089492f917f16ee3b8d749ed6d348
labels:
app: antrea
component: antrea-agent
@@ -5621,7 +5626,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: e2d1d8af083c88667ac4c22c87dea63e595b2f4f770190c32afb00c480440fe3
checksum/config: 8b260e981a71f970ab28471bcf056893615089492f917f16ee3b8d749ed6d348
labels:
app: antrea
component: antrea-controller
9 changes: 7 additions & 2 deletions build/yamls/antrea-eks.yml
@@ -4234,6 +4234,11 @@ data:
# 10.96.0.10:53, [fd00:10:96::a]:53).
dnsServerOverride: ""

# The fqdnCacheMinTTL setting helps address the problem of applications caching DNS response IPs beyond the TTL value for the DNS record.
# It is used to enforce FQDN policy rules, ensuring that resolved IPs are included in datapath rules for as long as the application is caching them.
# This value should ideally be set to the maximum caching duration across all applications.
fqdnCacheMinTTL:

# Comma-separated list of Cipher Suites. If omitted, the default Go Cipher Suites will be used.
# https://golang.org/pkg/crypto/tls/#pkg-constants
# Note that TLS1.3 Cipher Suites cannot be added to the list. But the apiserver will always
@@ -5383,7 +5388,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: e2d1d8af083c88667ac4c22c87dea63e595b2f4f770190c32afb00c480440fe3
checksum/config: 8b260e981a71f970ab28471bcf056893615089492f917f16ee3b8d749ed6d348
labels:
app: antrea
component: antrea-agent
@@ -5622,7 +5627,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: e2d1d8af083c88667ac4c22c87dea63e595b2f4f770190c32afb00c480440fe3
checksum/config: 8b260e981a71f970ab28471bcf056893615089492f917f16ee3b8d749ed6d348
labels:
app: antrea
component: antrea-controller
9 changes: 7 additions & 2 deletions build/yamls/antrea-gke.yml
@@ -4234,6 +4234,11 @@ data:
# 10.96.0.10:53, [fd00:10:96::a]:53).
dnsServerOverride: ""

# The fqdnCacheMinTTL setting helps address the problem of applications caching DNS response IPs beyond the TTL value for the DNS record.
# It is used to enforce FQDN policy rules, ensuring that resolved IPs are included in datapath rules for as long as the application is caching them.
# This value should ideally be set to the maximum caching duration across all applications.
fqdnCacheMinTTL:

# Comma-separated list of Cipher Suites. If omitted, the default Go Cipher Suites will be used.
# https://golang.org/pkg/crypto/tls/#pkg-constants
# Note that TLS1.3 Cipher Suites cannot be added to the list. But the apiserver will always
@@ -5383,7 +5388,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 7e42a403d388e2ed556d9b41f4af83917eadd0863d4e2bef67353f5adb2ef6c3
checksum/config: 96a86cbe034da4285e15a136b3c05b954b12d148eb54aeaf1a3ad543fb2588c2
labels:
app: antrea
component: antrea-agent
@@ -5619,7 +5624,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 7e42a403d388e2ed556d9b41f4af83917eadd0863d4e2bef67353f5adb2ef6c3
checksum/config: 96a86cbe034da4285e15a136b3c05b954b12d148eb54aeaf1a3ad543fb2588c2
labels:
app: antrea
component: antrea-controller
9 changes: 7 additions & 2 deletions build/yamls/antrea-ipsec.yml
@@ -4247,6 +4247,11 @@ data:
# 10.96.0.10:53, [fd00:10:96::a]:53).
dnsServerOverride: ""

# The fqdnCacheMinTTL setting helps address the problem of applications caching DNS response IPs beyond the TTL value for the DNS record.
# It is used to enforce FQDN policy rules, ensuring that resolved IPs are included in datapath rules for as long as the application is caching them.
# This value should ideally be set to the maximum caching duration across all applications.
fqdnCacheMinTTL:

# Comma-separated list of Cipher Suites. If omitted, the default Go Cipher Suites will be used.
# https://golang.org/pkg/crypto/tls/#pkg-constants
# Note that TLS1.3 Cipher Suites cannot be added to the list. But the apiserver will always
@@ -5396,7 +5401,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 7d8b0a065c3db85e34e127fdf38b820b32712657900e3f8fe2703d4310c40632
checksum/config: 5deeee1fbf11902f265061f60855c1720e19fb0521692c7d22e130f880947c78
checksum/ipsec-secret: d0eb9c52d0cd4311b6d252a951126bf9bea27ec05590bed8a394f0f792dcb2a4
labels:
app: antrea
@@ -5678,7 +5683,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 7d8b0a065c3db85e34e127fdf38b820b32712657900e3f8fe2703d4310c40632
checksum/config: 5deeee1fbf11902f265061f60855c1720e19fb0521692c7d22e130f880947c78
labels:
app: antrea
component: antrea-controller
9 changes: 7 additions & 2 deletions build/yamls/antrea.yml
antoninbas marked this conversation as resolved.
@@ -4234,6 +4234,11 @@ data:
# 10.96.0.10:53, [fd00:10:96::a]:53).
dnsServerOverride: ""

# The fqdnCacheMinTTL setting helps address the problem of applications caching DNS response IPs beyond the TTL value for the DNS record.
# It is used to enforce FQDN policy rules, ensuring that resolved IPs are included in datapath rules for as long as the application is caching them.
# This value should ideally be set to the maximum caching duration across all applications.
fqdnCacheMinTTL:

# Comma-separated list of Cipher Suites. If omitted, the default Go Cipher Suites will be used.
# https://golang.org/pkg/crypto/tls/#pkg-constants
# Note that TLS1.3 Cipher Suites cannot be added to the list. But the apiserver will always
@@ -5383,7 +5388,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 2b4d82bcb825d50926115bad2125097f85aed424bfc49147444314cad8b7826a
checksum/config: dbebe7ad81b43b8a9e102971e323ac5ab89137efac9d4f5140c256f454ec5d66
labels:
app: antrea
component: antrea-agent
@@ -5619,7 +5624,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 2b4d82bcb825d50926115bad2125097f85aed424bfc49147444314cad8b7826a
checksum/config: dbebe7ad81b43b8a9e102971e323ac5ab89137efac9d4f5140c256f454ec5d66
labels:
app: antrea
component: antrea-controller
1 change: 1 addition & 0 deletions cmd/antrea-agent/agent.go
@@ -528,6 +528,7 @@ func run(o *Options) error {
nodeConfig,
podNetworkWait,
l7Reconciler,
uint32(o.config.FqdnCacheMinTTL),
)
if err != nil {
return fmt.Errorf("error creating new NetworkPolicy controller: %v", err)
12 changes: 9 additions & 3 deletions cmd/antrea-agent/options.go
@@ -155,13 +155,19 @@ func (o *Options) validate(args []string) error {
return fmt.Errorf("nodeType %s requires feature gate ExternalNode to be enabled", o.config.NodeType)
}

if o.config.NodeType == config.ExternalNode.String() {
// validate FqdnCacheMinTTL
if o.config.FqdnCacheMinTTL < 0 {
return fmt.Errorf("fqdnCacheMinTTL set to an invalid value, its must be a positive integer")
Contributor

Suggested change
- return fmt.Errorf("fqdnCacheMinTTL set to an invalid value, its must be a positive integer")
+ return fmt.Errorf("fqdnCacheMinTTL must be greater than or equal to 0")

}

switch o.config.NodeType {
case config.ExternalNode.String():
o.nodeType = config.ExternalNode
return o.validateExternalNodeOptions()
} else if o.config.NodeType == config.K8sNode.String() {
case config.K8sNode.String():
o.nodeType = config.K8sNode
return o.validateK8sNodeOptions()
} else {
default:
return fmt.Errorf("unsupported nodeType %s", o.config.NodeType)
Contributor

this is not a bad change, but I would avoid doing it in this PR as it is unrelated

}
}
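The validation above, together with the `uint32(o.config.FqdnCacheMinTTL)` conversion in `cmd/antrea-agent/agent.go`, can be sketched as a single helper. `validateMinTTL` is a hypothetical name, and the upper-bound check is an assumption that is not part of this PR; it only illustrates that an `int` larger than `math.MaxUint32` would otherwise wrap around when converted.

```go
package main

import (
	"fmt"
	"math"
)

// validateMinTTL is a hypothetical helper, not code from the PR.
// It rejects negative values (as the PR does) and, as an extra assumption,
// values that do not fit in a uint32, then performs the conversion.
func validateMinTTL(v int) (uint32, error) {
	if v < 0 {
		return 0, fmt.Errorf("fqdnCacheMinTTL must be greater than or equal to 0")
	}
	if int64(v) > math.MaxUint32 {
		return 0, fmt.Errorf("fqdnCacheMinTTL must not exceed %d", uint32(math.MaxUint32))
	}
	return uint32(v), nil
}

func main() {
	for _, v := range []int{-1, 0, 30} {
		ttl, err := validateMinTTL(v)
		if err != nil {
			fmt.Printf("%d: rejected: %v\n", v, err)
			continue
		}
		fmt.Printf("%d: accepted as %d seconds\n", v, ttl)
	}
}
```

Note that 0 is accepted here, matching the reviewer's suggested wording ("greater than or equal to 0") rather than the original "positive integer" message.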
8 changes: 5 additions & 3 deletions pkg/agent/controller/networkpolicy/fqdn.go
@@ -127,6 +127,7 @@ type fqdnController struct {
ofClient openflow.Client
// dnsServerAddr stores the coreDNS server address, or the user provided DNS server address.
dnsServerAddr string
minTTL uint32

// dirtyRuleHandler is a callback that is run upon finding a rule out-of-sync.
dirtyRuleHandler func(string)
@@ -160,7 +161,7 @@ type fqdnController struct {
clock clock.Clock
}

func newFQDNController(client openflow.Client, allocator *idAllocator, dnsServerOverride string, dirtyRuleHandler func(string), v4Enabled, v6Enabled bool, gwPort uint32, clock clock.WithTicker) (*fqdnController, error) {
func newFQDNController(client openflow.Client, allocator *idAllocator, dnsServerOverride string, dirtyRuleHandler func(string), v4Enabled, v6Enabled bool, gwPort uint32, clock clock.WithTicker, fqdnCacheMinTTL uint32) (*fqdnController, error) {
controller := &fqdnController{
ofClient: client,
dirtyRuleHandler: dirtyRuleHandler,
@@ -182,6 +183,7 @@ func newFQDNController(client openflow.Client, allocator *idAllocator, dnsServer
ipv6Enabled: v6Enabled,
gwPort: gwPort,
clock: clock,
minTTL: fqdnCacheMinTTL,
}
if controller.ofClient != nil {
if err := controller.ofClient.NewDNSPacketInConjunction(dnsInterceptRuleID); err != nil {
@@ -643,15 +645,15 @@ func (f *fqdnController) parseDNSResponse(msg *dns.Msg) (string, map[string]ipWi
if f.ipv4Enabled {
responseIPs[r.A.String()] = ipWithExpiration{
ip: r.A,
expirationTime: currentTime.Add(time.Duration(r.Header().Ttl) * time.Second),
expirationTime: currentTime.Add(time.Duration(max(f.minTTL, r.Header().Ttl)) * time.Second),
}

}
case *dns.AAAA:
if f.ipv6Enabled {
responseIPs[r.AAAA.String()] = ipWithExpiration{
ip: r.AAAA,
expirationTime: currentTime.Add(time.Duration(r.Header().Ttl) * time.Second),
expirationTime: currentTime.Add(time.Duration(max(f.minTTL, r.Header().Ttl)) * time.Second),
}
}
}
1 change: 1 addition & 0 deletions pkg/agent/controller/networkpolicy/fqdn_test.go
@@ -52,6 +52,7 @@ func newMockFQDNController(t *testing.T, controller *gomock.Controller, dnsServe
false,
config.DefaultHostGatewayOFPort,
clockToInject,
0,
)
require.NoError(t, err)
return f, mockOFClient
@@ -196,7 +196,7 @@ func NewNetworkPolicyController(antreaClientGetter client.AntreaClientProvider,
gwPort, tunPort uint32,
nodeConfig *config.NodeConfig,
podNetworkWait *utilwait.Group,
l7Reconciler *l7engine.Reconciler) (*Controller, error) {
l7Reconciler *l7engine.Reconciler, fqdnCacheMinTTL uint32) (*Controller, error) {
idAllocator := newIDAllocator(asyncRuleDeleteInterval, dnsInterceptRuleID)
c := &Controller{
antreaClientProvider: antreaClientGetter,
@@ -227,7 +227,7 @@ func NewNetworkPolicyController(antreaClientGetter client.AntreaClientProvider,

var err error
if antreaPolicyEnabled {
if c.fqdnController, err = newFQDNController(ofClient, idAllocator, dnsServerOverride, c.enqueueRule, v4Enabled, v6Enabled, gwPort, clock.RealClock{}); err != nil {
if c.fqdnController, err = newFQDNController(ofClient, idAllocator, dnsServerOverride, c.enqueueRule, v4Enabled, v6Enabled, gwPort, clock.RealClock{}, fqdnCacheMinTTL); err != nil {
return nil, err
}

@@ -105,7 +105,8 @@ func newTestController() (*Controller, *fake.Clientset, *mockReconciler) {
config.DefaultTunOFPort,
&config.NodeConfig{},
wait.NewGroup(),
l7reconciler)
l7reconciler,
0)
reconciler := newMockReconciler()
controller.podReconciler = reconciler
controller.auditLogger = nil
4 changes: 4 additions & 0 deletions pkg/config/agent/config.go
@@ -155,6 +155,10 @@ type AgentConfig struct {
// Defaults to "". It must be a host string or a host:port pair of the DNS server (e.g. 10.96.0.10,
// 10.96.0.10:53, [fd00:10:96::a]:53).
DNSServerOverride string `yaml:"dnsServerOverride,omitempty"`
// The minTTL setting helps address the problem of applications caching DNS response IPs indefinitely.
// The Cluster administrators should configure this value, ideally setting it to be equal to or greater than the maximum TTL
// value of the application's DNS cache.
FqdnCacheMinTTL int `yaml:"fqdnCacheMinTTL,omitempty"`
Contributor

I think the field name here should be FQDNCacheMinTTL per our conventions

// Cipher suites to use.
TLSCipherSuites string `yaml:"tlsCipherSuites,omitempty"`
// TLS min version.
15 changes: 9 additions & 6 deletions test/e2e/antreapolicy_test.go
@@ -5270,8 +5270,12 @@ func testAntreaClusterNetworkPolicyStats(t *testing.T, data *TestData) {
k8sUtils.Cleanup(namespaces)
}

// TestFQDNCacheMinTTL tests stable FQDN access for applications with cached DNS resolutions
// when FQDN NetworkPolicy are in use and the FQDN-to-IP resolution changes frequently.
// TestFQDNCacheMinTTL ensures stable FQDN access for applications that cache DNS resolutions,
// even when FQDN-to-IP mappings change frequently, and FQDN-based NetworkPolicies are in use.
// It validates the fqdnCacheMinTTL configuration, which covers scenarios where applications
// may cache DNS responses beyond the TTL defined in the original DNS response.
// The minimum TTL ensures that resolved IPs remain in datapath rules for as long as
// applications might cache them, preventing intermittent connectivity issues to the FQDN.
func TestFQDNCacheMinTTL(t *testing.T) {
antoninbas marked this conversation as resolved.
const (
testFQDN = "fqdn-test-pod.lfx.test"
@@ -5368,14 +5372,13 @@ func TestFQDNCacheMinTTL(t *testing.T) {
require.NoError(t, data.setPodAnnotation(data.testNamespace, "custom-dns-server", "test.antrea.io/random-value",
randSeq(8)), "failed to update custom DNS Pod annotation.")

// finally verify that Curling the previously cached IP fails after DNS update.
// finally verify that Curling the previously cached IP does not fail after DNS update.
// The wait time here should be slightly longer than the reload value specified in the custom DNS configuration.
// TODO: This assertion currently verifies the issue described in https://github.com/antrea-io/antrea/issues/6229.
// It will need to be updated once minTTL support is implemented.
// TODO: This assertion verifies the fix to the issue described in https://github.com/antrea-io/antrea/issues/6229.
Contributor

this is no longer a TODO

t.Logf("Trying to curl the existing cached IP of the domain: %s", fqdnIP)
assert.EventuallyWithT(t, func(t *assert.CollectT) {
_, err := curlFQDN(fqdnIP)
assert.Error(t, err)
assert.NoError(t, err)
}, 10*time.Second, 1*time.Second)
}
