
Embedded load-balancer behavior is flakey and hard to understand #11334

Closed
brandond opened this issue Nov 17, 2024 · 2 comments
Labels: kind/enhancement (An improvement to existing functionality)

brandond (Member) commented Nov 17, 2024:

The loadbalancer server list is a bit of a mess. Its behavior has been tinkered with a lot over the last year, but it is still hard to reason about, and this has caused a spate of issues.

From a code perspective, the loadbalancer state is directly accessed by a number of functions that all poke at various index variables, current and default server name variables, a list of server addresses, a second (randomized) list of server addresses, and a map of addresses to structs that hold per-server state:

serviceName string
configFile string
localAddress string
localServerURL string
defaultServerAddress string
ServerURL string
ServerAddresses []string
randomServers []string
servers map[string]*server
currentServerAddress string
nextServerIndex int
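
For illustration only, here is a minimal sketch (not the actual k3s code; all names below are hypothetical) of how that scattered state could be collapsed into a single lock-guarded structure:

package loadbalancer // hypothetical package name, for illustration only

import "sync"

// serverEntry and serverList are hypothetical types, not the real k3s ones.
type serverEntry struct {
	address string // "host:port" of the server
	healthy bool   // result of the most recent health check
}

// serverList owns all mutable load-balancer state behind one mutex,
// replacing the parallel slices, maps, and index variables listed above.
type serverList struct {
	mutex   sync.Mutex
	servers []*serverEntry // single ordered list; no separate shuffled copy
	current int            // index of the server new connections should try first
}

The point of the sketch is just that a single owner for the list makes it much easier to reason about which code path can mutate what.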

The DialContext function is called whenever a new connection comes in. It holds a read lock while iterating (possibly twice) over the random server list, while servers may be added or removed at any time. The code is very hard to read and understand, given the number of variables involved:

var allChecksFailed bool
startIndex := lb.nextServerIndex
for {
	targetServer := lb.currentServerAddress

	server := lb.servers[targetServer]
	if server == nil || targetServer == "" {
		logrus.Debugf("Nil server for load balancer %s: %s", lb.serviceName, targetServer)
	} else if allChecksFailed || server.healthCheck() {
		dialTime := time.Now()
		conn, err := server.dialContext(ctx, network, targetServer)
		if err == nil {
			return conn, nil
		}
		logrus.Debugf("Dial error from load balancer %s after %s: %s", lb.serviceName, time.Now().Sub(dialTime), err)
		// Don't close connections to the failed server if we're retrying with health checks ignored.
		// We don't want to disrupt active connections if it is unlikely they will have anywhere to go.
		if !allChecksFailed {
			defer server.closeAll()
		}
	} else {
		logrus.Debugf("Dial health check failed for %s", targetServer)
	}

	newServer, err := lb.nextServer(targetServer)
	if err != nil {
		return nil, err
	}
	if targetServer != newServer {
		logrus.Debugf("Failed over to new server for load balancer %s: %s -> %s", lb.serviceName, targetServer, newServer)
	}
	if ctx.Err() != nil {
		return nil, ctx.Err()
	}

	maxIndex := len(lb.randomServers)
	if startIndex > maxIndex {
		startIndex = maxIndex
	}
	if lb.nextServerIndex == startIndex {
		if allChecksFailed {
			return nil, errors.New("all servers failed")
		}
		logrus.Debugf("Health checks for all servers in load balancer %s have failed: retrying with health checks ignored", lb.serviceName)
		allChecksFailed = true
	}
}

We should simplify the load balancer so that it behaves more reliably and is easier to understand and explain.
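
As a rough sketch of that direction (again hypothetical, building on the serverList type above rather than the current implementation), the dial path could become a double pass over one ordered list: try healthy servers first, then ignore health only after every healthy candidate has failed:

import (
	"context"
	"errors"
	"net"
)

// dial walks the list once preferring healthy servers, then once more
// ignoring health, so the "all checks failed" fallback is just a second pass
// instead of index arithmetic against a shuffled slice.
func (sl *serverList) dial(ctx context.Context, network string) (net.Conn, error) {
	// Snapshot the list under the lock so concurrent add/remove is safe.
	sl.mutex.Lock()
	candidates := make([]*serverEntry, len(sl.servers))
	copy(candidates, sl.servers)
	sl.mutex.Unlock()

	var dialer net.Dialer
	for _, ignoreHealth := range []bool{false, true} {
		for _, s := range candidates {
			if !s.healthy && !ignoreHealth {
				continue
			}
			conn, err := dialer.DialContext(ctx, network, s.address)
			if err == nil {
				return conn, nil
			}
			if ctx.Err() != nil {
				return nil, ctx.Err()
			}
		}
	}
	return nil, errors.New("all servers failed")
}

Whatever the final shape, the key property to preserve from the current code is the two-phase behavior: prefer servers with passing health checks, and only fall back to failed ones when nothing else is reachable.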

ShylajaDevadiga (Contributor) commented Dec 11, 2024:

Tests to cover

  1. Create a cluster with 3 etcd, 2 control-plane, and 1 agent node. Validate load-balancer and ingress functionality
  2. Restart k3s on all nodes
  3. Stop and start one node
  4. Restart the control-plane nodes in reverse order
  5. Reboot all nodes
  6. Delete one node and validate functionality
  7. Delete a node, add a new node, and validate functionality
  8. Stop 2 nodes and check the logs on the agent node

ShylajaDevadiga (Contributor) commented:

Validated the above scenarios using k3s version v1.31.3+k3s-c88e217f.

k3s -v
k3s version v1.31.3+k3s-c88e217f (c88e217f)
go version go1.22.8

> kubectl get node
NAME                                          STATUS   ROLES                       AGE   VERSION
ip-172-31-12-199.us-east-2.compute.internal   Ready    control-plane,etcd,master   16h   v1.31.3+k3s-c88e217f
ip-172-31-13-86.us-east-2.compute.internal    Ready    control-plane,etcd,master   16h   v1.31.3+k3s-c88e217f
ip-172-31-15-89.us-east-2.compute.internal    Ready    <none>                      16h   v1.31.3+k3s-c88e217f
ip-172-31-5-210.us-east-2.compute.internal    Ready    control-plane,etcd,master   16h   v1.31.3+k3s-c88e217f

Terminated and then deleted a node

> kubectl get nodes
NAME                                          STATUS     ROLES                       AGE   VERSION
ip-172-31-12-199.us-east-2.compute.internal   Ready      control-plane,etcd,master   18h   v1.31.3+k3s-c88e217f
ip-172-31-13-86.us-east-2.compute.internal    Ready      control-plane,etcd,master   18h   v1.31.3+k3s-c88e217f
ip-172-31-15-89.us-east-2.compute.internal    Ready      <none>                      18h   v1.31.3+k3s-c88e217f
ip-172-31-5-210.us-east-2.compute.internal    NotReady   control-plane,etcd,master   18h   v1.31.3+k3s-c88e217f
> kubectl delete node ip-172-31-5-210.us-east-2.compute.internal
node "ip-172-31-5-210.us-east-2.compute.internal" deleted
ec2-user@ip-172-31-12-199:~> kubectl get nodes
NAME                                          STATUS   ROLES                       AGE   VERSION
ip-172-31-12-199.us-east-2.compute.internal   Ready    control-plane,etcd,master   23h   v1.31.3+k3s-c88e217f
ip-172-31-13-86.us-east-2.compute.internal    Ready    control-plane,etcd,master   23h   v1.31.3+k3s-c88e217f
ip-172-31-15-89.us-east-2.compute.internal    Ready    <none>                      23h   v1.31.3+k3s-c88e217f

Added a new node

> kubectl get nodes
NAME                                          STATUS   ROLES                       AGE   VERSION
ip-172-31-12-199.us-east-2.compute.internal   Ready    control-plane,etcd,master   23h   v1.31.3+k3s-c88e217f
ip-172-31-13-86.us-east-2.compute.internal    Ready    control-plane,etcd,master   23h   v1.31.3+k3s-c88e217f
ip-172-31-15-89.us-east-2.compute.internal    Ready    <none>                      23h   v1.31.3+k3s-c88e217f
ip-172-31-3-142.us-east-2.compute.internal    Ready    control-plane,etcd,master   59m   v1.31.3+k3s-c88e217f

Agent logs

Dec 13 00:59:23 ip-172-31-15-89 k3s[1585]: time="2024-12-13T00:59:23Z" level=info msg="Updated load balancer k3s-agent-load-balancer server addresses -> [1.1.1.8:6443 2.2.2.215:6443 3.3.3.44:6443] [default: 1.1.1.8:6443]"
Dec 13 00:59:23 ip-172-31-15-89 k3s[1585]: time="2024-12-13T00:59:23Z" level=info msg="Stopped tunnel to 3.137.211.32:6443"
Dec 13 00:59:40 ip-172-31-15-89 k3s[1585]: I1213 00:59:40.959724    1585 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.42.2.0/24]
Dec 13 00:59:40 ip-172-31-15-89 k3s[1585]: I1213 00:59:40.960365    1585 subnet.go:152] Batch elem [0] is { lease.Event{Type:1, Lease:lease.Lease{EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net{IP:0xa2a0200, PrefixLen:0x18}, IPv6Subnet:ip.IP6Net{IP:(*ip.IP6)(nil), PrefixLen:0x0}, Attrs:lease.LeaseAttrs{PublicIP:0xac1f038e, PublicIPv6:(*ip.IP6)(nil), BackendType:"vxlan", BackendData:json.RawMessage{0x7b, 0x22, 0x56, 0x4e, 0x49, 0x22, 0x3a, 0x31, 0x2c, 0x22, 0x56, 0x74, 0x65, 0x70, 0x4d, 0x41, 0x43, 0x22, 0x3a, 0x22, 0x38, 0x61, 0x3a, 0x36, 0x63, 0x3a, 0x37, 0x37, 0x3a, 0x31, 0x33, 0x3a, 0x65, 0x34, 0x3a, 0x62, 0x39, 0x22, 0x7d}, BackendV6Data:json.RawMessage(nil)}, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0}} }
Dec 13 00:59:40 ip-172-31-15-89 k3s[1585]: I1213 00:59:40.960452    1585 vxlan_network.go:100] Received Subnet Event with VxLan: BackendType: vxlan, PublicIP: 172.31.3.142, PublicIPv6: (nil), BackendData: {"VNI":1,"VtepMAC":"8a:6c:77:13:e4:b9"}, BackendV6Data: (nil)
Dec 13 00:59:50 ip-172-31-15-89 k3s[1585]: time="2024-12-13T00:59:50Z" level=info msg="Removing server from load balancer k3s-agent-load-balancer: 3.3.3.44:6443"
Dec 13 00:59:50 ip-172-31-15-89 k3s[1585]: time="2024-12-13T00:59:50Z" level=info msg="Updated load balancer k3s-agent-load-balancer server addresses -> [1.1.1.8:6443 2.2.2.215:6443] [default: 1.1.1.8:6443]"
Dec 13 00:59:50 ip-172-31-15-89 k3s[1585]: time="2024-12-13T00:59:50Z" level=info msg="Stopped tunnel to 3.3.3.44:6443"
Dec 13 00:59:50 ip-172-31-15-89 k3s[1585]: time="2024-12-13T00:59:50Z" level=info msg="Proxy done" err="context canceled" url="wss://3.3.3.44:6443/v1-k3s/connect"
Dec 13 00:59:55 ip-172-31-15-89 k3s[1585]: time="2024-12-13T00:59:55Z" level=info msg="Adding server to load balancer k3s-agent-load-balancer: 4.4.4.32:6443"
Dec 13 00:59:55 ip-172-31-15-89 k3s[1585]: time="2024-12-13T00:59:55Z" level=info msg="Updated load balancer k3s-agent-load-balancer server addresses -> [1.1.1.8:6443 2.2.2.215:6443 4.4.4.32:6443] [default: 1.1.1.8:6443]"
Dec 13 00:59:55 ip-172-31-15-89 k3s[1585]: time="2024-12-13T00:59:55Z" level=info msg="Started tunnel to 4.4.4.32:6443"

github-project-automation bot moved this from To Test to Done Issue in K3s Development on Dec 13, 2024