
[BUG] LoadBalancer Service TCP Connection Fails in Cilium Kube-Proxy Replacement Mode #4739

MuNeNICK opened this issue Nov 17, 2024 · 2 comments
MuNeNICK commented Nov 17, 2024

Kube-OVN Version

v1.12.28

Cilium Version

1.16.3

Kubernetes Version

Client Version: v1.31.2+k3s1
Kustomize Version: v5.4.2
Server Version: v1.31.2+k3s1

Operation-system/Kernel Version

"Ubuntu 22.04.5 LTS"
5.15.0-125-generic

Description

Hello.

I want to use Kube-OVN's LoadBalancer in a Cilium Kube-Proxy Replacement environment.

The LoadBalancer-type Service I created, shown below, is not working:

ubuntu@ubuntu:~/test-svc$ kubectl get pod -o wide
NAME                             READY   STATUS    RESTARTS   AGE     IP           NODE     NOMINATED NODE   READINESS GATES
lb-svc-podinfo-576656f7b-85hvd   1/1     Running   0          3m44s   10.16.0.11   ubuntu   <none>           <none>
podinfo-6bd97dfb99-rj4g6         1/1     Running   0          3m55s   10.16.0.9    ubuntu   <none>           <none>
podinfo-6bd97dfb99-t6rxd         1/1     Running   0          3m55s   10.16.0.10   ubuntu   <none>           <none>
ubuntu@ubuntu:~/test-svc$ kubectl get svc
NAME         TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)          AGE
kubernetes   ClusterIP      10.43.0.1     <none>         443/TCP          30m
podinfo      LoadBalancer   10.43.5.219   192.168.0.62   9898:31442/TCP   3m49s
ubuntu@ubuntu:~/test-svc$

Ping works, but curl doesn't:

C:\Users\mune0>ping 192.168.0.61

Pinging 192.168.0.61 with 32 bytes of data:
Reply from 192.168.0.61: bytes=32 time<1ms TTL=64
Reply from 192.168.0.61: bytes=32 time<1ms TTL=64
Reply from 192.168.0.61: bytes=32 time<1ms TTL=64
Reply from 192.168.0.61: bytes=32 time<1ms TTL=64

Ping statistics for 192.168.0.61:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 0ms, Average = 0ms

C:\Users\mune0>curl 192.168.0.61:9898
curl: (28) Failed to connect to 192.168.0.61 port 9898 after 21003 ms: Could not connect to server

Here are the tcpdump results on the Node:

ubuntu@ubuntu:~/test-svc$ sudo tcpdump 'port 9898'
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
16:41:09.992512 IP 192.168.0.126.65498 > 192.168.0.62.9898: Flags [S], seq 3667285225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
16:41:09.992563 IP 192.168.0.126.65498 > 10.16.0.9.9898: Flags [S], seq 3667285225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
16:41:10.992850 IP 192.168.0.126.65498 > 192.168.0.62.9898: Flags [S], seq 3667285225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
16:41:10.992916 IP 192.168.0.126.65498 > 10.16.0.9.9898: Flags [S], seq 3667285225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
16:41:12.993139 IP 192.168.0.126.65498 > 192.168.0.62.9898: Flags [S], seq 3667285225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
16:41:12.993209 IP 192.168.0.126.65498 > 10.16.0.9.9898: Flags [S], seq 3667285225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
16:41:16.993475 IP 192.168.0.126.65498 > 192.168.0.62.9898: Flags [S], seq 3667285225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
16:41:16.993545 IP 192.168.0.126.65498 > 10.16.0.9.9898: Flags [S], seq 3667285225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
16:41:24.993406 IP 192.168.0.126.65498 > 192.168.0.62.9898: Flags [S], seq 3667285225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
16:41:24.993469 IP 192.168.0.126.65498 > 10.16.0.9.9898: Flags [S], seq 3667285225, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
^C
10 packets captured
10 packets received by filter
0 packets dropped by kernel
ubuntu@ubuntu:~/test-svc$
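
To narrow down where the handshake breaks, it may also help to capture inside the backend pod's network namespace, to see whether the SYN ever reaches the pod and whether a SYN-ACK is generated. A sketch using the kubectl-ko plugin (pod names taken from the output above; the plugin must be installed, and option handling may differ between Kube-OVN versions):

# capture on the backend pod (name from "kubectl get pod" above)
kubectl ko tcpdump default/podinfo-6bd97dfb99-rj4g6 -nn tcp port 9898
# capture on the generated lb-svc pod that carries the external address
kubectl ko tcpdump default/lb-svc-podinfo-576656f7b-85hvd -nn tcp port 9898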

Here is the cilium-dbg service list output:

ubuntu@ubuntu:~/test-svc$ kubectl -n kube-system exec ds/cilium -- cilium-dbg service list
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
ID   Frontend              Service Type   Backend                            
1    10.43.0.1:443         ClusterIP      1 => 192.168.0.28:6443 (active)    
2    10.43.168.220:443     ClusterIP      1 => 192.168.0.28:4244 (active)    
3    10.43.0.10:53         ClusterIP      1 => 10.16.0.2:53 (active)         
4    10.43.0.10:9153       ClusterIP      1 => 10.16.0.2:9153 (active)       
5    10.43.228.178:6642    ClusterIP      1 => 192.168.0.28:6642 (active)    
6    10.43.114.93:10660    ClusterIP      1 => 192.168.0.28:10660 (active)   
7    10.43.180.141:6641    ClusterIP      1 => 192.168.0.28:6641 (active)    
8    10.43.140.229:6643    ClusterIP      1 => 192.168.0.28:6643 (active)    
9    10.43.217.128:10665   ClusterIP      1 => 192.168.0.28:10665 (active)   
10   10.43.32.223:10661    ClusterIP      1 => 192.168.0.28:10661 (active)   
11   10.43.182.198:8080    ClusterIP      1 => 10.16.0.3:8080 (active)       
12   10.43.159.239:9898    ClusterIP      1 => 10.0.1.3:9898 (active)        
                                          2 => 10.0.1.2:9898 (active)        
13   192.168.0.28:31609    NodePort       1 => 10.0.1.3:9898 (active)        
                                          2 => 10.0.1.2:9898 (active)        
14   0.0.0.0:31609         NodePort       1 => 10.0.1.3:9898 (active)        
                                          2 => 10.0.1.2:9898 (active)        
15   192.168.0.61:9898     LoadBalancer   1 => 10.0.1.3:9898 (active)        
                                          2 => 10.0.1.2:9898 (active)        
19   10.43.92.92:80        ClusterIP      1 => 10.16.0.7:4245 (active)       
20   10.43.112.220:80      ClusterIP      1 => 10.16.0.8:8081 (active)       
21   10.43.5.219:9898      ClusterIP      1 => 10.16.0.10:9898 (active)      
                                          2 => 10.16.0.9:9898 (active)       
22   192.168.0.28:31442    NodePort       1 => 10.16.0.10:9898 (active)      
                                          2 => 10.16.0.9:9898 (active)       
23   0.0.0.0:31442         NodePort       1 => 10.16.0.10:9898 (active)      
                                          2 => 10.16.0.9:9898 (active)       
24   192.168.0.62:9898     LoadBalancer   1 => 10.16.0.10:9898 (active)      
                                          2 => 10.16.0.9:9898 (active)                                   
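
Since the service map itself looks correct, the next step could be to check whether Cilium's datapath is dropping or mistranslating the packets. A sketch using commands available in the cilium-agent container (frontend address taken from entry 24 above):

# watch for datapath drops while reproducing the failing curl
kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg monitor --type drop
# inspect the BPF load-balancer map for the LoadBalancer frontend
kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg bpf lb list | grep 192.168.0.62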

Steps To Reproduce

The environment setup commands are listed below.

Environment Setup Commands

k3s

export INSTALL_K3S_VERSION=v1.31.2+k3s1
curl -sfL https://get.k3s.io | sh -s - \
  --disable=servicelb \
  --disable=traefik \
  --disable=metrics-server \
  --flannel-backend=none \
  --disable-kube-proxy \
  --disable-network-policy \
  --disable-helm-controller \
  --disable-cloud-controller \
  --write-kubeconfig-mode 644 \
  --write-kubeconfig ~/.kube/config 
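
Since kube-proxy is disabled here, a quick sanity check on the node is that no KUBE-SVC iptables chains exist (a leftover kube-proxy ruleset could interfere with Cilium); this is only an assumption about what a clean node should look like:

# should print nothing when kube-proxy is really disabled
sudo iptables -t nat -S | grep KUBE-SVC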

cilium

cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: cni-configuration
  namespace: kube-system
data:
  cni-config: |-
    {
      "name": "generic-veth",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "kube-ovn",
          "server_socket": "/run/openvswitch/kube-ovn-daemon.sock",
          "ipam": {
              "type": "kube-ovn",
              "server_socket": "/run/openvswitch/kube-ovn-daemon.sock"
          }
        },
        {
          "type": "portmap",
          "snat": true,
          "capabilities": {"portMappings": true}
        },
        {
          "type": "cilium-cni"
        }
      ]
    }
EOF
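
To confirm that this chained configuration is what actually ends up on the node, the generated CNI config can be inspected once the agents are running (the path below assumes the k3s default CNI directory):

# k3s keeps its CNI configuration here by default
ls /var/lib/rancher/k3s/agent/etc/cni/net.d/
cat /var/lib/rancher/k3s/agent/etc/cni/net.d/*.conflist
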
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium \
    --namespace kube-system \
    --set operator.replicas=1 \
    --set k8sServiceHost=${SERVER_IP} \
    --set k8sServicePort=6443 \
    --set kubeProxyReplacement=true \
    --set socketLB.enabled=true \
    --set nodePort.enabled=true \
    --set externalIPs.enabled=true \
    --set hostPort.enabled=false \
    --set routingMode=native \
    --set sessionAffinity=true \
    --set enableIPv4Masquerade=false \
    --set enableIPv6Masquerade=false \
    --set hubble.enabled=true \
    --set sctp.enabled=true \
    --set ipv4.enabled=true \
    --set ipv6.enabled=false \
    --set ipam.mode=cluster-pool \
    --set-json ipam.operator.clusterPoolIPv4PodCIDRList='["100.65.0.0/16"]' \
    --set-json ipam.operator.clusterPoolIPv6PodCIDRList='["fd00:100:65::/112"]' \
    --set cni.chainingMode=generic-veth \
    --set cni.chainingTarget=kube-ovn \
    --set cni.customConf=true \
    --set cni.configMap=cni-configuration
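
After the install it may be worth confirming that kube-proxy replacement and the socket LB are actually active, for example (container name as reported in the exec output further up):

# "KubeProxyReplacement: True" and the enabled LB features should show up here
kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg status --verbose | grep -A5 KubeProxyReplacement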

kube-ovn

kubectl label node -lbeta.kubernetes.io/os=linux kubernetes.io/os=linux --overwrite
kubectl label node -lnode-role.kubernetes.io/control-plane kube-ovn/role=master --overwrite
kubectl label node -lovn.kubernetes.io/ovs_dp_type!=userspace ovn.kubernetes.io/ovs_dp_type=kernel --overwrite

helm repo add kubeovn https://kubeovn.github.io/kube-ovn/
helm install kube-ovn kubeovn/kube-ovn \
  --set MASTER_NODES=${SERVER_IP} \
  --set func.ENABLE_NP=false \
  --set func.ENABLE_LB_SVC=true \
  --set func.ENABLE_TPROXY=true \
  --set cni_conf.CNI_CONFIG_PRIORITY=10
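
A quick check that the LB-SVC feature flag reached the controller; this assumes the Helm value surfaces as a kube-ovn-controller argument, so the exact flag name may vary between versions:

# look for an enable-lb-svc style argument on the controller
kubectl -n kube-system get deploy kube-ovn-controller -o yaml | grep -i lb-svc
kubectl -n kube-system get pods -l app=kube-ovn-controller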

multus

kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/master/deployments/multus-daemonset.yml

external-network.yaml

apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: ovn-vpc-external-network
spec:
  protocol: IPv4
  provider: ovn-vpc-external-network.kube-system
  cidrBlock: 192.168.0.0/24
  gateway: 192.168.0.1 # IP address of the physical gateway
  excludeIps:
  - 192.168.0.1..192.168.0.60
  - 192.168.0.80..192.168.0.254
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ovn-vpc-external-network
  namespace: kube-system
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "macvlan",
      "master": "eth0",
      "mode": "bridge",
      "ipam": {
        "type": "kube-ovn",
        "server_socket": "/run/openvswitch/kube-ovn-daemon.sock",
        "provider": "ovn-vpc-external-network.kube-system"
      }
    }' 
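
Presumably the manifest is applied with kubectl, after which the external subnet can be verified through the Kube-OVN Subnet CRD:

kubectl apply -f external-network.yaml
# Subnet is a cluster-scoped Kube-OVN CRD
kubectl get subnet ovn-vpc-external-network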

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo
  # namespace: vpc1
  labels:
    app: podinfo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: podinfo
  template:
    metadata:
      labels:
        app: podinfo
    spec:
      containers:
      - name: podinfo
        image: ghcr.io/stefanprodan/podinfo:6.1.5
        ports:
        - containerPort: 9898
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace

svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: podinfo
  # namespace: vpc1
  annotations:
     lb-svc-attachment.kube-system.kubernetes.io/logical_switch: ovn-vpc-external-network
     ovn.kubernetes.io/attachmentprovider: ovn-vpc-external-network.kube-system
  labels:
    app: podinfo
spec:
  type: LoadBalancer
  ports:
  - port: 9898
    targetPort: 9898
    protocol: TCP
  selector:
    app: podinfo
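
With func.ENABLE_LB_SVC=true, Kube-OVN creates the lb-svc-podinfo pod seen in the "kubectl get pod" output to carry the external address and forward traffic toward the Service. Assuming the image ships iproute2 and iptables, its attached interface and NAT rules can be inspected directly (pod name taken from the output above):

kubectl apply -f deployment.yaml -f svc.yaml
# check that the pod picked up an address from the non-excluded 192.168.0.61-79 range
kubectl exec lb-svc-podinfo-576656f7b-85hvd -- ip addr
# inspect whatever NAT rules the lb-svc pod has installed for the external IP
kubectl exec lb-svc-podinfo-576656f7b-85hvd -- iptables -t nat -S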

Current Behavior

The current system behavior shows several issues:

  1. ICMP (ping) works to the LoadBalancer IP (192.168.0.62), but the TCP connection (curl) fails:
  • Ping to 192.168.0.62 succeeds
  • Curl to 192.168.0.62:9898 times out after 21 seconds
  2. From the tcpdump analysis:
  • SYN packets are sent to the LoadBalancer IP (192.168.0.62:9898) and are also forwarded to the backend pod IP (10.16.0.9:9898)
  • No responses (SYN-ACK) are observed from either destination
  3. The Cilium service list shows:
  • The LoadBalancer service is properly configured (ID 24)
  • The frontend IP is 192.168.0.62:9898
  • Two active backends: 10.16.0.10:9898 and 10.16.0.9:9898
  • NodePort services are also configured on port 31442

Expected Behavior

  1. The LoadBalancer Service should be fully functional:
  • TCP traffic to the LoadBalancer IP (192.168.0.62:9898) should successfully reach the backend pods
  • Curl requests to 192.168.0.62:9898 should receive responses from the podinfo service
  • The Service should properly load balance traffic between both backend pods (10.16.0.9 and 10.16.0.10)
  2. The network flow should work as follows:
  • The client sends a request to the LoadBalancer IP
  • The request is properly routed through Cilium's load balancing
  • The backend pod receives the traffic and responds
  • Response packets successfully return to the client
  3. Both Layer 3 (IP) and Layer 4 (TCP) connectivity should work:
  • Ping (ICMP) continues to work as it currently does
  • TCP connections should establish successfully
  • The full TCP handshake (SYN, SYN-ACK, ACK) should complete
  4. The LoadBalancer service should provide the same functionality as it would with the default kube-proxy, despite running in Cilium's kube-proxy replacement mode.
dosubot bot commented Nov 17, 2024

Hey there, @MuNeNICK! I'm here to help you with any issues or questions you have. Let's work together to solve this bug!

The issue you're experiencing with the LoadBalancer service in a Cilium Kube-Proxy Replacement environment might be related to the integration between Kube-OVN and Cilium, especially in handling TCP connections. Here are some steps and considerations to help troubleshoot and potentially resolve the problem:

  1. Check Kube-OVN and Cilium Compatibility: Ensure that the versions of Kube-OVN and Cilium you are using are compatible, especially with the kube-proxy replacement feature. There might be specific configurations or annotations required for them to work together, which are not detailed in the available documentation.

  2. Review LoadBalancer Configuration: Double-check the annotations and configurations in your svc.yaml to ensure they are correctly set for the environment. The annotations related to logical switch and attachment provider should match the network setup.

  3. Inspect Network Policies: If there are any network policies applied, ensure they allow traffic between the LoadBalancer IP and the backend pods. Network policies might inadvertently block the TCP traffic.

  4. Use Kube-OVN Debugging Tools: Utilize the debugging tools provided by Kube-OVN to trace the network flow and identify where the packets might be getting dropped. You can use kubectl-ko commands to gather more insights (a sketch follows after this list):

    • kubectl ko trace <pod> <args...> to trace packets through the network.
    • kubectl ko log kube-ovn all to collect logs from all Kube-OVN components [1][2].
  5. Check Cilium Logs and Debugging: Use Cilium's debugging tools to ensure that the service is correctly configured and that there are no issues with the load balancing or packet forwarding.

  6. Verify DNAT Rules: Ensure that the DNAT rules are correctly set up in Kube-OVN for the LoadBalancer service. The rules should properly translate the external IP to the internal pod IPs [3].
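
As a sketch of the kubectl ko trace suggestion in step 4, one could trace a TCP packet from the backend pod back toward the external client to check the return path (pod name, client IP, and client source port taken from the logs above; the trace syntax may differ between Kube-OVN versions):

# simulate the reply path: backend pod -> client address and source port seen in the capture
kubectl ko trace default/podinfo-6bd97dfb99-rj4g6 192.168.0.126 tcp 65498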

If these steps do not resolve the issue, further investigation into the specific integration between Kube-OVN and Cilium might be necessary, as there could be undocumented compatibility issues or additional configuration requirements.


MuNeNICK (Author) commented

The LoadBalancer worked correctly in the integrated Kube-OVN and Cilium environment when not using Kube-Proxy Replacement mode, as shown in the following procedure.

https://memos.munenick.me/m/fvXeKkZqADtD9HxSctBMDN#comments
