add leader election retry #66

Merged on Feb 7, 2024 with 37 commits
Changes from 16 commits
Commits
75b2fd8  fix leader election retry (samos123, Feb 3, 2024)
9a049f1  increase sleep from 20 to 30 (samos123, Feb 3, 2024)
d8fdf7f  add single replica e2e tests (samos123, Feb 3, 2024)
23f6904  add more descriptive name to e2e replica test (samos123, Feb 3, 2024)
fd8725a  stream all logs of lingo (samos123, Feb 3, 2024)
e365dd7  add retry to leader election process (samos123, Feb 3, 2024)
4b2acd8  increase apiserver unavailability from 30s to 60s (samos123, Feb 3, 2024)
c149556  recreate context if context deadline exceeded (samos123, Feb 4, 2024)
1dbb678  ensure apiserver outage is 2 minutes (samos123, Feb 4, 2024)
ba46c40  add log to indicate context deadline exceeded (samos123, Feb 4, 2024)
3f1d48d  kubectl sometimes returns errors when apiserver went away for too long (samos123, Feb 4, 2024)
7339da4  make wait for backoff blocking (samos123, Feb 4, 2024)
228920b  fix logging in e2e test after apiserver went down (samos123, Feb 4, 2024)
2405e3b  remove unneeded check for context deadline exceeds (samos123, Feb 4, 2024)
83c81ab  wait for apiserver to be ready (samos123, Feb 4, 2024)
c711ae6  Add more logs in e2e test (samos123, Feb 4, 2024)
b8fc6b7  address PR comments (samos123, Feb 4, 2024)
8236150  simplify tests and run in parallel (samos123, Feb 4, 2024)
c63fd0e  improve GHA job names (samos123, Feb 5, 2024)
e7cedfa  simplify leader election retry (samos123, Feb 5, 2024)
19a89af  increase wait time for scale back to 0 in e2e (samos123, Feb 5, 2024)
6ad6446  fix #67 flapping scale from 0 to 1 to 0 to 1 (samos123, Feb 5, 2024)
38ff2ad  add hostname to leader log messages (samos123, Feb 5, 2024)
d4a3947  maybe this fixes #67 (samos123, Feb 5, 2024)
b056bce  fix PR comment, thanks Alex! (samos123, Feb 6, 2024)
ba9d1e0  improve string formatting (samos123, Feb 6, 2024)
985831e  sleep for 20 sec after apiserver outage (samos123, Feb 6, 2024)
610ba31  wait wasn't long enough (samos123, Feb 6, 2024)
b7bd4cf  revert fix for #67 because it breaks scale down to 0 (samos123, Feb 6, 2024)
8bea0bf  fix #67 only the leader should scale (samos123, Feb 6, 2024)
9552e9e  simplify fix for #67 and unit tests (samos123, Feb 6, 2024)
d720074  print lingo logs of all replicas on failure (samos123, Feb 6, 2024)
fae4bab  in some cases state is incorrect so just scale to desired scale (samos123, Feb 6, 2024)
af888e6  Revert "in some cases state is incorrect so just scale to desired scale" (samos123, Feb 6, 2024)
89b8a6e  Revert "simplify fix for #67 and unit tests" (samos123, Feb 6, 2024)
3f6bab5  Revert "fix #67 only the leader should scale" (samos123, Feb 6, 2024)
7310411  remove broken test (samos123, Feb 7, 2024)
7 changes: 5 additions & 2 deletions .github/workflows/tests.yml
@@ -27,8 +27,11 @@ jobs:
run: make test-integration

e2e:
strategy:
matrix:
replicas: ["1", "3"]
runs-on: ubuntu-latest

name: E2E kind tests Lingo.replicas=${{ matrix.replicas }}
steps:
- name: Checkout code
uses: actions/checkout@v2
@@ -49,4 +52,4 @@ jobs:
sudo mv skaffold /usr/local/bin

- name: Run e2e tests
run: make test-e2e
run: REPLICAS=${{ matrix.replicas }} make test-e2e
21 changes: 20 additions & 1 deletion pkg/leader/election.go
@@ -10,6 +10,7 @@ import (
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/tools/leaderelection"
"k8s.io/client-go/tools/leaderelection/resourcelock"
"k8s.io/client-go/util/flowcontrol"
)

func NewElection(clientset kubernetes.Interface, id, namespace string) *Election {
@@ -63,5 +64,23 @@ type Election struct {
}

func (le *Election) Start(ctx context.Context) {
	leaderelection.RunOrDie(ctx, le.config)
	backoff := flowcontrol.NewBackOff(1*time.Second, 15*time.Second)
	const backoffID = "lingo-leader-election"
	retryCount := 0
	for {
		select {
		case <-ctx.Done():
			return
		default:
			if retryCount > 0 {
				backoff.Next(backoffID, backoff.Clock.Now())

Contributor: Neat, I didn't know about this library.

				delay := backoff.Get(backoffID)
				log.Printf("Leader election failed, retrying in %v. RetryCount: %v", delay, retryCount+1)
				<-time.After(delay)

Contributor: time.After does not respect context cancellation. I put together what I think would be a better way to go about this loop as an example: https://github.com/substratusai/lingo/compare/example-leader-election-loop ... I played around with this implementation and it exits immediately on context cancellation while still accomplishing exponential backoff (go run ./hack/retry ... ctrl-c). This is done by moving time.After into the select cases.

Contributor Author (samos123): Originally I had this in a select, but I removed it to simplify things since I thought the higher-level select was good enough. I will move it back in; that makes sense. Thanks for validating this with a quick experiment!

			}
			log.Printf("Starting leader election process. RetryCount: %v", retryCount+1)
			leaderelection.RunOrDie(ctx, le.config)

Contributor: Is the idea that RunOrDie eventually exits if it loses connection to the API server?

Contributor Author (samos123): That's exactly what seems to happen. This Kong PR has more details: Kong/kubernetes-ingress-controller#578

Contributor Author (samos123, Feb 4, 2024): I confirmed this by checking the logs and seeing how often it had to retry (re-run RunOrDie) while the apiserver was down for ~2 minutes.

			retryCount++
		}
	}
}
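
Following the review discussion above (time.After not honoring context cancellation, and RunOrDie returning once the lease can no longer be renewed), here is a minimal sketch of how the retry loop could look with the backoff wait moved into the select. This is an illustration, not the merged code: the package name and the pared-down Election struct are assumptions, and the real implementation lives in pkg/leader/election.go.

// Sketch only: package name and struct fields are assumed, not taken from the repo.
package leader

import (
	"context"
	"log"
	"time"

	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/util/flowcontrol"
)

// Election is reduced to the single field this sketch needs.
type Election struct {
	config leaderelection.LeaderElectionConfig
}

func (le *Election) Start(ctx context.Context) {
	// Exponential backoff between attempts: start at 1s, cap at 15s.
	backoff := flowcontrol.NewBackOff(1*time.Second, 15*time.Second)
	const backoffID = "lingo-leader-election"

	for retryCount := 0; ; retryCount++ {
		if retryCount > 0 {
			backoff.Next(backoffID, backoff.Clock.Now())
			delay := backoff.Get(backoffID)
			log.Printf("Leader election failed, retrying in %v. RetryCount: %v", delay, retryCount+1)
			// Waiting inside the select means cancellation is honored immediately,
			// instead of blocking on a bare <-time.After(delay) until the delay elapses.
			select {
			case <-ctx.Done():
				return
			case <-time.After(delay):
			}
		}
		if ctx.Err() != nil {
			return
		}
		log.Printf("Starting leader election process. RetryCount: %v", retryCount+1)
		// RunOrDie blocks while this replica takes part in the election and
		// returns once leadership is lost or the lease cannot be renewed
		// (e.g. during an apiserver outage), which triggers the next retry.
		leaderelection.RunOrDie(ctx, le.config)
	}
}

This mirrors what the reviewer's example branch describes: the wait becomes a select case, so context cancellation (e.g. ctrl-c in the hack/retry experiment) exits immediately while exponential backoff is preserved.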
24 changes: 21 additions & 3 deletions tests/e2e/test.sh
@@ -6,6 +6,7 @@ set -xe
HOST=127.0.0.1
PORT=30080
BASE_URL="http://$HOST:$PORT/v1"
REPLICAS=${REPLICAS:-3}


if kind get clusters | grep -q substratus-test; then
@@ -42,6 +43,9 @@ if ! kubectl get deployment lingo; then
skaffold run
fi

kubectl patch deployment lingo --patch "{\"spec\": {\"replicas\": $REPLICAS}}"

kubectl logs -f deployment/lingo &

kubectl wait --for=condition=available --timeout=30s deployment/lingo

@@ -89,18 +93,32 @@ if [ "$replicas" -eq 1 ]; then
exit 1
fi

echo "Waiting for deployment to scale down back to 0 within 2 minutes"
# Verify that leader election works by forcing a 120 second apiserver outage
KIND_NODE=$(kind get nodes --name=substratus-test)
docker exec ${KIND_NODE} iptables -I INPUT -p tcp --dport 6443 -j DROP

Contributor: Is 6443 the port of the API server running on kind?

Contributor Author (samos123): Yep, and credit to ChatGPT for coming up with the high-level idea of simply blocking the traffic:

k describe svc kubernetes 
Name:              kubernetes
Namespace:         default
Labels:            component=apiserver
                   provider=kubernetes
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.96.0.1
IPs:               10.96.0.1
Port:              https  443/TCP
TargetPort:        6443/TCP
Endpoints:         172.18.0.2:6443
Session Affinity:  None
Events:            <none>

sleep 120
docker exec ${KIND_NODE} iptables -D INPUT -p tcp --dport 6443 -j DROP

until kubectl get pods; do

Contributor: What is "until kubectl get pods; do" doing?

Contributor Author (samos123): Even after the iptables rule is removed, it sometimes takes a while for Kubernetes to recover. This until loop waits for kubectl get pods to start working again: the script only continues once kubectl get pods returns exit code 0.

echo "Waiting for apiserver to be back up"
sleep 1
done

# rerun kubectl logs because previous one got killed when apiserver was down
kubectl logs --tail=500 -f deployment/lingo &

Contributor: I think you might want --tail=-1 if you are trying to get all logs since the restart.

Contributor Author (samos123): I didn't want all the logs, which may already have 1000+ entries, only the last 500. I tried it: the last 100 wasn't enough, but the last 500 was more than enough.


echo "Waiting for deployment to scale down back to 0 within ~1 minute"
for i in {1..15}; do
if [ "$i" -eq 15 ]; then
echo "Test failed: Expected 0 replica after not having requests for more than 1 minute, got $replicas"
exit 1
fi
replicas=$(kubectl get deployment stapi-minilm-l6-v2 -o jsonpath='{.spec.replicas}')
replicas=$(kubectl get deployment stapi-minilm-l6-v2 -o jsonpath='{.spec.replicas}' || true)
if [ "$replicas" -eq 0 ]; then
echo "Test passed: Expected 0 replica after not having requests for more than 1 minute"
break
fi
sleep 8
sleep 6
done

echo "Patching stapi deployment to sleep on startup"