salt: Handle duplicates in cri.wait_pod
When kubelet replaces a static pod, there is usually a period during which
two instances of the pod coexist in the CRI representation.
gdemonet committed Jul 26, 2022
1 parent 8ba94d7 commit fe3c819
Showing 4 changed files with 31 additions and 8 deletions.
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,13 @@
 # CHANGELOG
 ## Release 123.0.1 (in development)

+### Bug fixes
+
+- [#3827](https://github.com/scality/metalk8s/issues/3827)
+  Handle an issue with duplicate pods in CRI during a static pod update,
+  preventing upgrades to 123.0.0 when using an inconsistent registry HA setup
+  (PR[#3828](https://github.com/scality/metalk8s/pull/3828))
+
 ## Release 123.0.0

 ### Additions
9 changes: 7 additions & 2 deletions salt/_modules/cri.py
@@ -340,8 +340,13 @@ def wait_pod(
     start_time = time.time()

     while time.time() - start_time < timeout:
-        current_id = get_pod_id(name=name, state=state, ignore_not_found=True)
-        if current_id and current_id != last_id:
+        current_ids = get_pod_id(
+            name=name,
+            state=state,
+            ignore_not_found=True,
+            multiple=True,  # We may have two during a replacement
+        )
+        if current_ids and last_id not in current_ids:
             return True
         remaining = timeout + start_time - time.time()
         if remaining < sleep:  # Don't sleep if we know it's going to time out
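
For readers who want to try the new condition outside of Salt, the loop above can be approximated by the following minimal sketch. The names `wait_for_replacement`, `poll`, `clock`, and `pause` are hypothetical and not part of `cri.py`; `poll` stands in for `get_pod_id(..., ignore_not_found=True, multiple=True)` and is assumed to return a list of pod IDs, or `None` when nothing matches. Unlike the real function, this sketch simply returns `False` on timeout instead of raising.

```python
import time


def wait_for_replacement(poll, last_id, timeout=5.0, sleep=1.0,
                         clock=time.time, pause=time.sleep):
    """Return True once `poll()` lists at least one pod and none is `last_id`.

    Hypothetical helper: `poll` replaces the real CRI lookup, while `clock`
    and `pause` are injectable so the loop can be exercised without waiting.
    """
    start_time = clock()
    while clock() - start_time < timeout:
        current_ids = poll()
        # During a replacement both the old and the new pod may be listed,
        # so wait until the old ID has disappeared from the result.
        if current_ids and last_id not in current_ids:
            return True
        remaining = timeout + start_time - clock()
        if remaining < sleep:  # Not enough time left for another poll
            break
        pause(sleep)
    return False


# Replay a "create then delete" replacement without actually sleeping.
responses = iter([["abc123"], ["abc123", "def456"], ["def456"]])
replaced = wait_for_replacement(
    lambda: next(responses), last_id="abc123", pause=lambda _seconds: None
)
```

Checking membership in the returned list, rather than comparing a single ID, is what lets the loop keep waiting while both pods are visible.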
22 changes: 16 additions & 6 deletions salt/tests/unit/modules/files/test_cri.yaml
@@ -129,17 +129,17 @@ wait_pod:
     pod_ids:
       - null
       - null
-      - abc123
+      - [abc123]
     result: True
-  # 1. Pod was updated
+  # 1. Pod was updated (simple delete then create)
   - name: example
     timeout: 5
     sleep: 1
     last_id: abc123
     pod_ids:
-      - abc123
+      - [abc123]
       - null
-      - def456
+      - [def456]
     result: True
   # 2. Some crictl error (raise)
   - name: example
@@ -152,7 +152,7 @@ wait_pod:
     sleep: 1
     last_id: abc123
     pod_ids:
-      - abc123
+      - [abc123]
       - null
       - null
     raises: True
@@ -174,7 +174,17 @@ wait_pod:
     last_id: abc123
     raise_on_timeout: False
     pod_ids:
-      - abc123
+      - [abc123]
       - null
       - null
     result: False
+  # 6. Pod was updated (create then delete)
+  - name: example
+    timeout: 5
+    sleep: 1
+    last_id: abc123
+    pod_ids:
+      - [abc123]
+      - [abc123, def456]
+      - [def456]
+    result: True
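
To see how a `pod_ids` sequence such as the one in case 6 can drive a test, here is a small sketch using `unittest.mock`. It is not the project's test code: `pod_replaced` is a hypothetical helper that restates the patched condition, and the mock named `get_pod_id` only imitates the real lookup by returning one entry of the sequence per call.

```python
from unittest import mock


def pod_replaced(current_ids, last_id):
    """Hypothetical restatement of the patched check in wait_pod."""
    return bool(current_ids) and last_id not in current_ids


# Successive answers from case 6: old pod only, both pods, new pod only.
get_pod_id = mock.MagicMock(
    side_effect=[["abc123"], ["abc123", "def456"], ["def456"]]
)

outcomes = [
    pod_replaced(
        get_pod_id(
            name="example", state="ready", ignore_not_found=True, multiple=True
        ),
        "abc123",
    )
    for _ in range(3)
]

# Only the third poll, where the old ID is gone, counts as a replacement.
print(outcomes)
```

The intermediate answer containing both IDs does not satisfy the condition, which is the behavior the new test case is meant to cover.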
1 change: 1 addition & 0 deletions salt/tests/unit/modules/test_cri.py
@@ -374,6 +374,7 @@ def pod_ids_mock(*a, **k):
                 name=kwargs.get("name"),
                 state=kwargs.get("state", "ready"),
                 ignore_not_found=True,
+                multiple=True,
             ),
         )
         if pod_ids_raise: