master don't come back online after reboot in 99_post_install - etcd member pod not running #100

sreichar · 2019-09-09T12:32:09Z

[kni@worker-0 logs]$ cat 99_post_install-2019-09-07-190210.log

source common.sh
++++ dirname common.sh
+++ cd .
+++ pwd
++ SCRIPTDIR=/home/kni/install-scripts/OpenShift
+++ whoami
++ USER=kni
++ '[' -z '' ']'
++ '[' -f /home/kni/install-scripts/OpenShift/config_kni.sh ']'
++ echo 'Using CONFIG /home/kni/install-scripts/OpenShift/config_kni.sh'
Using CONFIG /home/kni/install-scripts/OpenShift/config_kni.sh
++ CONFIG=/home/kni/install-scripts/OpenShift/config_kni.sh
++ source /home/kni/install-scripts/OpenShift/config_kni.sh
+++ set +x
+++ export INT_IF=eno2
+++ INT_IF=eno2
+++ export PRO_IF=eno1
+++ PRO_IF=eno1
++ export LIBVIRT_DEFAULT_URI=qemu:///system
++ LIBVIRT_DEFAULT_URI=qemu:///system
++ '[' kni '!=' root -a /run/user/1000 == /run/user/0 ']'
++ sudo -n uptime
+++ awk -F= '/^VERSION_ID=/ { print $2 }' /etc/os-release
+++ cut -f1 -d.
+++ tr -d '"'
++ VER=8
+++ tr -d '"'
+++ awk -F= '/^ID=/ { print $2 }' /etc/os-release
++ [[ rhel != \r\h\e\l ]]
++ [[ 8 -ne 8 ]]
++ '[' 3940 = 0 ']'
export KUBECONFIG=ocp/auth/kubeconfig
KUBECONFIG=ocp/auth/kubeconfig
POSTINSTALL_ASSETS_DIR=./assets/post-install
IFCFG_INTERFACE=./assets/post-install/ifcfg-interface.template
IFCFG_BRIDGE=./assets/post-install/ifcfg-bridge.template
BREXT_FILE=./assets/post-install/99-brext-master.yaml
export bridge=brext
bridge=brext
create_bridge
echo 'Deploying Bridge brext...'
Deploying Bridge brext...
++ head -1
++ oc get node -o 'custom-columns=IP:.status.addresses[0].address' --no-headers
FIRST_MASTER=10.19.1.231
++ ssh -q -o StrictHostKeyChecking=no [email protected] 'ip r | grep default | grep -Po '''(?<=dev )(\S+)''''
export interface=eno2
interface=eno2
'[' eno2 == '' ']'
'[' eno2 '!=' brext ']'
echo 'Using interface eno2'
Using interface eno2
++ envsubst
++ base64 -w0
export interface_content=REVWSUNFPWVubzIKQlJJREdFPWJyZXh0Ck9OQk9PVD15ZXMKTk1fQ09OVFJPTExFRD15ZXMKQk9PVFBST1RPPW5vbmUK
interface_content=REVWSUNFPWVubzIKQlJJREdFPWJyZXh0Ck9OQk9PVD15ZXMKTk1fQ09OVFJPTExFRD15ZXMKQk9PVFBST1RPPW5vbmUK
++ envsubst
++ base64 -w0
export bridge_content=REVWSUNFPWJyZXh0Ck5BTUU9YnJleHQKVFlQRT1CcmlkZ2UKT05CT09UPXllcwpOTV9DT05UUk9MTEVEPXllcwpCT09UUFJPVE89ZGhjcApCUklER0lOR19PUFRTPXZsYW5fZmlsdGVyaW5nPTEKQlJJREdFX1ZMQU5TPSIxIHB2aWQgdW50YWdnZWQsMjAsMzAwLTQwMCB1bnRhZ2dlZCIK
bridge_content=REVWSUNFPWJyZXh0Ck5BTUU9YnJleHQKVFlQRT1CcmlkZ2UKT05CT09UPXllcwpOTV9DT05UUk9MTEVEPXllcwpCT09UUFJPVE89ZGhjcApCUklER0lOR19PUFRTPXZsYW5fZmlsdGVyaW5nPTEKQlJJREdFX1ZMQU5TPSIxIHB2aWQgdW50YWdnZWQsMjAsMzAwLTQwMCB1bnRhZ2dlZCIK
envsubst
echo 'Done creating bridge definition'
Done creating bridge definition
apply_mc
for node_type in master worker
oc patch --type=merge '--patch={"spec":{"paused":true}}' machineconfigpool/master
machineconfigpool.machineconfiguration.openshift.io/master patched
for node_type in master worker
oc patch --type=merge '--patch={"spec":{"paused":true}}' machineconfigpool/worker
machineconfigpool.machineconfiguration.openshift.io/worker patched
'[' '' '!=' '' ']'
for node_type in master worker
++ find ./assets/post-install -iname '*-master.yaml' -type f
test ./assets/post-install/99-brext-master.yaml
echo 'Applying machine configs...'
Applying machine configs...
oc create -f ./assets/post-install/99-brext-master.yaml
machineconfig.machineconfiguration.openshift.io/99-brext-master created
oc patch --type=merge '--patch={"spec":{"paused":false}}' machineconfigpool/master
machineconfigpool.machineconfiguration.openshift.io/master patched
echo 'Rebooting nodes...'
Rebooting nodes...
sleep 30
oc wait mcp/master --for condition=updated --timeout 600s
error: timed out waiting for the condition on machineconfigpools/master
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (Timeout): the server was unable to return a response in the time allotted, but may still be processing the request
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Unable to connect to the server: unexpected EOF
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
Error from server (NotFound): the server could not find the requested resource (get machineconfigpools.machineconfiguration.openshift.io master)
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
error: timed out waiting for the condition on machineconfigpools/master
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
error: timed out waiting for the condition on machineconfigpools/master
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
error: timed out waiting for the condition on machineconfigpools/master
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
error: timed out waiting for the condition on machineconfigpools/master
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
error: timed out waiting for the condition on machineconfigpools/master
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
error: timed out waiting for the condition on machineconfigpools/master
sleep 1
oc wait mcp/master --for condition=updated --timeout 600s
error: timed out waiting for the condition on machineconfigpools/master
.
.
.
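
The log above shows the `oc wait` / `sleep 1` pair repeating with no visible upper bound. A minimal sketch of a bounded retry follows, assuming it is acceptable for the script to give up after a fixed number of attempts. `retry_with_limit` is a hypothetical helper, not a function from the install scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the install scripts): run a command until
# it succeeds or until a maximum number of attempts has been made.
# Usage: retry_with_limit MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_with_limit() {
    local max_attempts=$1 delay=$2
    shift 2
    local attempt=1
    while true; do
        # Run the command exactly as passed; stop on the first success.
        if "$@"; then
            return 0
        fi
        # Give up once the attempt budget is used, reporting on stderr.
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

With this helper, the wait from the log could be written as `retry_with_limit 5 1 oc wait mcp/master --for condition=updated --timeout 600s`, where the limit of 5 is an arbitrary illustrative value. A non-zero return then makes the failure explicit instead of looping.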

e-minguez · 2019-09-09T12:55:14Z

I've observed in my environment that no host is even rebooted while applying the machine-configs:

$ oc get nodes
NAME                                         STATUS                     ROLES           AGE   VERSION
kni1-master-0.env.mydomain.example.com   Ready,SchedulingDisabled   master,worker   79m   v1.14.6+82219910a
kni1-master-1.env.mydomain.example.com   Ready                      master,worker   80m   v1.14.6+82219910a
kni1-master-2.env.mydomain.example.com   Ready                      master,worker   79m   v1.14.6+82219910a

$ for node in $(oc get nodes -o jsonpath="{.items[*].metadata.name}"); do echo -n ${node}; ssh core@${node} uptime; done
kni1-master-0.env.mydomain.example.com 12:47:43 up  1:20,  0 users,  load average: 0.41, 0.33, 0.43
kni1-master-1.env.mydomain.example.com 12:47:51 up  1:21,  0 users,  load average: 0.42, 1.48, 1.38
kni1-master-2.env.mydomain.example.com 12:47:45 up  1:20,  0 users,  load average: 0.44, 0.70, 0.81

Digging a bit deeper, I've seen that the machine-config-daemon pod running on kni1-master-0 is complaining about the pod disruption budget for etcd-quorum-guard:

$ oc get pods -n openshift-machine-config-operator -o wide | grep kni1-master-0
etcd-quorum-guard-59f44bc47d-sxg8j           1/1     Running   0          77m   10.19.138.11   kni1-master-0.env.mydomain.example.com   <none>           <none>
machine-config-daemon-hgglp                  1/1     Running   0          78m   10.19.138.11   kni1-master-0.env.mydomain.example.com   <none>           <none>
machine-config-server-8sbx4                  1/1     Running   0          78m   10.19.138.11   kni1-master-0.env.mydomain.example.com   <none>           <none>

$ oc logs machine-config-daemon-hgglp
...
I0909 12:53:06.955223   13858 update.go:89] error when evicting pod "etcd-quorum-guard-59f44bc47d-sxg8j" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I0909 12:53:11.961294   13858 update.go:89] error when evicting pod "etcd-quorum-guard-59f44bc47d-sxg8j" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I0909 12:53:16.966611   13858 update.go:89] error when evicting pod "etcd-quorum-guard-59f44bc47d-sxg8j" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.

$ oc get pods -o wide | grep etcd-quorum-guard
etcd-quorum-guard-59f44bc47d-br7qq           0/1     Running   0          78m   10.19.138.12   kni1-master-1.env.mydomain.example.com   <none>           <none>
etcd-quorum-guard-59f44bc47d-sxg8j           1/1     Running   0          78m   10.19.138.11   kni1-master-0.env.mydomain.example.com   <none>           <none>
etcd-quorum-guard-59f44bc47d-xd854           1/1     Running   0          78m   10.19.138.13   kni1-master-2.env.mydomain.example.com   <none>           <none>
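
The eviction error and the 0/1 pod above fit together once the disruption budget arithmetic is spelled out. The sketch below assumes the etcd-quorum-guard budget is expressed as `maxUnavailable: 1` over three pods; that value is an assumption, not something shown in this thread. `disruptions_allowed` is a hypothetical helper that mirrors how Kubernetes derives the allowed disruptions from the healthy pod count.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute how many voluntary disruptions a
# PodDisruptionBudget defined with maxUnavailable would currently permit.
# Usage: disruptions_allowed EXPECTED_PODS HEALTHY_PODS MAX_UNAVAILABLE
disruptions_allowed() {
    local expected=$1 healthy=$2 max_unavailable=$3
    # The budget needs at least this many healthy (Ready) pods at all times.
    local desired_healthy=$((expected - max_unavailable))
    # Each healthy pod beyond that minimum is one permitted eviction.
    local allowed=$((healthy - desired_healthy))
    if [ "$allowed" -lt 0 ]; then
        allowed=0
    fi
    echo "$allowed"
}

# Assumed values for this thread: 3 guard pods, 2 of them Ready, maxUnavailable=1.
disruptions_allowed 3 2 1   # prints 0
```

Under these assumptions, the one unready guard pod on kni1-master-1 already uses up the whole budget, so the pod on kni1-master-0 cannot be evicted and the node is never drained or rebooted.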

$ oc get events | grep etcd-quorum-guard-59f44bc47d-br7qq
79m         Normal    Scheduled           pod/etcd-quorum-guard-59f44bc47d-br7qq            Successfully assigned openshift-machine-config-operator/etcd-quorum-guard-59f44bc47d-br7qq to kni1-master-1.env.mydomain.example.com
79m         Normal    Pulled              pod/etcd-quorum-guard-59f44bc47d-br7qq            Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dfd89e339168edd91af05acd0f575474212f09b97d2b02b235bd1c17e7ae4802" already present on machine
79m         Normal    Created             pod/etcd-quorum-guard-59f44bc47d-br7qq            Created container guard
79m         Normal    Started             pod/etcd-quorum-guard-59f44bc47d-br7qq            Started container guard
22s         Warning   Unhealthy           pod/etcd-quorum-guard-59f44bc47d-br7qq            Readiness probe failed:
79m         Normal    SuccessfulCreate    replicaset/etcd-quorum-guard-59f44bc47d           Created pod: etcd-quorum-guard-59f44bc47d-br7qq

$ oc logs etcd-quorum-guard-59f44bc47d-br7qq
$ 

$ oc get pods -A | grep -v -E 'Running|Complete'
NAMESPACE                                               NAME                                                                  READY   STATUS             RESTARTS   AGE
openshift-etcd                                          etcd-member-kni1-master-1.env.mydomain.example.com                1/2     CrashLoopBackOff   17         86m

$ oc get pods -n openshift-etcd
NAME                                                     READY   STATUS             RESTARTS   AGE
etcd-member-kni1-master-0.env.mydomain.example.com   2/2     Running            0          87m
etcd-member-kni1-master-1.env.mydomain.example.com   1/2     CrashLoopBackOff   17         87m
etcd-member-kni1-master-2.env.mydomain.example.com   2/2     Running            0          87m

$ oc logs etcd-member-kni1-master-1.env.mydomain.example.com -n openshift-etcd
Error from server (BadRequest): a container name must be specified for pod etcd-member-kni1-master-1.env.mydomain.example.com, choose one of: [etcd-member etcd-metrics] or one of the init containers: [discovery certs]

$ oc logs etcd-member-kni1-master-1.env.mydomain.example.com -c etcd-member -n openshift-etcd
/bin/sh: line 3: /run/etcd/environment: Permission denied

On the nodes:

$ for node in $(oc get nodes -o jsonpath="{.items[*].metadata.name}"); do ssh core@${node} sudo cat /run/etcd/environment; ssh core@${node} sudo ls -lZ /run/etcd/environment; done

export ETCD_DISCOVERY_SRV=kni1.env.mydomain.example.com
ETCD_WILDCARD_DNS_NAME=*.kni1.env.mydomain.example.com
ETCD_IPV4_ADDRESS=10.19.138.11
ETCD_DNS_NAME=etcd-0.kni1.env.mydomain.example.com
-rw-r--r--. 1 root root system_u:object_r:container_var_run_t:s0 205 Sep  9 11:29 /run/etcd/environment

export ETCD_DISCOVERY_SRV=kni1.env.mydomain.example.com
ETCD_IPV4_ADDRESS=10.19.138.12
ETCD_DNS_NAME=etcd-1.kni1.env.mydomain.example.com
ETCD_WILDCARD_DNS_NAME=*.kni1.env.mydomain.example.com
-rw-r--r--. 1 root root system_u:object_r:container_var_run_t:s0 205 Sep  9 11:29 /run/etcd/environment

export ETCD_DISCOVERY_SRV=kni1.env.mydomain.example.com
ETCD_WILDCARD_DNS_NAME=*.kni1.env.mydomain.example.com
ETCD_IPV4_ADDRESS=10.19.138.13
ETCD_DNS_NAME=etcd-2.kni1.env.mydomain.example.com
-rw-r--r--. 1 root root system_u:object_r:container_var_run_t:s0 205 Sep  9 11:28 /run/etcd/environment

jparrill · 2019-09-09T13:06:38Z

Same here. I've been hitting this error after the bridge patch was applied and the node rebooted. The etcd member never comes up again.

The bad master-0:

[core@master-0 ~]$ ls -lahZ /run/etcd/environment
-rw-r--r--. 1 root root system_u:object_r:container_var_run_t:s0 183 Sep  9 11:46 /run/etcd/environment

[core@master-0 ~]$ getfacl /run/etcd/environment
getfacl: Removing leading '/' from absolute path names
# file: run/etcd/environment
# owner: root
# group: root
user::rw-
group::r--
other::r--

A good node (master-2):

[core@master-2 ~]$ ls -alhZ /run/etcd/environment
-rw-r--r--. 1 root root system_u:object_r:container_var_run_t:s0 183 Sep  9 11:32 /run/etcd/environment

[core@master-2 ~]$ getfacl /run/etcd/environment
getfacl: Removing leading '/' from absolute path names
# file: run/etcd/environment
# owner: root
# group: root
user::rw-
group::r--
other::r--

e-minguez · 2019-09-09T13:42:35Z

I've just moved the etcd-member static pod definition out of the manifests directory on the affected host and back again (to simulate an oc delete, but for a static pod), and it seems to fix it:

$ ssh [email protected] sudo mv /etc/kubernetes/manifests/etcd-member.yaml /root/
$ ssh [email protected] sudo mv /root/etcd-member.yaml /etc/kubernetes/manifests/etcd-member.yaml

$ oc get pods
NAME                                                     READY   STATUS    RESTARTS   AGE
etcd-member-kni1-master-0.env.mydomain.example.com   2/2     Running   2          129m
etcd-member-kni1-master-1.env.mydomain.example.com   2/2     Running   28         23m
etcd-member-kni1-master-2.env.mydomain.example.com   2/2     Running   0          129m
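
The two `mv` commands above can be wrapped in a small function meant to run on the affected node itself. This is only a sketch: `bounce_static_pod` is a hypothetical name, and the pause between the two moves is an assumption (the original commands had none), added to give the kubelet time to notice that the manifest is gone.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move a static pod manifest out of the kubelet's
# manifest directory and back again, so the kubelet recreates the pod.
# Usage: bounce_static_pod MANIFEST_PATH STASH_DIR [WAIT_SECONDS]
bounce_static_pod() {
    local manifest=$1 stash_dir=$2 wait_seconds=${3:-20}
    local name
    name=$(basename "$manifest")
    # Refuse to continue if there is nothing to move.
    if [ ! -f "$manifest" ]; then
        echo "no such manifest: $manifest" >&2
        return 1
    fi
    # Step 1: take the manifest away; the kubelet stops the static pod.
    mv "$manifest" "$stash_dir/$name" || return 1
    # Assumed pause so the removal is observed before the file returns.
    sleep "$wait_seconds"
    # Step 2: put the manifest back; the kubelet starts the pod again.
    mv "$stash_dir/$name" "$manifest"
}
```

On the affected master this would be called as root with `/etc/kubernetes/manifests/etcd-member.yaml` and `/root` as arguments, matching the paths used in the commands above.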

sreichar · 2019-09-09T14:04:41Z

@e-minguez That also worked for me.

Is this something we need to raise with OpenShift?

e-minguez · 2019-09-09T14:11:48Z

> @e-minguez That also worked for me.
>
> Is this something we need to raise with OpenShift?

I believe it would be good to raise it. That said, this issue's title seems misleading: I believe the main issue here is that the etcd-member pod is not running even though the install seems to have finished successfully.

e-minguez · 2019-09-09T15:27:42Z

https://bugzilla.redhat.com/show_bug.cgi?id=1750433

hardys · 2019-09-11T16:56:39Z

I just tried rebooting a dev-scripts VM and cannot reproduce the same issue - could this be related to the other configuration changes made prior to the reboot?

e-minguez · 2019-09-11T17:08:19Z

See the bugzilla. It seems there is a weird issue under the hood.

sreichar changed the title ~~master don't come back online after reboot in 99_post_install~~ master don't come back online after reboot in 99_post_install - etcd member pod not running Sep 9, 2019
