- Related issue: #1561 Fully shutdown then power on harvester node machine can’t get provisioned RKE2 cluster back to work
- Related issue: #1428 rke2-coredns-rke2-coredns-autoscaler timeout
Environment Setup
- The network environment must have a VLAN network configured, and a DHCP server must be available on the testing VLAN
Verification Step
- Prepare a 3-node Harvester cluster (provo bare-metal machines)
- Enable virtual network with harvester-mgmt
- Create vlan1 with id 1
- Import Harvester into Rancher and create a cloud credential
- Provision an RKE2 cluster on vlan 1
- Wait for the RKE2 cluster to finish provisioning and become ready
- Shutdown harvester node 3
- Shutdown harvester node 2
- Shutdown harvester node 1
- Wait for 20 minutes
- Power on node 1, wait 10 seconds
- Power on node 2, wait 10 seconds
- Power on node 3
- Wait for Harvester startup to complete
- Wait for the RKE2 cluster to come back to work
- Check node and VIP accessibility
- Check the rke2-coredns pod status
kubectl get pods --all-namespaces | grep rke2-coredns
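The two "wait for" steps above can be scripted with a small retry loop. This is a minimal sketch, not part of the original test case: `wait_until` is a hypothetical helper name, and the timeout and polling interval in the usage note are assumed values to adjust to your environment.

```shell
# Hypothetical helper: retry a command until it succeeds or the timeout expires.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
wait_until() {
  timeout_s=$1
  interval_s=$2
  shift 2
  elapsed=0
  # Re-run the command, discarding its output, until it exits with status 0.
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      echo "timed out after ${timeout_s}s: $*" >&2
      return 1
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
}
```

For example, with the kubeconfig of the provisioned RKE2 cluster active, `wait_until 1800 30 kubectl get nodes` polls every 30 seconds for up to 30 minutes until the API server answers again. A successful `kubectl get nodes` only shows that the API is reachable, so the node, VIP, and pod checks still need to be done afterwards.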
Expected Results
- RKE2 cluster on Harvester can recover to Active status
- Can access dashboard by VIP
- Can access each node IP
- rke2-coredns pods running correctly
kube-system helm-install-rke2-coredns-74nsk 0/1 Completed 0 176m
kube-system rke2-coredns-rke2-coredns-5679c85bbb-5qrmm 1/1 Running 1 175m
kube-system rke2-coredns-rke2-coredns-5679c85bbb-zxpf8 1/1 Running 1 147m
kube-system rke2-coredns-rke2-coredns-autoscaler-6889866896-l42m8 1/1 Running 1 175m
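The rke2-coredns check can also be done by script instead of reading the output by eye. The following is a minimal sketch: `check_coredns_pods` is a hypothetical helper, and it assumes the default `kubectl get pods --all-namespaces` column order (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE), as in the listing above. It treats `Running` and `Completed` as healthy and fails if no rke2-coredns pod is found at all.

```shell
# Hypothetical helper: reads `kubectl get pods --all-namespaces` output on stdin.
# Exits non-zero if any rke2-coredns pod has a status other than Running or
# Completed, or if no rke2-coredns pod appears in the input.
check_coredns_pods() {
  awk '
    /rke2-coredns/ {
      seen = 1
      # Field 2 is the pod name, field 4 is the STATUS column.
      if ($4 != "Running" && $4 != "Completed") {
        print "unhealthy: " $2 " (" $4 ")"
        bad = 1
      }
    }
    END { if (bad || !seen) exit 1 }
  '
}
```

Usage: `kubectl get pods --all-namespaces | check_coredns_pods && echo "rke2-coredns OK"`. The helper only looks at the STATUS column, so the READY counts (for example `1/1`) still need a manual look if that matters for the test.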