58-Negative-Fully power cycling Harvester node machines should recover the RKE2 cluster

  • Related issue: #1561 Fully shutdown then power on harvester node machine can’t get provisioned RKE2 cluster back to work

  • Related issue: #1428 rke2-coredns-rke2-coredns-autoscaler timeout

Environment Setup

  • The network environment must have a VLAN network configured, with a DHCP server available on the test VLAN

Verification Steps

  1. Prepare a 3-node Harvester cluster (provo bare-metal machines)
  2. Enable the virtual network with harvester-mgmt
  3. Create vlan1 with ID 1
  4. Import Harvester from Rancher and create a cloud credential
  5. Provision an RKE2 cluster with VLAN 1
  6. Wait for the RKE2 cluster to become ready
  7. Shut down Harvester node 3
  8. Shut down Harvester node 2
  9. Shut down Harvester node 1
  10. Wait for 20 minutes
  11. Power on node 1, then wait 10 seconds
  12. Power on node 2, then wait 10 seconds
  13. Power on node 3
  14. Wait for Harvester startup to complete
  15. Wait for the RKE2 cluster to return to a working state
  16. Check node and VIP accessibility
  17. Check the rke2-coredns pod status: kubectl get pods --all-namespaces | grep rke2-coredns
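
Steps 14 and 15 do not state how long to wait, so a polling helper can make the wait repeatable. The following is a minimal sketch, not part of Harvester or RKE2: the function name wait_until and the timeout and interval values are hypothetical, and the usage example assumes kubectl is configured against the provisioned RKE2 cluster.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it exits 0 or TIMEOUT_SECONDS have elapsed.
# Hypothetical helper for illustration only.
wait_until() {
  wu_timeout="$1"
  wu_interval="$2"
  shift 2
  # Absolute deadline in seconds since the epoch.
  wu_deadline=$(( $(date +%s) + wu_timeout ))
  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$wu_deadline" ]; then
      return 1   # timed out without a successful run
    fi
    sleep "$wu_interval"
  done
  return 0
}

# Example usage (assumption: kubectl targets the guest RKE2 cluster;
# the 1800 s timeout and 30 s interval are arbitrary choices):
#   wait_until 1800 30 kubectl wait --for=condition=Ready nodes --all --timeout=10s
```

The helper returns non-zero on timeout, so a test script can fail fast instead of waiting indefinitely when the cluster does not recover.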

Expected Results

  1. The RKE2 cluster on Harvester recovers to Active status

  2. The dashboard is accessible through the VIP

  3. Each node IP is accessible

  4. The rke2-coredns pods are running correctly, for example:

kube-system   helm-install-rke2-coredns-74nsk                         0/1   Completed   0   176m
kube-system   rke2-coredns-rke2-coredns-5679c85bbb-5qrmm              1/1   Running     1   175m
kube-system   rke2-coredns-rke2-coredns-5679c85bbb-zxpf8              1/1   Running     1   147m
kube-system   rke2-coredns-rke2-coredns-autoscaler-6889866896-l42m8   1/1   Running     1   175m
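
The pod listing above can also be checked mechanically rather than by eye. This is a rough sketch under stated assumptions: the function name check_coredns is hypothetical, and it assumes the column layout of kubectl get pods --all-namespaces (NAMESPACE, NAME, READY, STATUS, ...). It treats Completed pods (such as the helm-install job) as healthy and requires every other rke2-coredns pod to be Running with all containers ready.

```shell
#!/bin/sh
# check_coredns: reads `kubectl get pods --all-namespaces` output on stdin.
# Exits 0 only if at least one rke2-coredns pod is listed and each one is
# either Completed, or Running with READY showing n/n.
# Hypothetical helper for illustration only.
check_coredns() {
  awk '
    $2 ~ /rke2-coredns/ {
      seen++
      if ($4 == "Completed") next          # finished job pods are fine
      split($3, ready, "/")                # READY column, e.g. "1/1"
      if ($4 != "Running" || ready[1] != ready[2]) bad++
    }
    END { exit ((seen > 0 && bad == 0) ? 0 : 1) }
  '
}

# Example usage (assumption: kubectl targets the guest RKE2 cluster):
#   kubectl get pods --all-namespaces | check_coredns && echo "rke2-coredns OK"
```

An empty listing counts as a failure, which guards against the case where the API server is reachable but the rke2-coredns pods have not been scheduled yet.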