...
Summary

A deployed cluster starts showing multiple pod restarts and can even cause a node to go into a NotReady state, making the PFMP GUI inaccessible.

Scenario

A PowerFlex Management Platform (PFMP) instance deployed on customer-supplied infrastructure with slow storage that does not meet the PFMP storage requirements (identified by the symptoms below) can cause stability issues with the Kubernetes cluster or Postgres database members (for example, frequent state changes).

In a fresh deployment, no problems may be observed until PowerFlex Manager automation is triggered to deploy the PowerFlex cluster. At that point, various pods may restart multiple times or go into a CrashLoopBackOff (CLBO) state.

The following symptoms are observed in a production environment.

Run the commands below to see the events in the cluster:

kubectl get events
kubectl describe node

Events:
  Type     Reason                   Age                From             Message
  ----     ------                   ----               ----             -------
  Normal   RegisteredNode           36m                node-controller  Node pfmp-mvm-cl1-02 event: Registered Node pfmp-mvm-cl1-02 in Controller
  Normal   NodeNotReady             35m                node-controller  Node pfmp-mvm-cl1-02 status is now: NodeNotReady
  Normal   Starting                 33m                kubelet          Starting kubelet.
  Warning  InvalidDiskCapacity      33m                kubelet          invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  33m (x2 over 33m)  kubelet          Node pfmp-mvm-cl1-02 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    33m (x2 over 33m)  kubelet          Node pfmp-mvm-cl1-02 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     33m (x2 over 33m)  kubelet          Node pfmp-mvm-cl1-02 status is now: NodeHasSufficientPID
  Normal   NodeNotReady             33m                kubelet          Node pfmp-mvm-cl1-02 status is now: NodeNotReady
  Normal   NodeAllocatableEnforced  33m                kubelet          Updated Node Allocatable limit across pods
  Normal   NodeReady                33m                kubelet          Node pfmp-mvm-cl1-02 status is now: NodeReady

Below is sample output for a failing kubelet. The logs can be found at the following path on each node:

/var/lib/rancher/rke2/agent/logs/kubelet.log

Aug 04 17:12:11 pfmp-mvm-cl1-02 rke2[31392]: time="2023-08-04T17:12:11+09:00" level=debug msg="Wrote ping"
Aug 04 17:12:12 pfmp-mvm-cl1-02 rke2[31392]: E0804 17:12:12.654816 31392 leaderelection.go:367] Failed to update lock: etcdserver: request timed out
Aug 04 17:12:13 pfmp-mvm-cl1-02 rke2[31392]: time="2023-08-04T17:12:13+09:00" level=debug msg="Wrote ping"
Aug 04 17:12:14 pfmp-mvm-cl1-02 rke2[31392]: I0804 17:12:14.526253 31392 leaderelection.go:283] failed to renew lease kube-system/rke2: timed out waiting for the condition
Aug 04 17:12:16 pfmp-mvm-cl1-02 rke2[31392]: time="2023-08-04T17:12:16+09:00" level=debug msg="Wrote ping"
Aug 04 17:12:17 pfmp-mvm-cl1-02 systemd[1]: run-containerd-runc-k8s.io-1ee2f8a41e076afeb4f14eb53d7faea3b1b11c59ecc44aeed0c3ee1333b07a01-runc.DLNAIb.mount: Succeeded.
Aug 04 17:12:23 pfmp-mvm-cl1-02 rke2[31392]: time="2023-08-04T17:12:23+09:00" level=debug msg="Wrote ping"
Aug 04 17:12:26 pfmp-mvm-cl1-02 rke2[31392]: E0804 17:12:26.235602 31392 leaderelection.go:306] Failed to release lock: Operation cannot be fulfilled on configmaps "rke2": the object has been modified; please apply your changes to the latest version and try again
Aug 04 17:12:26 pfmp-mvm-cl1-02 rke2[31392]: time="2023-08-04T17:12:26+09:00" level=fatal msg="leaderelection lost for rke2"

Etcd logs:

Run the kubectl commands below to find the pod for each etcd member and extract its logs.
# kubectl get pods -n kube-system | grep etcd
etcd-node1   1/1   Running   1   92d
etcd-node2   1/1   Running   1   92d
etcd-node3   1/1   Running   1   92d

# kubectl logs --follow -n kube-system etcd-node1 >> etcd.txt

Logging in the etcd logs:

2023-08-04T17:12:12.223766355+09:00 stderr F {"level":"warn","ts":"2023-08-04T08:12:12.223Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"16.840314464s","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/operator.tigera.io/installations/\" range_end:\"/registry/operator.tigera.io/installations0\" count_only:true ","response":"","error":"etcdserver: request timed out"}

Alternatively, the same logs can be found on each node at the following path:

/var/log/pods/kube-system_etcd-_e18aa5e5b83a5a3c56d78e4054612394/etcd

Impact

This can lead to PFMP cluster node stability issues, with pods restarting multiple times and the GUI becoming inaccessible.
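Before moving on to the cause, it can help to confirm that every etcd member is reporting the same slow-storage signature seen in the log sample above. The loop below is a minimal sketch, not part of the official procedure: pod names are discovered the same way as in the commands above, and the grep patterns come from the warnings shown in the etcd log sample.

# Sketch: scan each etcd member's pod logs for the slow-storage warnings shown above.
for pod in $(kubectl get pods -n kube-system | grep etcd | awk '{print $1}'); do
  echo "== $pod =="
  kubectl logs -n kube-system "$pod" \
    | grep -E 'apply request took too long|request timed out' \
    | tail -n 5
done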
The etcd response time is sensitive to slow storage. If the MVMs' underlying storage does not meet the requirements, slow performance may be seen within the etcd environment. Storage response times exceeding 1 second can also lead to various kubelet process crashes. For example, storage consisting of hybrid drives (a mix of HDD and SSD) can lead to performance issues with etcd.

To verify etcd performance, run the command below:

for x in $(kubectl get pods -n kube-system |grep etcd |awk '{print $1}') ; do echo "------------------------"; echo $x; echo; kubectl exec -it -n kube-system $x -- etcdctl check perf --cacert="/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt" --cert="/var/lib/rancher/rke2/server/tls/etcd/server-client.crt" --key="/var/lib/rancher/rke2/server/tls/etcd/server-client.key"; echo "------------------------"; echo; sleep 5; done

Below is sample output for an etcd performance failure:

for x in $(kubectl get pods -n kube-system |grep etcd |awk '{print $1}') ; do echo "------------------------"; echo $x; echo; kubectl exec -it -n kube-system $x -- etcdctl check perf --cacert="/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt" --cert="/var/lib/rancher/rke2/server/tls/etcd/server-client.crt" --key="/var/lib/rancher/rke2/server/tls/etcd/server-client.key"; echo "------------------------"; echo; sleep 5; done
------------------------
etcd-sio-car-pfmp-mvm-01

 60 / 60 Boooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo! 100.00% 1m0s
PASS: Throughput is 144 writes/s
Slowest request took too long: 1.282354s   <<<<<<<<<<<<<<<<<<<<<<<
Stddev too high: 0.151896s
FAIL
command terminated with exit code 1
------------------------
------------------------
etcd-sio-car-pfmp-mvm-02

 60 / 60 Boooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo! 100.00% 1m0s
PASS: Throughput is 150 writes/s
Slowest request took too long: 0.517011s
PASS: Stddev is 0.064465s
FAIL
command terminated with exit code 1
------------------------
------------------------
etcd-sio-car-pfmp-mvm-03

 60 / 60 Boooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo! 100.00% 1m0s
FAIL: Throughput is 138 writes/s
Slowest request took too long: 2.505517s   <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Stddev too high: 0.207719s
FAIL
command terminated with exit code 1

The etcd performance test fails if the following limits are not met:

Measure              Limit
Throughput           > 140 writes/s
Slowest request      < 0.5 s
Standard deviation   < 0.1 s
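Because the etcdctl check above exercises etcd end to end, it can also be useful to measure the raw disk latency that etcd sees. The command below is a sketch using fio with the write-plus-fdatasync pattern recommended in the etcd documentation; the target directory is an assumption (point it at a path on the datastore backing the MVM), and fio must be installed separately.

# Sketch: measure write + fdatasync latency on the storage backing the MVM.
# The directory below is an assumption; replace it with a path on the disk to be tested.
fio --name=etcd-disk-check \
    --directory=/var/lib/rancher/rke2/server/db \
    --rw=write --ioengine=sync --fdatasync=1 \
    --bs=2300 --size=22m

As a rough guide, the etcd documentation suggests that the 99th percentile of fdatasync latency reported by fio should stay around 10 ms or less for etcd to perform reliably.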
Resolution

The issue is not in the PFMP version but in the way the cluster VMs are deployed. The MVMs' underlying storage should reside local to the node where the MVMs are deployed, and the drives should be at least SSD class.

Impacted Versions

4.x

Fixed In Version

None
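As a quick sanity check when validating the storage, each MVM can report whether its disks are presented as rotational. This is only a sketch: on virtual machines the rotational flag depends on how the hypervisor exposes the disk, so the underlying datastore should also be confirmed on the hypervisor side.

# Sketch: ROTA=1 indicates a rotational (HDD) device, ROTA=0 indicates SSD/flash.
lsblk -d -o NAME,SIZE,ROTA,TYPE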