...
The original Serviceguard for Linux 12.80.05 patch, released on February 09, 2023, contained a deadman.ko module that did not have the proper HPE digital signature required for it to be loaded when UEFI Secure Boot is enabled on the server. In that case, after updating to 12.80.05, deadman does not load; as a result, Serviceguard commands do not work and the node cannot join the cluster.

After updating to 12.80.05 using the original tar ball dated February 09, 2023, the following is displayed:

# rpm -qi serviceguard|egrep "^Name|^Version|^Build Date"
Name        : serviceguard
Version     : A.12.80.05
Build Date  : Wed Jan 25 07:00:26 2023
# mokutil --sb-state
SecureBoot enabled
# mokutil --test-key $SGROOT/drivers/HPEDB2016.der
/opt/cmcluster/drivers/HPEDB2016.der is already enrolled
# strings /lib/modules/$(uname -r)/extra/deadman.ko | grep "srcversion="
srcversion=0BAFBA10843925E32EEF7BA
# modinfo /lib/modules/$(uname -r)/extra/deadman.ko|grep ^sig
# systemctl status SGSafetyTimer.service
● SGSafetyTimer.service - Load Safety timer for Serviceguard on Linux.
   Loaded: loaded (/usr/lib/systemd/system/SGSafetyTimer.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Thu 2023-03-30 11:47:04 EDT; 1h 36min ago
 Main PID: 1018 (code=exited, status=0/SUCCESS)

Mar 30 11:47:02 sles15218 systemd[1]: Started Load Safety timer for Serviceguard on Linux..
Mar 30 11:47:03 sles15218 SGSafetyTimer_service[1018]: Loading deadman with platform info parameter:0.
Mar 30 11:47:03 sles15218 SGSafetyTimer_service[1018]: modprobe: ERROR: could not insert "deadman": Operation not permitted
Mar 30 11:47:04 sles15218 SGSafetyTimer_service[1018]: Loading deadman with platform info parameter:0.
Mar 30 11:47:04 sles15218 SGSafetyTimer_service[1018]: insmod: ERROR: could not insert module deadman.ko: Operation not permitted
Mar 30 11:47:04 sles15218 SGSafetyTimer_service[1018]: Deadman failed to install.
# lsmod|grep deadman
#

To identify whether this issue is occurring, check for two symptoms: the modinfo command run against the deadman.ko module shows no "signature" related fields, and the modprobe command fails with "Operation not permitted", as shown above.
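The signature check described above can be wrapped in a small script. The following is a minimal sketch only, not an HPE-provided tool; the function names has_signature_fields and report_module_signature are hypothetical and were chosen for this illustration. It relies on the same "grep ^sig" test shown in the transcript.

```shell
#!/bin/sh
# Illustrative sketch; function names are hypothetical, not part of Serviceguard.

# has_signature_fields: reads `modinfo` output on stdin and succeeds if any
# line starts with "sig" (the same test as `modinfo ... | grep ^sig`).
has_signature_fields() {
    grep -q '^sig'
}

# report_module_signature: prints whether the module file at $1 carries
# signature fields. Returns 0 if signed, 1 if unsigned, 2 if the file is missing.
report_module_signature() {
    module_path=$1
    if [ ! -f "$module_path" ]; then
        echo "not found: $module_path"
        return 2
    fi
    if modinfo "$module_path" | has_signature_fields; then
        echo "signed: $module_path"
        return 0
    fi
    echo "UNSIGNED: $module_path"
    return 1
}

# Example (run on the affected node):
# report_module_signature "/lib/modules/$(uname -r)/extra/deadman.ko"
```

An "UNSIGNED" result on a Secure Boot enabled node corresponds to the failure shown in the transcript above.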
This issue affects any HPE server or Virtual Machine running HPE Serviceguard for Linux 12.80.05 released on February 09, 2023, when the server is also using Secure Boot mode on any supported Linux operating system.

If the server is not set up for Secure Boot mode, the driver loads and functions as expected; however, if Secure Boot is enabled later and the user properly enrolls the HPE keys, the driver will fail to load.

There are several ways to check whether the affected version is in use.

To verify whether the affected version of 12.80.05 was downloaded, compare the sha256sum checksum of the downloaded tar file to the checksums provided in the HPE Serviceguard for Linux 12.80.05 Release Notes (see Table 6, page 83). If the checksums do not match, the downloaded file is the prior release of 12.80.05 that contains the issue.

Note: The table from the Release Notes is included in this CA as follows:

A.12.80.05
Base        4badccf7c7ec22aad27709a93ee02796320db07ecb6e6b9dab8ca97248e52050
Advanced    0330f083af1e5fdeb380b1b4ce8271624a80b44e932740bb0a6e3f6f67b3ece5
Enterprise  adf1469c7288da67da15dee5f09b0c1b8a25b910c6db2720748f20de19ccb8f7
Premium     7589e1be89fbac1cf4a73ec3d7496c2b6d805e15a2620392bf50d35c3a5cc184

Another way to check is to extract the patch tarball and then check the Build Date of the serviceguard core rpm component. For example:

# cd /tmp
# tar -xf DVD_HPE_SGLX_12.80.05_Enterprise_x86_Media_BB097-11015-patch-128005.tar
# cd BB097-11015/
# rpm -qip SLES/SLES15/Serviceguard/x86_64/serviceguard-A.12.80.05-0.sles15.x86_64.rpm | grep "Build Date"
Build Date  : Wed Jan 25 07:00:26 2023
#

If the Build Date displays January 25, 2023, the downloaded file is the version of 12.80.05 that contains this issue.

Finally, check for "signature" fields in the actual driver by running a command similar to the following:

# modinfo /lib/modules/$(uname -r)/extra/deadman.ko|grep ^sig
#

If the command returns no output, the driver is unsigned.
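The checksum comparison described earlier in this section can also be scripted. The following is a minimal sketch, assuming the four checksums listed in this advisory are the ones to match; the names FIXED_SUMS, edition_for_sum and check_tarball are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Illustrative sketch; variable and function names are hypothetical.
# Checksums copied from the A.12.80.05 table in this advisory.
FIXED_SUMS="
4badccf7c7ec22aad27709a93ee02796320db07ecb6e6b9dab8ca97248e52050 Base
0330f083af1e5fdeb380b1b4ce8271624a80b44e932740bb0a6e3f6f67b3ece5 Advanced
adf1469c7288da67da15dee5f09b0c1b8a25b910c6db2720748f20de19ccb8f7 Enterprise
7589e1be89fbac1cf4a73ec3d7496c2b6d805e15a2620392bf50d35c3a5cc184 Premium
"

# edition_for_sum: prints the edition whose listed checksum equals $1.
# Fails (non-zero) if $1 is empty or matches none of the listed checksums.
edition_for_sum() {
    [ -n "$1" ] || return 1
    echo "$FIXED_SUMS" | awk -v sum="$1" \
        '$1 == sum { print $2; found = 1 } END { exit !found }'
}

# check_tarball: computes the sha256sum of the file at $1 and reports
# whether it matches one of the listed checksums.
check_tarball() {
    tarball=$1
    if [ ! -r "$tarball" ]; then
        echo "cannot read: $tarball"
        return 2
    fi
    sum=$(sha256sum "$tarball" | awk '{ print $1 }')
    if edition=$(edition_for_sum "$sum"); then
        echo "checksum matches the $edition bundle listed in the Release Notes"
    else
        echo "checksum does not match any listed bundle"
        return 1
    fi
}

# Example (the path is a placeholder for the downloaded file):
# check_tarball /tmp/patch-bundle.tar
```

A non-matching result should still be confirmed with the Build Date check before concluding which release was downloaded.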
To resolve the issue, HPE has re-released HPE Serviceguard for Linux 12.80.05 on the support portal with a new release date of March 21, 2023. The sha256sum checksums documented in the Release Notes match the re-released version. The new rpm Build Date displays March 20, 2023.

For any of the scenarios below, first go to https://myenterpriselicense.hpe.com, log in with appropriate credentials, and download the new 12.80.05 patch bundle.

Scenario 1 - The node has already been rebooted, the deadman driver is currently not loaded, and the node is not in a halt detached state. In this case, simply force reinstall the Serviceguard core rpm as follows:

Unpack the tar ball into /tmp or the directory of your choice (/tmp is used in this example).

# cd /tmp

Extract the March 21, 2023 tar ball.

# tar -xf /tmp/HPE_SGLX_12.80.05_Enterprise_x86_Media_BB097-11015-patch-128005.tar

Locate the rpm in the extracted directory structure that matches your installed operating system.

# cd /tmp/BB097-11015/
# grep VERSION /etc/os-release
VERSION="15-SP2"
VERSION_ID="15.2"

Verify the correct Build Date and force reinstall the new Serviceguard 12.80.05 core rpm. There is no need to reinstall the other rpms in the patched tar ball.

# rpm -qi -p ./SLES/SLES15/Serviceguard/x86_64/serviceguard-A.12.80.05-0.sles15.x86_64.rpm | grep "Build Date"
Build Date  : Mon Mar 20 10:28:36 2023    <== Fixed version
# rpm -ivh --force ./SLES/SLES15/Serviceguard/x86_64/serviceguard-A.12.80.05-0.sles15.x86_64.rpm

Note: The reinstall may take a while to complete. Be patient. This is because Serviceguard commands in the install scripts are not responding and take some time to time out.

Observe the output carefully. You should see the fixed deadman driver being loaded automatically by the install script:

...
Loaded deadman with CONFIG_HZ(value:250) and Root Disk /dev/sda2(size:18802016256).
...

NOTE: This message only indicates that the new module was copied to the /lib/modules/$(uname -r)/extra/ directory. It does not indicate whether or not the patched deadman is actually loaded into memory. It is only loaded if there is no deadman module currently loaded at the time of patching.

Verify that Serviceguard commands run as expected and that "lsmod | grep deadman" shows deadman is loaded. Start the node in the cluster using cmrunnode or cmruncl as appropriate.

Scenario 2 - 12.80.05 was previously installed using the February 09, 2023 patch tarball, but the node is up and running packages. This can occur if Serviceguard was updated without rebooting the node. In this case the currently loaded deadman module is the old 12.80.00 or 12.80.03 deadman that did not get unloaded and replaced during the patch upgrade. It is IMPORTANT to note that at the next reboot the module will fail to load and the node will not join the cluster.

If the node is not running any workload, it is easier to halt the node in the cluster with cmhaltnode, force install the rpm, and then reboot the node. However, if application downtime is an issue, the procedure below can be followed to update the module without application downtime.

Check whether this is the case by comparing the srcversion of the loaded deadman to the srcversion of the deadman.ko in the /lib/modules/$(uname -r)/extra directory. For example:

Loaded module

# cat /sys/module/deadman/srcversion
9C05BC5E84B074FA4ACF99B

Module on disk that will be loaded at next reboot

# strings /lib/modules/$(uname -r)/extra/deadman.ko|grep srcversion=
srcversion=0BAFBA10843925E32EEF7BA
# modinfo /lib/modules/$(uname -r)/extra/deadman.ko|grep ^sig    #Note returning no output indicates an unsigned module
#

Note that the versions are different. The actual strings are not important; what matters is that they differ, which indicates that the loaded module is not the module that will be loaded at the next reboot.
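The srcversion comparison above can be sketched as a short script. This is an illustration only; the function names srcversion_from_strings and compare_srcversion are hypothetical, and the script assumes the same file locations used in the example above.

```shell
#!/bin/sh
# Illustrative sketch; function names are hypothetical.

# srcversion_from_strings: reads `strings deadman.ko` output on stdin and
# prints the value of the first "srcversion=" line.
srcversion_from_strings() {
    sed -n 's/^srcversion=//p' | head -n 1
}

# compare_srcversion LOADED ONDISK: prints "match" or "mismatch".
# Returns 0 on match, 1 on mismatch, 2 if either value is empty.
compare_srcversion() {
    loaded=$1
    ondisk=$2
    if [ -z "$loaded" ] || [ -z "$ondisk" ]; then
        echo "unknown"
        return 2
    fi
    if [ "$loaded" = "$ondisk" ]; then
        echo "match"
        return 0
    fi
    echo "mismatch"
    return 1
}

# Example (run on the node; requires deadman to be currently loaded):
# loaded=$(cat /sys/module/deadman/srcversion)
# ondisk=$(strings "/lib/modules/$(uname -r)/extra/deadman.ko" | srcversion_from_strings)
# compare_srcversion "$loaded" "$ondisk"
```

A "mismatch" result means the module in memory differs from the one on disk, which is the Scenario 2 condition.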
To fix this without halting the application, perform the following on each node (similar to a rolling upgrade).

Use the Live Application Detach method to halt the node from the cluster without bringing down Serviceguard packaged applications.

# cmhaltnode -d

Unpack the tar ball into /tmp or the directory of your choice (/tmp is used in this example).

# cd /tmp

Extract the March 21, 2023 tar ball.

# tar -xf /tmp/HPE_SGLX_12.80.05_Enterprise_x86_Media_BB097-11015-patch-128005.tar

Locate the rpm in the extracted directory structure that matches the installed operating system.

# cd /tmp/BB097-11015/
# grep VERSION /etc/os-release
VERSION="15-SP2"
VERSION_ID="15.2"

Verify the correct Build Date and force reinstall the new Serviceguard 12.80.05 core rpm. There is no need to reinstall the other rpms in the patched tar ball.

# rpm -qi -p ./SLES/SLES15/Serviceguard/x86_64/serviceguard-A.12.80.05-0.sles15.x86_64.rpm | grep "Build Date"
Build Date  : Mon Mar 20 10:28:36 2023    <== Fixed version
# rpm -ivh --force ./SLES/SLES15/Serviceguard/x86_64/serviceguard-A.12.80.05-0.sles15.x86_64.rpm

Note: The reinstall may take a while to complete. Be patient. This is because Serviceguard commands in the install scripts are not responding and take some time to time out.

Observe the output carefully. A message that the fixed deadman driver was loaded automatically by the install script should be displayed:

...
Loaded deadman with CONFIG_HZ(value:250) and Root Disk /dev/sda2(size:18802016256).
...

NOTE: This message only indicates that the new module was copied to the /lib/modules/$(uname -r)/extra/ directory. It does not indicate whether or not the patched deadman is actually loaded into memory. It is only loaded if there is no deadman module currently loaded at the time of patching.

To actually unload the old deadman module and load the new, correctly signed module after the rpm force reinstall, perform the following steps.

Note: Not all of the services listed below are guaranteed to be running and enabled on every cluster, but they are on most clusters. Assuming that all are enabled, all of them must be stopped in order to unload the currently loaded deadman. If there are any questions, contact HPE Support for additional information.

# systemctl stop cmsnmpd
# systemctl stop cmwbemd
# systemctl stop jetty-sgmgr.service
# systemctl stop hacl-cfg.socket
# systemctl stop hacl-cfgudp.socket
# modprobe -r deadman    #This should report no errors
# lsmod | grep deadman    #This should return nothing if deadman is unloaded
# modprobe -v deadman    #This should return an indication that deadman was loaded
# lsmod | grep deadman    #This should now show the new deadman is loaded
# systemctl start SGSafetyTimer.service    #Not entirely necessary but good to reset the status of the systemd service
# systemctl start hacl-cfgudp.socket
# systemctl start hacl-cfg.socket
# systemctl start jetty-sgmgr.service
# systemctl start cmwbemd
# systemctl start cmsnmpd

Verify that Serviceguard commands run as expected and that "lsmod | grep deadman" displays deadman is loaded.

Rejoin the node to the cluster and resume Serviceguard monitoring and HA for packaged applications.

# cmrunnode

Scenario 3 - 12.80.05 was previously installed using the February 09, 2023 patch tarball, the node is up and running packages, and Secure Boot is not enabled but you plan to enable it in the future. In this case, force upgrade the Serviceguard core rpm before enabling Secure Boot and rebooting the node. Since enabling Secure Boot requires a reboot, the recommendation is to force reinstall Serviceguard in a manner similar to Scenario 1 above BEFORE rebooting the server.

Halt the node or cluster using cmhaltnode or cmhaltcl.

With the node halted from the cluster, unpack the tar ball into /tmp or the directory of your choice (/tmp is used in this example).
# cd /tmp

Extract the March 21, 2023 tar ball that you downloaded previously.

# tar -xf /tmp/HPE_SGLX_12.80.05_Enterprise_x86_Media_BB097-11015-patch-128005.tar

Locate the rpm in the extracted directory structure that matches your installed operating system.

# cd /tmp/BB097-11015/
# grep VERSION /etc/os-release
VERSION="15-SP2"
VERSION_ID="15.2"

Verify the correct Build Date and force reinstall the new Serviceguard 12.80.05 core rpm. There is no need to reinstall the other rpms in the patched tar ball.

# rpm -qi -p ./SLES/SLES15/Serviceguard/x86_64/serviceguard-A.12.80.05-0.sles15.x86_64.rpm | grep "Build Date"
Build Date  : Mon Mar 20 10:28:36 2023    <== Fixed version
# rpm -ivh --force ./SLES/SLES15/Serviceguard/x86_64/serviceguard-A.12.80.05-0.sles15.x86_64.rpm

After the rpm force reinstall completes, proceed with the rest of the Secure Boot enabling instructions from the Serviceguard documentation. Upon the reboot that enables Secure Boot, the new, correctly signed deadman module should be loaded. After the reboot, Serviceguard commands should work normally and the node can be rejoined to the cluster.

If further assistance is needed with this issue, contact your local country HPE Customer Support or log a case via the HPE Support Center.