Info
Any of the HPE ProLiant Gen10 Plus servers listed in the Scope section below may encounter a Server Critical Fault when powering on after multiple "Soft Off" events. On servers configured with multiple GPUs, repeated "Soft Off" events can cause electrical components to enter a degraded state, which prevents the server from powering on.
The following error message may be registered in the IML:
Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFUSE (10h))
Scope
Any of the following servers configured with 2 x Dual Wide GPUs (300 W or greater) or with 4 x Single Wide GPUs (150 W or greater):
HPE ProLiant DL345 Gen10 Plus server
HPE ProLiant DL380 Gen10 Plus server
HPE ProLiant DL385 Gen10 Plus server
HPE ProLiant DL385 Gen10 Plus v2 server
HPE ProLiant DL325 Gen10 Plus v2 server
HPE ProLiant DL325 Gen10 Plus server
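The scope criteria above can be expressed as a small check. This is a minimal sketch, not an HPE tool: the `Gpu` record and the `in_scope` function are hypothetical names, and treating "2 x" and "4 x" as "at least that many qualifying GPUs" is an assumption about how the scope statement should be read.

```python
from dataclasses import dataclass
from typing import List

# Model names copied from the Scope list of this advisory.
AFFECTED_MODELS = {
    "DL345 Gen10 Plus",
    "DL380 Gen10 Plus",
    "DL385 Gen10 Plus",
    "DL385 Gen10 Plus v2",
    "DL325 Gen10 Plus v2",
    "DL325 Gen10 Plus",
}


@dataclass(frozen=True)
class Gpu:
    """Hypothetical inventory record for one installed GPU."""
    double_wide: bool
    watts: int


def in_scope(model: str, gpus: List[Gpu]) -> bool:
    """Return True if the configuration matches the advisory's scope.

    Assumption: "2 x" / "4 x" is read as "at least 2" / "at least 4".
    """
    if model not in AFFECTED_MODELS:
        return False
    dual = sum(1 for g in gpus if g.double_wide and g.watts >= 300)
    single = sum(1 for g in gpus if not g.double_wide and g.watts >= 150)
    return dual >= 2 or single >= 4
```

For example, `in_scope("DL380 Gen10 Plus", [Gpu(True, 300), Gpu(True, 350)])` evaluates to `True`, while the same server with a single 300 W Dual Wide GPU does not match.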
Resolution
This issue is resolved by updating the PIC firmware to version 1.1.4, updating the System Programmable Logic Device (CPLD) firmware, and changing the BIOS/Platform Configuration (RBSU) > Automatic Power-On setting to "Always Power On".
The PIC - Advanced Power Capping Microcontroller Firmware for HPE Gen10 and Gen10 Plus Server firmware version 1.1.4 is available here.
CPLD firmware - Contact HPE Support and reference Doc ID a00138183en_us to obtain the firmware and installation instructions.
ProLiant DL380 Gen10 Plus: CPLD version v1717.
ProLiant DL385 Gen10 Plus and ProLiant DL385 Gen10 Plus v2: CPLD version v3131.
ProLiant DL345 Gen10 Plus, DL325 Gen10 Plus and DL325 Gen10 Plus v2: CPLD version v1E1E.
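The model-to-CPLD mapping above lends itself to a lookup table. The sketch below is illustrative only; `REQUIRED_CPLD` and `cpld_matches` are hypothetical names. It tests for an exact match with the listed version and assumes nothing about whether other CPLD versions also carry the fix, since the advisory does not say.

```python
# CPLD versions copied from this advisory, keyed by server model.
REQUIRED_CPLD = {
    "DL380 Gen10 Plus": "v1717",
    "DL385 Gen10 Plus": "v3131",
    "DL385 Gen10 Plus v2": "v3131",
    "DL345 Gen10 Plus": "v1E1E",
    "DL325 Gen10 Plus": "v1E1E",
    "DL325 Gen10 Plus v2": "v1E1E",
}


def _normalize(version: str) -> str:
    """Uppercase the version and drop an optional leading 'v'."""
    v = version.strip().upper()
    return v[1:] if v.startswith("V") else v


def cpld_matches(model: str, installed: str) -> bool:
    """Return True if the installed CPLD version equals the listed one.

    Raises KeyError for a model that is not in the advisory's list.
    """
    return _normalize(installed) == _normalize(REQUIRED_CPLD[model])
```

Version strings are compared for equality only, because the advisory gives no ordering rule for CPLD version identifiers.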
Flash sequence:
Update the CPLD firmware first.
VERY IMPORTANT:
When installing the CPLD firmware, monitor the update status and do not interrupt AC power until the CPLD flashing is complete. Once flashing completes, the CPLD forces a power cycle; a reboot is then required, which can be done remotely.
Conduct an AC power cycle.
Flash the Power PIC firmware to version 1.1.4.
Reboot the system.
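Because the order of the flash sequence matters, a runbook or script may want to hold the steps as ordered data. The following is a minimal sketch under that assumption; `FLASH_SEQUENCE` and `next_step` are hypothetical names, and the code only tracks progress through the steps, it does not perform any firmware update.

```python
from typing import Optional

# Steps paraphrased from the flash sequence in this advisory, in order.
FLASH_SEQUENCE = (
    "Update the CPLD firmware and wait for flashing to complete",
    "Conduct an AC power cycle",
    "Flash the Power PIC firmware to version 1.1.4",
    "Reboot the system",
)


def next_step(completed: int) -> Optional[str]:
    """Return the next step given how many steps are already done.

    Returns None once every step has been completed.
    """
    if not 0 <= completed <= len(FLASH_SEQUENCE):
        raise ValueError("completed must be between 0 and the number of steps")
    if completed == len(FLASH_SEQUENCE):
        return None
    return FLASH_SEQUENCE[completed]
```

Keeping the steps in a tuple makes it harder for a caller to reorder them, for example by flashing the Power PIC firmware before the CPLD firmware.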
After the firmware update, change the Automatic Power-On setting to "Always Power On" under System Configuration > BIOS/Platform Configuration (RBSU) > System Options > Server Availability > Automatic Power-On. If this is set to "Always Power Off" or "Restore Last Power State", the system may not power on automatically.
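An audit script could flag servers whose Automatic Power-On setting is not the recommended one. This sketch assumes the setting has already been read as the display string shown in RBSU; how it is read is outside the scope of the example, and `auto_power_on_ok` is a hypothetical helper.

```python
# Display strings as they appear in this advisory.
RECOMMENDED_SETTING = "Always Power On"
OTHER_SETTINGS = ("Always Power Off", "Restore Last Power State")


def auto_power_on_ok(setting: str) -> bool:
    """Return True only if the setting is the recommended value.

    The comparison ignores case and surrounding whitespace.
    """
    return setting.strip().casefold() == RECOMMENDED_SETTING.casefold()
```

With either of the other two settings the function returns `False`, which corresponds to the case where the system may not power on automatically.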
Note: After the firmware updates are completed, the following false error may be logged when the fix acts to prevent an actual failure. This false error message can be ignored.
Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFUSE (40h))
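The two IML messages in this advisory differ only in the code in parentheses (10h for the fault, 40h for the false error after the update). A log-triage script could tell them apart as sketched below. The function name, its return labels, and the `firmware_updated` flag are assumptions made for illustration; codes other than 10h and 40h are not covered by this advisory and are reported as unknown.

```python
import re

# Matches the EFUSE message text quoted in this advisory and captures the hex code.
_EFUSE = re.compile(
    r"Power On Fault, System Board, AUX/Main EFUSE \(([0-9A-Fa-f]+)h\)"
)


def classify_iml(message: str, firmware_updated: bool) -> str:
    """Classify an IML entry against the two messages in this advisory."""
    match = _EFUSE.search(message)
    if match is None:
        return "unrelated"
    code = match.group(1).upper()
    if code == "10":
        return "advisory-fault"
    if code == "40" and firmware_updated:
        # Described as a false error only after the firmware updates.
        return "ignorable-false-error"
    return "unknown"
```

The 40h message is treated as ignorable only when `firmware_updated` is true, because the advisory describes it as a false error specifically after the firmware updates are completed.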
Revision History
Document Version | Release Date | Details
2 | August 7, 2024 | Added the ProLiant DL325 Gen10 Plus and ProLiant DL325 Gen10 Plus v2 to affected products.
1 | March 11, 2024 | Original Document Release.