...
When the PowerEdge M1000e blade enclosure that houses your blade servers encounters a problem, an error message is displayed on the LCD screen on the front of the chassis or recorded in the Chassis Management Controller (CMC) System Event Log (SEL). The following tables list possible error messages and their causes so that you can fix the error and clear the message. The Events and Error Message Reference Guide for Dell EMC PowerEdge servers provides information about all the events and error messages generated by the system firmware and agents that monitor system components.

CMC Status Screen error messages

| Severity | Error message | Cause |
|---|---|---|
| Critical | CMC Battery: Battery sensor for CMC, failed was asserted | The CMC CMOS battery is missing or has no voltage. |
| Critical | CMC CPU Temp: Temperature sensor for CMC, failure event | The CMC CPU temperature has exceeded the critical threshold. |
| Critical | CMC Ambient Temp: Temperature sensor for CMC, failure event | The CMC ambient temperature has exceeded the critical threshold. |

Enclosure/Chassis Status Screen error messages

| Severity | Error message | Cause |
|---|---|---|
| Critical | Chassis Fan presence: Fan sensor for Chassis Fan, device removed was asserted | The removed fan is required for proper cooling of the enclosure/chassis. |
| Critical | Power Supply Redundancy: PS Redundancy sensor for Power Supply, redundancy lost was asserted | One or more power supply units (PSUs) have failed or been removed, so the system is no longer redundant. |
| Critical | Power Supply Redundancy: PS Redundancy sensor for Power Supply, non-redundant: insufficient resources | One or more PSUs have failed or been removed, and the system lacks enough power to maintain normal operations. Servers could power down. |
| Critical | Control Panel Temp: Temperature sensor for Control Panel, failure event | The chassis/enclosure temperature has exceeded the critical threshold. |
| Critical | CMC Stand-alone: Micro Controller sensor for CMC, non-redundant was asserted | The CMC is no longer redundant. This message is shown only if the standby CMC was removed or has failed. |
| Critical | Chassis Eventlog CEL: Event Log sensor for Chassis Eventlog, all event logging disabled was asserted | The CMC cannot log events when the Event Log sensor is disabled. The Event Log is disabled when it becomes full. Clearing the log re-enables event logging. |
| Critical | Chassis Eventlog CEL: Event Log sensor for Chassis Eventlog, log full was asserted | The chassis device detects that only one entry can be added to the CEL before it is full. |
| Warning | Chassis Eventlog CEL: Event Log sensor for Chassis Eventlog, log almost full was asserted | The chassis event log is 75% full. |
| Warning | Power Supply Redundancy: PS Redundancy sensor for Power Supply, redundancy degraded was asserted | One or more PSUs have failed or been removed, and the system can no longer support full PSU redundancy. |

Fan Status Screen error messages

| Severity | Error message | Cause |
|---|---|---|
| Critical | Chassis Fan Status: Fan sensor for Chassis Fan, failure event | The speed of the specified fan is not sufficient to provide enough cooling to the system. |

IOM Status Screen error messages

| Severity | Error message | Cause |
|---|---|---|
| Critical | I/O Module Status: Module sensor for I/O Module, transition to critical from less severe was asserted | The I/O module has a fault. The same error can also occur if the I/O module is thermal-tripped. |
| Warning | I/O Module Status: Module sensor for I/O Module, transition to non-critical from OK was asserted | The I/O module has a fabric mismatch or a link tuning mismatch. |

iKVM Status Screen error messages

| Severity | Error message | Cause |
|---|---|---|
| Non-Recoverable | Local KVM Health: Module sensor for Local KVM, transition to non-recoverable was asserted | The Serial RIP or USB host chip has failed. |
| Critical | Local KVM Health: Module sensor for Local KVM, transition to critical from less severe was asserted | USB host enumeration or OSCAR has failed. |
| Warning | Local KVM Health: Module sensor for Local KVM, transition to non-critical from OK was asserted | There has been a minor failure, such as corrupted firmware. |

PSU Status Screen error messages

| Severity | Error message | Cause |
|---|---|---|
| Critical | Power Supply PSU : Power Supply sensor for Power Supply, failure was asserted | The PSU has failed. |
| Critical | Power Supply PSU : Power Supply sensor for Power Supply, input lost was asserted | The AC power cord has been unplugged or there has been a loss of AC power. |

Server Status Screen error messages for M1000e Blade servers

| Severity | Error message | Cause |
|---|---|---|
| Warning | System Board Ambient Temp: Temperature sensor for System Board, warning event | The server ambient temperature crossed a warning threshold. |
| Critical | System Board Ambient Temp: Temperature sensor for System Board, failure event | The server ambient temperature crossed a failing threshold. |
| Critical | System Board CMOS Battery: Battery sensor for System Board, failed was asserted | The CMOS battery is not present or has no voltage. |
| Warning | System Board Current Monitor: Current sensor for System Board, warning event | The current crossed a warning threshold. |
| Critical | System Board Current Monitor: Current sensor for System Board, failure event | The current crossed a failing threshold. |
| Critical | : Voltage sensor for System Board, state asserted was asserted | The voltage is out of range. |
| Critical | CPU Status: Processor sensor for CPU | The CPU has failed. |
| Critical | CPU Status: Processor sensor for CPU, thermal tripped was asserted | The CPU has overheated. |
| Critical | CPU Status: Processor sensor for CPU | The processor is the incorrect type or is in the wrong location. |
| Critical | CPU Status: Processor sensor for CPU, presence was deasserted | The required CPU is missing or not present. |
| Critical | System Board Video Riser: Module sensor for System Board, device removed was asserted | The required module was removed. |
| Critical | Mezz B Status: Add-in Card sensor for Mezz B, install error was asserted | The incorrect mezzanine card is installed for the I/O fabric. |
| Critical | Mezz C Status: Add-in Card sensor for Mezz C, install error was asserted | The incorrect mezzanine card is installed for the I/O fabric. |
| Critical | Backplane Drive : Drive Slot sensor for Backplane, drive removed | The storage drive was removed. |
| Critical | Backplane Drive : Drive Slot sensor for Backplane, drive fault was asserted | The storage drive failed. |
| Critical | System Board PFault Fail Safe: Voltage sensor for System Board, state asserted was asserted | The system board voltages are not at normal levels. |
| Critical | System Board OS Watchdog: Watchdog sensor for System Board, reboot was asserted | The iDRAC watchdog detected that the system has crashed (the timer expired because no response was received from the host) and the action is set to reboot. |
| Critical | System Board OS Watchdog: Watchdog sensor for System Board, power off was asserted | The iDRAC watchdog detected that the system has crashed (the timer expired because no response was received from the host) and the action is set to power off. |
| Critical | System Board OS Watchdog: Watchdog sensor for System Board, power cycle was asserted | The iDRAC watchdog detected that the system has crashed (the timer expired because no response was received from the host) and the action is set to power cycle. |
| Critical | System Board SEL: Event Log sensor for System Board, log full was asserted | The SEL device has detected that only one entry can be added to the SEL before it is full. |
| Warning | ECC Corr Err: Memory sensor, correctable ECC ( ) was asserted | Correctable ECC errors have reached a critical rate. |
| Critical | ECC Uncorr Err: Memory sensor, uncorrectable ECC ( ) was asserted | An uncorrectable ECC error was detected. |
| Critical | I/O Channel Chk: Critical Event sensor, I/O channel check NMI was asserted | A critical interrupt has been generated in the I/O channel. |
| Critical | PCI Parity Err: Critical Event sensor, PCI PERR was asserted | A parity error was detected on the PCI bus. |
| Critical | PCI System Err: Critical Event sensor, PCI SERR ( ) was asserted | The device detected a PCI error. |
| Critical | SBE Log Disabled: Event Log sensor, correctable memory error logging disabled was asserted | Single-bit error (SBE) logging is disabled when too many SBEs are logged. |
| Critical | Logging Disabled: Event Log sensor, all event logging disabled was asserted | All error logging is disabled. |
| Non-Recoverable | CPU Protocol Err: Processor sensor, transition to non-recoverable was asserted | The processor protocol has entered a non-recoverable state. |
| Non-Recoverable | CPU Bus PERR: Processor sensor, transition to non-recoverable was asserted | The processor bus PERR has entered a non-recoverable state. |
| Non-Recoverable | CPU Init Err: Processor sensor, transition to non-recoverable was asserted | The processor initialization has entered a non-recoverable state. |
| Non-Recoverable | CPU Machine Chk: Processor sensor, transition to non-recoverable was asserted | The processor machine check has entered a non-recoverable state. |
| Critical | Memory Spared: Memory sensor, redundancy lost ( ) was asserted | The spare memory is no longer redundant. |
| Critical | Memory Mirrored: Memory sensor, redundancy lost ( ) was asserted | The mirrored memory is no longer redundant. |
| Critical | Memory RAID: Memory sensor, redundancy lost ( ) was asserted | The RAID memory is no longer redundant. |
| Critical | Memory Cfg Err: Memory sensor, configuration error ( ) was asserted | The memory configuration is incorrect for the system. |
| Warning | Mem Redun Gain: Memory sensor, redundancy degraded ( ) was asserted | The memory redundancy is downgraded but not lost. |
| Critical | PCIE Fatal Err: Critical Event sensor, bus fatal error was asserted | A fatal error was detected on the PCI bus. |
| Critical | Chipset Err: Critical Event sensor, PCI PERR was asserted | A chip error was detected. |
| Warning | Mem ECC Warning: Memory sensor, transition to non-critical from OK ( ) was asserted | Correctable ECC errors have surpassed a normal rate. |
| Critical | Mem ECC Warning: Memory sensor, transition to critical from less severe ( ) was asserted | Correctable ECC errors have reached a critical rate. |
| Critical | System Board POST Err: POST sensor for System Board, POST fatal error was asserted | See the Dell PowerEdge M1000e Enclosure Owner's Manual for additional information on BIOS POST errors. |
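Most messages in the tables above appear to share one shape: a sensor name, a colon, a sensor type followed by the word "sensor", an optional "for" entity, and an event that may end in "was asserted" or "was deasserted". If you export log text and want to sort or filter it, a small parser can split a message into those parts. The following Python sketch is only an illustration based on that observed pattern. The names `SelMessage` and `parse_sel_message` are hypothetical and are not part of any Dell tool or API, and the parser has not been checked against messages outside this article.

```python
import re
from typing import NamedTuple, Optional


class SelMessage(NamedTuple):
    """Parts of one log message (hypothetical structure, for illustration)."""
    sensor_name: str           # text before the first colon, e.g. "CMC Battery"
    sensor_type: str           # e.g. "Battery", "PS Redundancy"
    entity: Optional[str]      # text after "for", or None if absent
    event: str                 # event text without the assertion suffix
    asserted: Optional[bool]   # True/False if stated, None if not stated


# Assumed shape: "<name>: <type> sensor[ for <entity>][, <event>]"
_PATTERN = re.compile(
    r"^(?P<name>[^:]*):\s*"
    r"(?P<type>.+?) sensor"
    r"(?: for (?P<entity>[^,]+))?"
    r"(?:,\s*(?P<event>.+))?$"
)


def parse_sel_message(text: str) -> Optional[SelMessage]:
    """Split a message into its parts; return None if it does not fit the shape."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None

    event = (match.group("event") or "").strip()
    asserted: Optional[bool] = None
    # Strip the trailing assertion marker, if the message has one.
    for suffix, value in ((" was asserted", True), (" was deasserted", False)):
        if event.endswith(suffix):
            event = event[: -len(suffix)]
            asserted = value
            break

    entity = match.group("entity")
    return SelMessage(
        sensor_name=match.group("name").strip(),
        sensor_type=match.group("type").strip(),
        entity=entity.strip() if entity else None,
        event=event,
        asserted=asserted,
    )


if __name__ == "__main__":
    print(parse_sel_message(
        "CMC Battery: Battery sensor for CMC, failed was asserted"
    ))
```

The parser only splits the text. It does not assign a severity, because the same message text can appear with different severities in the tables (for example, the two "Mem ECC Warning" rows differ only in the event text), so any severity lookup would need the full message as its key.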