...
The System Health Check (SHC) test "bbu_sensor_check" fails on both nodes, reporting "Mismatched part number detected".

    [SVC:service@XXXXXXX-A user]$ svc_health_check run

    Name                                Location  Status   Description
    Internal Disk Space                 node_a    PASSED
    Internal Disk Space                 node_b    PASSED
    CPU IERR Check                      node_a    PASSED
    CPU IERR Check                      node_b    PASSED
    ICD Network Connectivity Check      node_a    PASSED   Test not needed for Block only (Single appliance) or HCI
    Drive Flags Check                   node_a    PASSED
    Configuration Database              node_a    PASSED
    Configuration Database              node_b    PASSED
    Duplicate firmware entry Check      node_a    PASSED
    Duplicate firmware entry Check      node_b    PASSED
    Disk Health                         node_a    PASSED
    DIMM Health                         node_a    PASSED
    OS package name check               node_a    PASSED
    Active System Alerts Check          node_a    PASSED
    External Management Network         node_a    PASSED
    Symmetric ICM Connection Check      node_a    PASSED
    Time skew Check                     node_a    PASSED
    I/O Ports                           node_a    PASSED
    I/O Ports                           node_b    PASSED
    recovery partition image check      node_a    PASSED
    recovery partition image check      node_b    PASSED
    REST Authorization Service          node_a    PASSED
    REST Authorization Service          node_b    PASSED
    Fru FW Upgrade Flag Check           node_a    PASSED
    Initiator Connectivity Redundancy   node_a    PASSED
    FSCK Leftover Check                 node_a    PASSED
    FSCK Leftover Check                 node_b    PASSED
    Pre-Upgrade Check                   node_a    WARNING
    Pre-Upgrade Check                   node_b    PASSED
    I/O Services Pre-Check              node_a    PASSED
    I/O Services Pre-Check              node_b    PASSED
    REST Configuration Service          node_a    PASSED
    REST Data Service                   node_a    PASSED
    Internal System Services            node_a    PASSED
    Internal System Services            node_b    PASSED
    Component SN Check                  node_a    PASSED
    Component SN Check                  node_b    PASSED
    DIMM correctable errors check       node_a    PASSED
    DIMM correctable errors check       node_b    PASSED
    kms lockbox file check              node_a    PASSED
    Internal File System                node_a    PASSED
    Internal File System                node_b    PASSED
    DB Tmp Files Check                  node_a    PASSED
    DB Tmp Files Check                  node_b    PASSED
    Internal Management Network         node_a    WARNING
    SAS Expansion Enclosure Check       node_a    PASSED
    Operational Mode                    node_a    PASSED
    Operational Mode                    node_b    PASSED
    cyc_node space check                node_a    PASSED
    cyc_node space check                node_b    PASSED
    BBU Sensor Check                    node_a    FAILED   Mismatched part number detected on node A (KB#000196197)
    BBU Sensor Check                    node_b    FAILED   Mismatched part number detected on node B (KB#000196197)

The user interface Hardware Component Internal View shows BBU Healthy for both nodes, but the BBU Part Number is different on each node.

The svc_diag hardware check on both Node A and Node B shows that the BBUs are OK. Run the command against both nodes.

    $ svc_diag list --hardware --sub_options fault_status

    Hardware:
    ========== Fault Status register ===========
    Memory
        dimm00: OK | dimm01: OK | dimm02: OK | dimm03: OK | dimm04: OK | dimm05: OK |
        dimm06: OK | dimm07: OK | dimm08: OK | dimm09: OK | dimm10: OK | dimm11: OK |
        dimm12: OK | dimm13: OK | dimm14: OK | dimm15: OK | dimm16: OK | dimm17: OK |
        dimm18: OK | dimm19: OK | dimm20: OK | dimm21: OK | dimm22: OK | dimm23: OK |
    EmbeddedDrve
        Drive00: OK | Drive01: OK |
    BackEndDrive
        Drive00: OK | Drive01: OK | Drive02: OK | Drive03: OK | Drive04: OK |
        Drive05: OK | Drive06: OK | Drive07: OK | Drive08: OK | Drive09: OK |
        Drive10: OK | Drive11: OK | Drive12: OK | Drive13: OK | Drive14: OK |
        Drive15: OK | Drive16: OK | Drive17: OK | Drive18: OK | Drive19: OK |
        Drive20: OK | Drive21: OK | Drive22: OK | Drive23: OK | Drive24: OK |
    I/O Module
        iom00: OK | iom01: OK |
    Mezz
        mez00: OK | mez01: OK |
    PSU
        psu00: OK | psu01: OK |
    BBU
        bbu00: OK | bbu01: OK |
    FAN
        fan00: OK | fan01: OK | fan02: OK | fan03: OK | fan04: OK | fan05: OK | fan06: OK |
        fan07: OK | fan08: OK | fan09: OK | fan10: OK | fan11: OK | fan12: OK | fan13: OK |
    Root I2C
        i2c00: OK | i2c01: OK | i2c02: OK | i2c03: OK | i2c04: OK | i2c05: OK | i2c06: OK | i2c07: OK |
    IO I2C
        i2c00: OK | i2c01: OK | i2c02: OK | i2c03: OK | i2c04: OK | i2c05: OK | i2c06: OK | i2c07: OK |
    Other I2C
        i2c00: OK | i2c01: OK | i2c02: OK | i2c03: OK | i2c04: OK | i2c05: OK | i2c06: OK | i2c07: OK |
    Mezz I2C
        i2c00: OK | i2c01: OK | i2c02: OK | i2c03: OK | i2c04: OK | i2c05: OK | i2c06: OK | i2c07: OK |
    Drive I2C
        i2c00: OK | i2c01: OK | i2c02: OK | i2c03: OK | i2c04: OK |
        i2c05: OK | i2c06: OK | i2c07: OK | i2c08: OK | i2c09: OK |
        i2c10: OK | i2c11: OK | i2c12: OK | i2c13: OK | i2c14: OK |
        i2c15: OK | i2c16: OK | i2c17: OK | i2c18: OK | i2c19: OK |
        i2c20: OK | i2c21: OK | i2c22: OK | i2c23: OK | i2c24: OK |
    System
        Bit 0 CPU Module: OK
        Bit 1 Management Module: OK
        Bit 2 Drive I/O Card 0: OK
        Bit 3 eFlash : OK
        Bit 4 Expansion Bay 0: OK
        Bit 5 Enclosure: OK
        Bit 6 CMI: OK
        Bit 7 All Frus: OK
        Bit 8 External: OK
        Bit 9 Expansion Bay 1: OK
        Bit 10 Drive I/O Card 1: OK

svc_diag shows different part numbers for BBU 0 and BBU 1. Run the command against both nodes.

    $ svc_diag list --hardware --sub_options inventory |grep -A5 Battery

    FRU Device Description : Battery0 (ID 9)
    Board Mfg Date : Tue Aug 18 00:00:00 2020
    Board Mfg : ACBEL POLYTECH INC.
    Board Product : LITHIUM-ION, UNIVERSAL BOB
    Board Serial : ACPW4211500063
    Board Part Number : 078-000-177-02
    --
    FRU Device Description : Battery1 (ID 10)
    Board Mfg Date : Wed Dec 11 00:00:00 2019
    Board Mfg : ACBEL POLYTECH INC.
    Board Product : LITHIUM-ION, UNIVERSAL BOB
    Board Serial : ACPV8200500140
    Board Part Number : 078-000-168-02
The latest system health check thin packages for PowerStoreOS 2.1.x and 3.x incorrectly compare the BBU part number between the two nodes and fail the check if the part numbers are not the same.

All System Health Check packages since PowerStore-health_check-2.1.1.1-1736451 for PowerStoreOS 2.1.x are affected.
All System Health Check packages since PowerStore-health_check-3.0.0.0-1781415 for PowerStoreOS 3.x are affected.

PowerStore supports 11 different BBU part numbers. The system health check expects the part number of both BBUs to be the same, which may not be the case. Any of the part numbers below is valid and supported:

    078-000-168-02
    078-000-177-02
    078-000-192-00
    078-000-195-03
    078-000-211-01
    088-000-168-02
    088-000-177-02
    088-000-192-00
    088-000-195-03
    088-000-211-00
    088-000-211-01
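The comparison against the supported list can be sketched in Python. This is an illustration only, not part of any PowerStore tooling: the function names are hypothetical, and it assumes the inventory output has been saved as text with one field per line, as printed by the svc_diag inventory command in this article.

```python
import re

# Supported BBU part numbers, copied from the list in this article.
SUPPORTED_BBU_PNS = {
    "078-000-168-02", "078-000-177-02", "078-000-192-00", "078-000-195-03",
    "078-000-211-01", "088-000-168-02", "088-000-177-02", "088-000-192-00",
    "088-000-195-03", "088-000-211-00", "088-000-211-01",
}

# Sample text taken from the inventory output shown in this article.
SAMPLE_INVENTORY = """\
FRU Device Description : Battery0 (ID 9)
Board Serial : ACPW4211500063
Board Part Number : 078-000-177-02
--
FRU Device Description : Battery1 (ID 10)
Board Serial : ACPV8200500140
Board Part Number : 078-000-168-02
"""


def bbu_part_numbers(inventory_text):
    """Return {battery name: part number} parsed from saved inventory text."""
    result = {}
    current = None
    for line in inventory_text.splitlines():
        m = re.match(r"\s*FRU Device Description\s*:\s*(Battery\d+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"\s*Board Part Number\s*:\s*(\S+)", line)
        if m and current:
            result[current] = m.group(1)
            current = None
    return result


def unsupported_bbus(inventory_text):
    """Return only the batteries whose part number is not in the supported list."""
    return {
        name: pn
        for name, pn in bbu_part_numbers(inventory_text).items()
        if pn not in SUPPORTED_BBU_PNS
    }


if __name__ == "__main__":
    pns = bbu_part_numbers(SAMPLE_INVENTORY)
    print("Part numbers:", pns)
    print("Part numbers differ:", len(set(pns.values())) > 1)
    print("Unsupported:", unsupported_bbus(SAMPLE_INVENTORY))
```

With the sample text, the two part numbers differ but both are in the supported list, which is the situation this article describes.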
If the BBUs are in a healthy state and the only error produced by running the system health check thin package is "BBU Sensor Check node_x FAILED Mismatched part number detected on node X (KB#000196197)", the error can be ignored and the NDU can be started.

This is fixed in a new release of the system health check thin packages, which is under development.
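The "only error" condition can be checked with a short Python sketch. It assumes the svc_health_check output has been saved as text with one check per line (name, node, status, optional description). The function names are hypothetical, and the sketch does not replace confirming BBU health with svc_diag.

```python
import re

# One result row: check name, node, status, optional description.
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<node>node_[ab])\s+"
    r"(?P<status>PASSED|WARNING|FAILED)\s*(?P<desc>.*)$"
)

# Sample rows taken from the health check output shown in this article.
SAMPLE_OUTPUT = """\
Operational Mode      node_b  PASSED
Pre-Upgrade Check     node_a  WARNING
BBU Sensor Check      node_a  FAILED  Mismatched part number detected on node A (KB#000196197)
BBU Sensor Check      node_b  FAILED  Mismatched part number detected on node B (KB#000196197)
"""


def failed_checks(output):
    """Return (name, node, description) for every FAILED row."""
    rows = []
    for line in output.splitlines():
        m = ROW.match(line.strip())
        if m and m.group("status") == "FAILED":
            rows.append((m.group("name"), m.group("node"), m.group("desc")))
    return rows


def only_bbu_pn_mismatch(output):
    """True when at least one check failed and every failure is the BBU
    part number mismatch described in this article."""
    failed = failed_checks(output)
    return bool(failed) and all(
        name == "BBU Sensor Check" and "Mismatched part number detected" in desc
        for name, _node, desc in failed
    )


if __name__ == "__main__":
    print(only_bbu_pn_mismatch(SAMPLE_OUTPUT))
```

WARNING rows are not evaluated by this sketch and still need to be reviewed separately.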