Symptom
A Cisco IOS-XE device may exhaust overall system memory because of memory growth within linux_iosd-image. Over time, as memory becomes exhausted, the following messages may be observed:
%PLATFORM_INFRA-5-IOS_INTR_OVER_LIMIT_HIGH_STIME: IOS thread blocked due to SYSTEM LEVEL ISSUE for Total: 68 msec, utime: 0 msec, stime: 68 msec.
Top memory allocators are: Process: linux_iosd-imag_rp_0. Tracekey: 1#b35800f3ce359ee4a26c7147c689a9ee Callsite ID: 3627966464 (diff_call: 1084223).
The callsite ID and tracekey depend on the software version, but linux_iosd-imag_rp_0 will show an excessively large diff_call value. Callsite tracing can be enabled against the callsite in question, for review by TAC, using:
debug platform software memory ios switch active r0 alloc backtrace start depth 10
Tracing can then be turned off with:
debug platform software memory ios switch active r0 alloc backtrace
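When collecting logs for TAC, it can help to pull the process name, tracekey, callsite ID, and diff_call value out of the "Top memory allocators" message. The following is a minimal Python sketch, assuming the message is formatted exactly like the sample above; the function name parse_top_allocator is hypothetical and the pattern may need adjusting for other releases.

```python
import re

# Pattern derived only from the sample message in this note (an assumption);
# other software versions may format the line differently.
ALLOC_RE = re.compile(
    r"Process:\s*(?P<process>\S+?)\.\s*"
    r"Tracekey:\s*(?P<tracekey>\S+)\s+"
    r"Callsite ID:\s*(?P<callsite>\d+)\s*"
    r"\(diff_call:\s*(?P<diff_call>\d+)\)"
)


def parse_top_allocator(line):
    """Return the allocator fields from one log line, or None if it does not match."""
    match = ALLOC_RE.search(line)
    if match is None:
        return None
    return {
        "process": match.group("process"),
        "tracekey": match.group("tracekey"),
        "callsite_id": int(match.group("callsite")),
        "diff_call": int(match.group("diff_call")),
    }


if __name__ == "__main__":
    sample = (
        "Top memory allocators are: Process: linux_iosd-imag_rp_0. "
        "Tracekey: 1#b35800f3ce359ee4a26c7147c689a9ee "
        "Callsite ID: 3627966464 (diff_call: 1084223)."
    )
    print(parse_top_allocator(sample))
```

Comparing diff_call for the same callsite ID across several captures gives a rough view of whether the allocation count is still growing.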
Conditions
This is observed when port manager (PM) insertion/removal events occur for a transceiver on a port, for example:
%PLATFORM_PM-6-MODULE_REMOVED: SFP module with interface name Gi1/1/1 removed
%PLATFORM_PM-6-MODULE_INSERTED: SFP module inserted with interface name Gi1/1/1
Workaround
Removing any transceiver that is generating insertion/removal events should stop the PLATFORM_PM messages. This should also stop the memory growth.
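To find which transceivers are generating the events, the PLATFORM_PM messages in a saved syslog can be counted per interface. The following is a minimal Python sketch, assuming the message text matches the two samples under Conditions; the function name count_pm_events and the sample log lines are illustrative only.

```python
import re
from collections import Counter

# Matches the MODULE_REMOVED / MODULE_INSERTED formats shown in this note
# (an assumption; the wording may differ between platforms and releases).
PM_EVENT_RE = re.compile(
    r"%PLATFORM_PM-6-MODULE_(?:REMOVED|INSERTED):.*?interface name\s+(?P<intf>\S+)"
)


def count_pm_events(lines):
    """Count insertion/removal events per interface name."""
    counts = Counter()
    for line in lines:
        match = PM_EVENT_RE.search(line)
        if match:
            counts[match.group("intf")] += 1
    return counts


if __name__ == "__main__":
    log_lines = [
        "%PLATFORM_PM-6-MODULE_REMOVED: SFP module with interface name Gi1/1/1 removed",
        "%PLATFORM_PM-6-MODULE_INSERTED: SFP module inserted with interface name Gi1/1/1",
    ]
    # Interfaces with the most events are the first candidates for removal.
    for intf, events in count_pm_events(log_lines).most_common():
        print(intf, events)
```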
Further Problem Description