Symptom
Endpoints are being deleted constantly in the ACI fabric.
Conditions
After a flush event caused by a TCN (topology change notification), or when an EP (endpoint) is continuously flapping, multiple EPs are deleted. The EPM trace log shows the following failure:
cd /var/log/dme/log
cat epm-trace.txt
[2017 Apr 21 10:07:20.327664467:14407842:mcec_tl_cb_recv_msg:2098:t] Rcvd msg from peer: ver 3, opc 243720, rr_token 0x18ed5
[2017 Apr 21 10:07:20.327692904:14407843:epm_debug_dump_epm_ep_req:347:t]
EP req - SAP 65535
EP_OP UPD MAC: = 5254.0010.f73a, #IPs 1
IP : 10.70.255.31
VLAN : 90 EPG vnid : 16547722 BD vnid : 16547722 VRF vnid : 2818048
ifidx : 0 TEP ip/tun if idx : 0xac1f1059 sclass : 0
Flags : bounce-to-proxy,IP,MAC, <---------
EP Req TS : 04/21/2017 10:07:31.726458
[2017 Apr 21 10:07:20.327829890:14407847:epm_mcec_pre_process_ep_req:885:E] Failed to get tunnel if idx of remote ToR 172.31.16.89, err : no such pss key <--------------
[2017 Apr 21 10:07:20.327831178:14407848:epm_send_ep_del_ack_to_peer:1140:t] EP req for EP for which FD/BD/VRF/Tun doesn't exist, deleting EP from EP Db, if it exists
[2017 Apr 21 10:07:20.327832981:14407849:epm_process_ep_del:1929:t] Delete req rcvd for EP:
[2017 Apr 21 10:07:20.327835330:14407850:epm_debug_dump_epm_ep:385:t]
EP entry
MAC: = 5254.0010.f73a, #IPs 1
IP : 10.70.255.31
VLAN : 19 EPG vnid : 10045 BD vnid : 16547722 VRF vnid : 2818048
ifidx : 0x1a01b000 tun if idx : 0 vtep tun if idx : 0
sclass : 32775 ref cnt : 5
Flags : local,IP,MAC,sclass,timer,
Create TS : 04/20/2017 23:18:49.002779
Upd TS : 04/21/2017 10:07:31.711788
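To check how often this failure occurs and which remote ToR addresses are involved, the trace can be scanned for the error line shown above. The following is a minimal sketch, not a supported tool: it assumes the trace has been copied off the switch to a host with Python 3, and the function names (count_tunnel_failures, summarize) are hypothetical helpers introduced here for illustration. The search pattern is taken verbatim from the failure line in the log excerpt.

```python
import re
from collections import Counter

# Pattern copied from the failure line in the trace excerpt above.
FAILURE_RE = re.compile(
    r"Failed to get tunnel if idx of remote ToR (\d+\.\d+\.\d+\.\d+), "
    r"err : no such pss key"
)


def count_tunnel_failures(lines):
    """Return a Counter mapping remote ToR IP to the number of failure lines."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def summarize(path):
    """Print one 'ToR IP <tab> count' line per remote ToR found in the file."""
    # errors="replace" keeps the scan going if the trace has stray bytes.
    with open(path, errors="replace") as fh:
        for tor_ip, hits in count_tunnel_failures(fh).most_common():
            print(f"{tor_ip}\t{hits}")
```

For example, calling summarize("epm-trace.txt") on a local copy of the trace prints each remote ToR address with the number of times the tunnel lookup failed for it.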
Workaround
Reboot the switch.
Upgrade to release 1.3(1) or later.