I had 8 paths go into a dead state on an ESXi host. They were MRU (Most Recently Used) paths over Fibre Channel to a storage array. The one path that still worked was configured as an RR (Round Robin) path.
I knew this wasn’t a physical issue; it had to be a software/configuration issue on my host because:
- There were no storage array errors
- The other hosts in the cluster had no problems
- One path from the HBA still worked
Looking at the log (/var/log/vmkernel.log), I searched for one of the LUN identifiers, in my case “:L30”, which belonged to one of the dead paths. This turned up an error from the NMP plugin driver reporting an invalid command.
The next step was to check the NMP details on this host and compare them against a working host.
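The log search described above can be sketched as a small helper. This is only a sketch: the function name `find_lun_errors` and the 20-line limit are my own choices, not anything ESXi provides. Note that a fixed-string match on “:L30” would also match “:L300” and similar, so the results may need a second look.

```shell
# Hypothetical helper: print the most recent log lines that mention a path suffix.
# Usage: find_lun_errors ":L30" [logfile]
find_lun_errors() {
    lun="$1"
    # Default to the vmkernel log named in this post; allow another file for testing.
    log="${2:-/var/log/vmkernel.log}"
    # -F treats the pattern as a literal string; -- guards against patterns starting with "-".
    grep -F -- "$lun" "$log" | tail -n 20
}
```

Calling `find_lun_errors ":L30"` on the affected host should show the lines for that LUN, including any NMP errors logged against it.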
esxcli storage nmp device list |grep "Path Selection Policy:" |sort |uniq -c
I saw nothing out of the ordinary.
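The policy summary does not show the SATP claim rules themselves, so it can help to list those as well and compare them between hosts. A minimal sketch, assuming the rule of interest sits under VMW_SATP_ALUA (the SATP named in the removal command in this post); the function name and the output file path are hypothetical:

```shell
# Hypothetical helper: save this host's ALUA claim rules to a file for comparison.
# Run it on both the broken host and a working host, then diff the two files.
dump_alua_rules() {
    out="${1:-/tmp/satp_rules_$(hostname).txt}"
    # Lists the claim rules registered for the given SATP.
    esxcli storage nmp satp rule list -s VMW_SATP_ALUA > "$out"
    echo "$out"
}
```

A rule that appears on only one host, or that names a different default PSP, is a likely candidate for the kind of mismatch described here.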
Apparently the storage did not like the use of RR, so I removed the SATP claim rule that set it as the path selection policy:
esxcli storage nmp satp rule remove -V IBM -M "^1746*" -P VMW_PSP_RR -s VMW_SATP_ALUA
Storage paths are happy now.