I'd like to ask the community a question that I could not find an answer to while searching.
A customer of ours noticed that the NetApp disks on one of his filers changed their names from one path to another. The configuration is a single node with multipath cabling. The name switch happened around midnight. We believe it was the result of a controller failure, but we're not sure. The customer says there was no notice or alarm about it.
My question is simple: When does a disk path-switch occur in a single-node setup (not pair)?
Would it happen only on a controller failure, or can it happen for other reasons (load balancing, for example)?
This is usually not up to the controller. The OS, or rather the multipath driver in the OS, makes the call, either because a path becomes unresponsive or because a switch was triggered manually.
The reason for the path change might be somewhere in the OS logs, so you might want to take a look.
Is this ALUA?
It seems like crazysonar is saying the drive path name changed on the controller (via disk show, sysconfig, etc.), while Yann is talking about host-side multipathing/path failover. Yann's observation is correct in that the host multipath driver will switch primary paths for a number of reasons.
On the controller, I've seen the path change spontaneously for a spare/unowned drive, but I don't recall seeing it change on a drive in use - though to be honest I don't spend a lot of time watching the drive paths. I would expect that during a failure and reboot, paths could change - and I have seen a panic and recovery where none of the customers complained, so it's still a possibility. If this was the case, you'd see reboot messages in the controller logs.
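A minimal sketch of that log check, assuming the controller log has been exported to a plain-text file. The function name `find_log_hits` and the keyword list are hypothetical and only illustrate the idea; the actual message wording depends on the system, so adjust the keywords to what your logs contain.

```python
from typing import Iterable, List, Tuple

# Assumed keywords, not taken from any real log format; adjust as needed.
DEFAULT_KEYWORDS = ("reboot", "panic", "path")


def find_log_hits(
    lines: Iterable[str],
    keywords: Iterable[str] = DEFAULT_KEYWORDS,
) -> List[Tuple[int, str]]:
    """Return (line_number, line) for every line containing any keyword.

    Matching is a case-insensitive substring test. Line numbers start at 1.
    """
    lowered = tuple(k.lower() for k in keywords)
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.lower()
        if any(k in text for k in lowered):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it, open the exported log file and pass the file object (or a list of lines) to `find_log_hits`, then look at the hits that fall around the time the disk names changed.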
Thanks for the replies!
The issue is indeed on the device, not the host side. The customer representative claims there were no alarms or email notifications for this disk name change (stemming from a path change). We didn't investigate the filer logs, though. Are you suggesting there would be reboot messages that could indicate a controller failure inside the node just prior to the auto-reboot?
Could this disk path change occur from something else, like load-balancing, or anything other than controller failure?
Thank you, this really helps.