Friday, December 16, 2011

The ESXi 4.x vmk0 "feature"

The ESXi 4.x vmk0 bit me once again. I should have known better; I had almost forgotten about this. In this case, I wanted to change my vMotion VLAN and IP address. I failed to notice that vMotion was using vmk0. I changed the VLAN and IP, and voila... everything went grey. I lost access to my ESXi host and could only get to it via a remote console.

I then got hold of VMware tech support, and with them on the line managed to also blow away all my NFS storage... Nice. They quickly got things back to normal... more or less. The DCUI still showed my mgmt network as being one of my NAS storage IP addresses. I discovered a little-known command that fixed this, but it unfixes itself at the next reboot... Alas, after consulting again with VMware, my only hope was to rebuild the server.

Fortunately, this ESXi host was in maintenance mode, so no VMs were impacted by any of this. Just a couple of long days for me.

So, I asked myself, how did doing something as simple as changing a vMotion IP address cause all these problems?

The answer is that there is a part of the ESXi 4 hypervisor that believes vmk0 is the mgmt network. Even if you tell it otherwise, it does not believe you. If vmk0 gets assigned to anything other than the mgmt network, you are in trouble.

But it is even worse than that. You might think you can fix this by deleting vmk0 and the mgmt network and re-creating the mgmt network as vmk0. That sounds like a good plan, but you would be wrong. As soon as you delete vmk0, ESXi looks for the next vmk and is convinced that it is the mgmt network, even though something else is using it. In my case, it happened upon one of my NAS IP addresses.
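The fallback described above can be modelled in a few lines of Python. This is only a toy model of the behaviour described in this post, not VMware code: the function name `assumed_mgmt_vmk` and the port group labels are made up for illustration, and "the next vmk" is assumed to mean the lowest-numbered interface that remains.

```python
import re


def assumed_mgmt_vmk(vmks):
    """Return the vmk that the host is assumed to treat as management.

    Toy model: the lowest-numbered vmk wins, no matter what it is used for.
    `vmks` maps interface names (e.g. "vmk0") to a port group label.
    """
    def index(name):
        match = re.fullmatch(r"vmk(\d+)", name)
        if match is None:
            raise ValueError(f"not a vmk interface name: {name!r}")
        return int(match.group(1))

    if not vmks:
        return None
    # Compare numerically so that vmk10 sorts after vmk2.
    return min(vmks, key=index)


# Hypothetical layout similar to the one in this post.
vmks = {"vmk0": "vMotion", "vmk1": "NAS", "vmk2": "Management Network"}
before = assumed_mgmt_vmk(vmks)   # vmk0, which is really vMotion

del vmks["vmk0"]                  # "fix" it by deleting vmk0
after = assumed_mgmt_vmk(vmks)    # now vmk1, which is really NAS
```

In this model, deleting vmk0 never helps on its own: the assumption simply moves to whatever interface is next in line.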

How might this happen? You may need to re-create your mgmt network. If you do, you might as well resign yourself to re-creating the server. Or, you might be trying to get a new server to comply with a host profile. If you were unlucky enough to build the host profile from an ESX server, guess what? On ESX the mgmt network lives on the service console, so vmk0 is not used for it and is assigned to something else, like vMotion or NAS storage or... You will not be able to get your ESXi host to comply with the profile. In desperation, you apply the profile, which fixes the compliance problem. Unfortunately, it also assigns vmk0 to something other than your mgmt network. Your only recourse at this point is, you guessed it... to rebuild your ESXi host.

The moral of the story: if you have an ESXi host, do not change the IP address of anything that is using vmk0 unless that is your mgmt network. Compare your vSphere client against your DCUI to make sure they both agree, and if you can, check your hosts file to make sure it is in agreement as well. If they all agree, you are golden! Otherwise, think long and hard about changing that IP address! And never, ever apply a host profile to an ESXi server. It is too easy to mess this one up. DO use a host profile to check your compliance, but do not apply it.
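The checklist above can be written down as a small pre-flight check. This is a minimal sketch, not a VMware tool: it does not query a host, and the function name `check_before_change`, its parameters, and the example addresses are all hypothetical. You would read the values off the vSphere client, the DCUI, and the hosts file by hand and feed them in.

```python
def check_before_change(target, vmk_ips, vsphere_ip, dcui_ip, hosts_ip):
    """Return a list of reasons NOT to change the IP of `target`.

    target     -- the vmk you intend to change, e.g. "vmk0"
    vmk_ips    -- mapping of vmk name to its current IP address
    vsphere_ip -- mgmt IP as shown in the vSphere client
    dcui_ip    -- mgmt IP as shown in the DCUI
    hosts_ip   -- IP listed for this host in its hosts file
    An empty list means none of the checks below found a problem.
    """
    problems = []

    # All three views of the mgmt network should agree.
    if len({vsphere_ip, dcui_ip, hosts_ip}) != 1:
        problems.append(
            "mgmt IP differs between vSphere client, DCUI and hosts file"
        )

    # Touching vmk0 is only acceptable when vmk0 really is the mgmt network.
    if target == "vmk0" and vmk_ips.get("vmk0") != dcui_ip:
        problems.append("vmk0 is not the mgmt interface; do not change it")

    return problems


# Hypothetical host laid out like the one in this post: vMotion on vmk0.
vmk_ips = {"vmk0": "10.0.2.5", "vmk1": "10.0.1.5"}
mgmt_ip = "10.0.1.5"

risky = check_before_change("vmk0", vmk_ips, mgmt_ip, mgmt_ip, mgmt_ip)
fine = check_before_change("vmk1", vmk_ips, mgmt_ip, mgmt_ip, mgmt_ip)
```

Here `risky` comes back non-empty because vmk0 carries vMotion rather than the mgmt network, which is exactly the situation that started this whole mess.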
