The response to my previous post has been unreal! The number of tweets, pingbacks, hits, links and emails was quite amazing. I have to admit, I didn’t expect to see that much interest in the subject, so thanks to everyone who helped promote this idea on Twitter and in the blogosphere.
In the second part of this series we will spice things up a bit and use the following video to explore many aspects of the idea we’ve talked about. After the video come some important screenshots and some considerations you should be aware of before and after implementing this in your lab.
What you’ll see in this video:
1- Deploy a thin-provisioned vESX(i) VM from a template.
2- Check the required configuration parameter on the vESX to run nested VMs.
3- Customize the new vESX server (assign a password, set a static IP, enter the DNS config, etc.).
4- Add the vESX to an existing HA Cluster.
5- Test vMotion within the same cluster and across different clusters in the datacenter.
6- Add the required configuration parameter on the nested VM to enable FT.
7- Enable FT and test the failover across different vESX servers.
A quick note on the hardware used:
Technically speaking, a single server is enough for this whole setup. What is different in my setup (as you saw in the diagram) is that I use an external iSCSI array (the CLARiiON AX4) to host the vESX VMs. I just needed the flexibility of having them on external storage so that I can share them with other servers later on, but it is not a requirement. You can simply use the internal storage of the pESX server to host your vESX VMs and you will still have everything you see in the video. As far as the shared storage for the vESX servers is concerned, you can use the Celerra VSA. In my case I use the Celerra only for the SRM labs, to do the replication trick. Other than that, I use OpenFiler as the shared storage for the nested VMs.
Two things to be set on the pESX hosts:
1 – Increase the number of ports on your pESX vSwitch to accommodate the increased number of connections required by your vESX VMs.
2 – Enable “Promiscuous Mode” on the vSwitch.
The configuration parameters on the Virtual Machines are as follows:
1 – The virtual ESX (vESX) host: monitor_control.restrict_backdoor = TRUE
2 – The nested VM: replay.allowBTOnly = TRUE
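If you prefer to script these two settings rather than type them in, here is a minimal Python sketch of the idea. It only works on the text of a .vmx file and assumes the usual `key = "value"` layout with quoted values; the helper names (`set_vmx_option`, `apply_options`) and the sample display name are mine and purely illustrative. Edit the file only while the VM is powered off, and keep a backup.

```python
# Sketch only: updates the text of a .vmx file in memory.
# Assumes the common `key = "value"` line format; helper names are hypothetical.

# The two parameters listed above, one per VM type.
VESX_OPTIONS = {"monitor_control.restrict_backdoor": "TRUE"}
NESTED_VM_OPTIONS = {"replay.allowBTOnly": "TRUE"}


def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with `key` set to `value`, replacing an existing entry."""
    new_line = f'{key} = "{value}"'
    out = []
    replaced = False
    for line in vmx_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if "=" in line and name == key:
            if not replaced:
                out.append(new_line)
                replaced = True
            continue  # skip any duplicate entries for the same key
        out.append(line)
    if not replaced:
        out.append(new_line)  # key was not present, so append it
    return "\n".join(out) + "\n"


def apply_options(vmx_text, options):
    """Apply every key/value pair in `options` to the .vmx text."""
    for key, value in options.items():
        vmx_text = set_vmx_option(vmx_text, key, value)
    return vmx_text


if __name__ == "__main__":
    sample = 'displayName = "vESX01"\n'  # made-up example content
    print(apply_options(sample, VESX_OPTIONS), end="")
```

Reading the real file, writing it back and reloading the VM are left out on purpose, since how you reach the datastore depends on your lab.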
The iSCSI vSwitch on the pESX host, bound to vmnic1 (physical NIC 2) and connected to the EMC CLARiiON AX4 iSCSI array
The vMotion internal vSwitch on the pESX host
The Fault Tolerance internal vSwitch on the pESX host
The thin-provisioned ESX(i) size on disk (475MB) + the memory swap:
The thin-provisioned ESX(i) size on disk (3.5GB) + the memory swap:
Other considerations and GOTCHAs:
1 – When enabling FT, make sure your VMs are powered off, even if they have eager-zeroed disks. I used to get errors when the VMs were powered on while enabling FT.
2 – The network card order and numbering can be confusing. For example, in the vESX VM settings the NICs are numbered from 1 to 10, but in the vESX network configuration tab the NICs start at 0 and count up to 9. This gets confusing when you map your vNICs to the pESX host vSwitches, such as the VMotion internal switch, the FT internal switch and so forth. Just make sure you count the NIC order accurately.
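To keep the off-by-one straight, a tiny Python helper like the one below can do the counting for you. It is only a sketch: it assumes the adapter order is preserved, so that adapter N in the VM settings shows up as vmnic(N-1) inside the vESX, and the function names are made up for this example.

```python
# Sketch only: converts between the 1-based adapter number shown in the
# vESX VM settings and the 0-based vmnic name shown inside the vESX.
# Assumes the adapter order is preserved; function names are hypothetical.


def vmnic_for_adapter(adapter_number):
    """Adapter N in the VM settings -> 'vmnic<N-1>' inside the vESX."""
    if adapter_number < 1:
        raise ValueError("adapter numbers start at 1")
    return f"vmnic{adapter_number - 1}"


def adapter_for_vmnic(vmnic_name):
    """'vmnic<K>' inside the vESX -> adapter number K+1 in the VM settings."""
    prefix = "vmnic"
    index = vmnic_name[len(prefix):]
    if not vmnic_name.startswith(prefix) or not index.isdigit():
        raise ValueError(f"not a vmnic name: {vmnic_name!r}")
    return int(index) + 1


if __name__ == "__main__":
    # Print the full mapping for a vESX with ten adapters.
    for number in range(1, 11):
        print(f"Network adapter {number} -> {vmnic_for_adapter(number)}")
```

Printing the table once and keeping it next to your vSwitch diagram is usually enough to avoid wiring the VMotion or FT adapter to the wrong switch.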
3 – Deploying vESX from templates may be cool and fast, but it can make troubleshooting network-related issues a bit challenging. The reason is that all the network cards will have the same MAC addresses, or worse, port groups such as VMotion could keep the same MAC address even if you completely remove the vNICs and create brand new ones. The workaround is to create the vESX template and then remove all the NICs from it. When you deploy a new vESX, you can just add new NICs as you like, and that way you’ll get new MAC addresses. Besides that, you may need to add a new VMkernel network for VMotion and then remove the old one. Of course, you may be thinking that deploying a brand new vESX would be easier, and you are right, but with scripting everything could be automated. I will try to write a PS script to automate these network changes and settings and post it here later on.
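Until that script exists, a quick way to spot the problem is to list the MAC addresses you see on each vESX and look for repeats. The Python sketch below does only that comparison. It is not the PS script mentioned above; the function names and the sample inventory are invented for illustration, and collecting the real addresses from your hosts is up to you.

```python
# Sketch only: finds MAC addresses that appear on more than one interface.
# Function names and the sample data are hypothetical.
from collections import defaultdict


def normalize_mac(mac):
    """Return the MAC as lowercase, colon-separated hex (aa:bb:cc:dd:ee:ff)."""
    digits = mac.replace(":", "").replace("-", "").replace(".", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def find_duplicate_macs(inventory):
    """inventory: iterable of (owner, mac) pairs.

    Returns {mac: [owners...]} for every MAC used by more than one owner.
    """
    owners_by_mac = defaultdict(list)
    for owner, mac in inventory:
        owners_by_mac[normalize_mac(mac)].append(owner)
    return {mac: owners for mac, owners in owners_by_mac.items() if len(owners) > 1}


if __name__ == "__main__":
    # Made-up inventory: two vESX hosts cloned from the same template.
    sample = [
        ("vESX01/vmnic0", "00:50:56:AA:BB:01"),
        ("vESX02/vmnic0", "00-50-56-aa-bb-01"),
        ("vESX02/vmnic1", "00:50:56:aa:bb:02"),
    ]
    for mac, owners in find_duplicate_macs(sample).items():
        print(mac, "is shared by", ", ".join(owners))
```

Any address reported here is a candidate for the remove-and-re-add workaround described above.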