vSphere In Motion: A Real-World Live Migration Scenario

Motivation

I was recently talking with one of the large enterprises here in Qatar, and I was quite surprised to learn that they were hesitant to migrate their VI 3.5 environment to vSphere because of the associated downtime. What surprised me was not the fact that they can’t afford downtime; I’ve spent 6 years of my career working in the Telecom sector and I know for a fact that 1 second of downtime can mean a disaster, or translate into a loss of thousands of dollars. What surprised me was that they didn’t know it is possible to do this migration without any downtime!

In this blog post, I will not only show you (and them) how I was able to perform my upgrade without a single second of downtime, but also how we were able to migrate our storage from one array to another without any service interruption whatsoever in our equally critical environment. To make things even more exciting, everything I’m about to show you here is achievable using vSphere’s built-in features: VMware Converter, EVC, vMotion and Storage vMotion. No third-party tools were used in this entire migration.

A brief environment overview

There is nothing better than a diagram for easier follow-up. The diagram below illustrates a small portion of the environment, showing the main components of the old ESX 3.5 hosts as well as the new ESX 4.0 hosts. In our case, we decided not to go with an in-place upgrade, and preferred a fresh install for the ESX hosts in the new vSphere environment.

You might have noticed that I included a video inside the diagram, and are probably wondering why on earth someone would do something like that. The answer is simple: I’m showing off! No, seriously: I know many people (from VMware and specific storage vendors) who use my diagrams in their internal meetings with customers (really, I’m not showing off), and I thought it would be nice to have a small clip in the diagram that shows the easy point-and-click approach of both vMotion and Storage vMotion.

Note: This is just an illustration, not an S/vMotion architecture diagram! Wait for my A3 if you are interested in seeing the technology behind this…magic!

The Process

Step 1: We are running vCenter on a physical server here, and we want to reuse the same hardware for the new installation. The easiest way to achieve that is to P2V the existing vCenter 2.5 server with VMware Converter onto another standalone ESX host in our environment. After the VM is migrated successfully and all the clean-up is done, the switchover from physical to virtual can happen in a matter of seconds: disconnect the physical server from the network, and connect the VM (which has the same IP address, of course) to the same subnet.
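
If you want some quick reassurance during that switchover, a trivial polling script is enough. Here is a minimal Python sketch (the IP address is a placeholder for your own vCenter, and I’m assuming it answers on the standard HTTPS port 443):

    import socket
    import time

    VCENTER_IP = "10.0.0.10"  # placeholder: the IP shared by the physical box and the VM
    PORT = 443                # vCenter's HTTPS port

    def wait_for_service(ip, port, timeout=120):
        """Poll the address until something accepts a TCP connection, or give up."""
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                with socket.create_connection((ip, port), timeout=5):
                    return True
            except OSError:
                time.sleep(2)  # not up yet; the VM may still be taking over the IP
        return False

    if __name__ == "__main__":
        up = wait_for_service(VCENTER_IP, PORT)
        print("vCenter is answering again" if up else "Still down - check the cutover")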

Step 2: Now that vCenter 2.5 is migrated, the next step is to perform a clean install on the freed-up physical server, starting with the OS deployment and going all the way to the vCenter 4.0 installation, initial configuration and licensing.

Step 3: The third step is to connect the new vCenter 4.0 to the old vCenter 2.5 license server. This part is important because the ESX 3.5 hosts cannot use the new and improved licensing model that was introduced in the 4.0 release. This step is quite easy: go to the “Administration” menu in your vSphere Client, select “vCenter Server Settings”, and then enter your old vCenter 2.5 hostname into the field as shown in the example below.

Step 4: Now we are ready to create a new cluster for the existing ESX 3.5 hosts on the left side of the diagram. The thing to note here is to create the cluster with EVC mode enabled, as shown below, because we will be migrating the VMs between two different hardware/CPU generations:
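
For those who prefer scripting to clicking, the cluster creation itself can be driven through the vCenter API. Below is a rough sketch using VMware’s Python SDK (pyVmomi); the hostname, credentials and cluster name are placeholders, and note that we enabled EVC itself through the cluster settings in the vSphere Client (scripting the EVC mode only became convenient in later API releases):

    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    # Placeholders: point these at your own vCenter.
    si = SmartConnect(host="vc40.example.local", user="administrator", pwd="***",
                      sslContext=ssl._create_unverified_context())  # lab use only
    content = si.RetrieveContent()
    datacenter = content.rootFolder.childEntity[0]  # assuming a single datacenter

    # Enable DRS in the cluster spec; EVC itself we tick in the client wizard.
    spec = vim.cluster.ConfigSpecEx(
        drsConfig=vim.cluster.DrsConfigInfo(enabled=True))
    cluster = datacenter.hostFolder.CreateClusterEx(name="ESX35-Cluster", spec=spec)
    print("Created cluster:", cluster.name)
    Disconnect(si)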

Step 5: Here we create a second cluster (EVC enabled as well) and add the new ESX 4.0 hosts to it, as shown on the right side of the diagram.

Step 6: Now, the trick here is to have one ESX 4.0 host in this cluster connected to both arrays in the environment: the EVA and the V-Max. We achieve that by connecting one HBA to the HP SAN fabric, and the second HBA to the EMC SAN fabric. Once this is done, and all the associated zoning and masking is configured, we can rescan the HBAs and have all the datastores/LUNs available on this server, which we will call the “gateway”.
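
The rescan itself is a couple of clicks per host in the client, but it can be scripted just as easily. A pyVmomi sketch (the vCenter and gateway hostnames are placeholders):

    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    si = SmartConnect(host="vc40.example.local", user="administrator", pwd="***",
                      sslContext=ssl._create_unverified_context())  # lab use only
    content = si.RetrieveContent()

    # Find the gateway host in the inventory.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    gateway = next(h for h in view.view if h.name == "gateway.example.local")
    view.Destroy()

    storage = gateway.configManager.storageSystem
    storage.RescanAllHba()  # pick up the newly zoned/masked LUNs on both fabrics
    storage.RescanVmfs()    # discover the VMFS datastores sitting on them

    for ds in gateway.datastore:  # both EVA and V-Max datastores should show up
        print(ds.name)
    Disconnect(si)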

Step 7: The fun begins. Since the gateway server shares the same storage with the ESX 3.5 hosts, all you need to do here is drag and drop your VMs from the old cluster to the new one. vMotion will kick in and do its magic to live-migrate the VMs to the new gateway server. That’s right! We are live-migrating virtual machines from ESX 3.5 to ESX 4.0 on the fly.
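
Under the hood, that drag and drop is simply a migrate task against the vCenter API. Here is what the same move looks like scripted with pyVmomi (a sketch only; the VM and host names are placeholders):

    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVim.task import WaitForTask
    from pyVmomi import vim

    si = SmartConnect(host="vc40.example.local", user="administrator", pwd="***",
                      sslContext=ssl._create_unverified_context())  # lab use only
    content = si.RetrieveContent()

    def find(vimtype, name):
        """Look up an inventory object by name."""
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vimtype], True)
        try:
            return next(o for o in view.view if o.name == name)
        finally:
            view.Destroy()

    vm = find(vim.VirtualMachine, "dev-vm-042")
    gateway = find(vim.HostSystem, "gateway.example.local")

    # Live-migrate the running VM onto the ESX 4.0 gateway host; EVC keeps the
    # CPU feature sets compatible, so the guest never notices the move.
    WaitForTask(vm.MigrateVM_Task(
        host=gateway, priority=vim.VirtualMachine.MovePriority.defaultPriority))
    Disconnect(si)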

Step 8: Now to my favorite part of the whole migration process. Here we get to experience one of the most amazing features in vSphere: Storage vMotion. It has actually been rewritten with significant performance improvements that made it, in my opinion, one of the most powerful tools for any VMware administrator, and the best part is that it’s now done with a few mouse clicks through the GUI (check out the diagram video, or this detailed post). As I mentioned above, we were migrating our workloads from the HP EVA to the EMC V-Max, and we felt quite confident (after intensively testing this in the lab for a week) that Storage vMotion would be the best choice for our storage migration.

The other reason for using Storage vMotion was the ability to thin-provision VMs on the fly. I’m not talking about thin-provisioning everything, of course, but rather the development VMs that are hardly ever touched. We had many VMs for our development department with quite huge space requirements, while in fact they are neither actively used all the time, nor do they consume the disk space allocated to them. Thin-provisioning these VMs literally saved us terabytes of storage on the new, expensive V-Max SAN.
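
Both the storage move and the thin conversion happen in a single relocate task. A pyVmomi sketch of what that looks like (the VM and datastore names are placeholders; the “sparse” transform is the vSphere 4-era way of asking for thin disks during the move):

    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVim.task import WaitForTask
    from pyVmomi import vim

    si = SmartConnect(host="vc40.example.local", user="administrator", pwd="***",
                      sslContext=ssl._create_unverified_context())  # lab use only
    content = si.RetrieveContent()

    def find(vimtype, name):
        """Look up an inventory object by name."""
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vimtype], True)
        try:
            return next(o for o in view.view if o.name == name)
        finally:
            view.Destroy()

    vm = find(vim.VirtualMachine, "dev-vm-042")
    vmax_ds = find(vim.Datastore, "VMAX-DEV-01")

    spec = vim.vm.RelocateSpec()
    spec.datastore = vmax_ds
    # Convert the disks to thin ("sparse") format while they move:
    spec.transform = vim.vm.RelocateSpec.Transformation.sparse

    WaitForTask(vm.RelocateVM_Task(spec=spec))  # the VM keeps running throughout
    Disconnect(si)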

Things to note:

  • After you complete this migration you are not quite done yet. You should update the VMware Tools, and also upgrade the VM hardware from v4 to v7. While you will still run fine without these upgrades, it’s always recommended to be up to date in that regard, and to leverage many of the new vSphere features, like memory hot-add (my personal favorite!). The catch here is that you will need a VM reboot to perform that. In our case, for the less critical VMs we scheduled planned reboots on a weekly basis for the upgrades, and for the highly critical VMs, we just waited for the first possible OS reboot and performed our upgrades along with it (see the sketch right after these notes).
  • Any storage vendor will tell you to do the thin-provisioning on the array directly, and I kinda agree with them on that, but this is not an option for everyone. Not all arrays come with this feature, and even if they do, not everyone can afford the licensing part. In our case, I simply couldn’t rely on the SAN admins to monitor and maintain these thin-provisioned LUNs on the array side, and on the other hand, there were some technical limitations associated with that in terms of SRDF replication and FAST v1 (but that is specific to EMC, and relevant only at the time of writing this post).
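
And for the upgrades mentioned in the first note, even that clean-up can be scripted into the weekly maintenance windows. A rough pyVmomi sketch of the Tools and virtual hardware upgrades (the VM name is a placeholder, and “vmx-07” is the version string for hardware v7; try it on a non-critical VM first):

    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVim.task import WaitForTask
    from pyVmomi import vim

    si = SmartConnect(host="vc40.example.local", user="administrator", pwd="***",
                      sslContext=ssl._create_unverified_context())  # lab use only
    content = si.RetrieveContent()

    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == "dev-vm-042")
    view.Destroy()

    # Tools first; this kicks off the in-guest installer (often with a reboot).
    WaitForTask(vm.UpgradeTools_Task())

    # The hardware upgrade needs the VM powered off, hence our maintenance
    # windows. (A guest OS shutdown would be gentler than a hard power-off.)
    WaitForTask(vm.PowerOffVM_Task())
    WaitForTask(vm.UpgradeVM_Task(version="vmx-07"))
    WaitForTask(vm.PowerOnVM_Task())
    Disconnect(si)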

Conclusion

I will finish this post where I started. VMware vSphere is a very powerful, true enterprise-class virtualization platform. You’ve seen here how I was able to migrate an entire VI 3.5 environment without a single second of downtime, and how easy it was to migrate our complete storage from one array vendor to another without any interruption to the servers or services whatsoever. There is nothing extraordinary in this scenario (except maybe the embedded video in the diagram); you’ve seen how easy the steps are, and how everything we’ve done here is built into vSphere itself. Just know your requirements, plan your migration ahead, and you will be just fine!
