Posts Tagged ‘VMware’

Diagram: VMware vCloud Director Networking Architecture

If we are gonna perform Inception then we need imagination. An elegant solution for keeping track of reality – The “Inception” Movie.

 

Before I introduce this new diagram to you, I would like to make a bold statement: no matter how complex this diagram looks to you at first glance, I can tell you that you’ve been practicing all its core technical concepts for quite a long time. You will just need a bit of imagination, and I guarantee you that everything will make perfect sense faster than you can imagine. Read on…

Let’s go back to the very first VMware product that changed the way we think in our IT industry – Workstation! When you create a new virtual machine in Workstation, you get four options for networking:

  • Bridged connection – a pass-through network to the outside world.
  • NAT’ed connection – a connection that goes through network address translation.
  • Host-only connection – a private network isolated from the outside world.
  • None – no network at all.

Guess what? Those are the core technical concepts I’ve been talking about. Here are the new names that we will be using from now on when we refer to vCD (in the same order):

  • Direct connection
  • Routed connection
  • Isolated (private)
  • None (no network)
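
For readers who think better in code, the mapping above can be written down as a tiny lookup table. This is only an illustrative sketch: the dictionary and function names are made up for this example and are not part of any VMware API.

```python
# Illustrative lookup only; these names are hypothetical, not a VMware API.
WORKSTATION_TO_VCD = {
    "bridged": "direct",
    "nat": "routed",
    "host-only": "isolated",
    "none": "none",
}


def vcd_connection_type(workstation_mode: str) -> str:
    """Return the vCD name for a Workstation networking mode."""
    key = workstation_mode.strip().lower()
    if key not in WORKSTATION_TO_VCD:
        raise ValueError(f"unknown Workstation networking mode: {workstation_mode!r}")
    return WORKSTATION_TO_VCD[key]


if __name__ == "__main__":
    for mode in WORKSTATION_TO_VCD:
        print(f"{mode:10} -> {vcd_connection_type(mode)}")
```

The order of the entries follows the two lists above, so the first Workstation option lines up with the first vCD name, and so on.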

Now, have you seen the “Inception” movie? (If you haven’t, you missed one of the greatest movies of this decade!) Do you remember the layers of dreams in that movie? Well, that is somewhat like what we have here. Imagine your virtual machines running in different layers of dream networks, and depending on which layer you are looking at, a network might be direct, routed or isolated. Let’s take a closer look:

  • First Layer: the real world – this is the actual physical network which we are in most cases not concerned about.
  • Second Layer: the vNetwork Standard Switch, Distributed Switch or even Cisco Nexus 1000V.
  • Third Layer: the External network – this is sort of your gateway to the outer world.
  • Fourth Layer: the Organization network – this is sort of the gatekeeper for your VMs. It always shows you where your logical boundaries are.
  • Fifth and last layer: the vApp Network – this is the ultimate end your VM can reach (think LIMBO!)

Now that you have these basic concepts in mind, let’s see what we have in this diagram:

  • This is an A2 size diagram. I really tried my best to keep it at the A3 scale, but it’s just not possible with this much information in one place.
  • The diagram covers nearly all the networking options of vCD, but from a “Private Cloud” perspective. In the world of Public Clouds this might be laid out a bit differently (which I will do in the future), but the core concepts remain exactly the same.
  • The diagram comes with some text describing the various components and elements. I’m introducing this for the first time here to help you understand what you are looking at instantly without taking your focus away from the diagram.
  • You will see different PDF layers in the diagram, which you can hide or show as you need. For example, when you are taking a closer look at a specific area of the diagram, you might find the descriptions useful, while they might be a bit distracting if you are zooming out for a holistic view of the diagram.
  • You will see the actual screens of the vCenter networking – the vSS, vDS and the different port groups. Not just that, you will actually see what the VMs in your cloud ultimately look like in vCenter. Add to that all the other components like the External/Organization networks as well as the vShield Edge devices. Of course, in most cases I’m showing just examples of everything to avoid complexity.
  • I’ve also included the screens of vCloud Director to show you what the Network Pools look like, along with the other panels of the External and Organization networks.
  • The IP addresses can play a very important role in your understanding of how all these vApps communicate with each other. For example, when you see two vApps sharing the same OrgNetwork and still having the exact same IP addressing, it automatically means that they are routed through an edge device.
  • I included three connectivity examples for the outside world of your private cloud: a production cloud, an Internet cloud and an MPLS cloud. Please note that these are just examples, not the only options you can have. This is something that can be very specific from one customer use case to another.
  • Last but not least, the vApp networks are laid out like that to fit the best view in the diagram. This is not an attempt to tell you how you should run your vApps but rather show you the different options you have. Again, this is something that is very specific to the customer use cases and requirements.
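
The IP addressing hint in the list above (two vApps on the same OrgNetwork with identical addressing must be routed through an edge device) can be turned into a quick check. The following is a minimal sketch using Python’s standard `ipaddress` module. The inventory, the network names and the subnets are all invented for illustration; nothing here talks to vCD.

```python
import ipaddress
from itertools import combinations

# Hypothetical inventory: vApp network name -> (parent org network, subnet).
VAPP_NETWORKS = {
    "vapp-a-net": ("org-net-1", "192.168.10.0/24"),
    "vapp-b-net": ("org-net-1", "192.168.10.0/24"),
    "vapp-c-net": ("org-net-1", "192.168.20.0/24"),
}


def overlapping_pairs(networks):
    """Return pairs of vApp networks that share an org network and overlap in IP space."""
    pairs = []
    for (name_a, (org_a, cidr_a)), (name_b, (org_b, cidr_b)) in combinations(
        sorted(networks.items()), 2
    ):
        if org_a != org_b:
            continue  # different org networks: not comparable in this sketch
        if ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b)):
            pairs.append((name_a, name_b))
    return pairs


if __name__ == "__main__":
    for a, b in overlapping_pairs(VAPP_NETWORKS):
        print(f"{a} and {b} overlap, so they must be routed, not directly connected")
```

Any pair this function reports could not coexist on the same OrgNetwork as a direct connection, which is exactly the visual clue to look for in the diagram.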

In future networking posts on vCD I will go deeper into the discussion and reference the examples shown in this diagram all the way through. I encourage you to print out this diagram, keep it somewhere near your home/office desk and glance through it from time to time. There is nothing better than visualizing something as complex and rich as vCD networking. I also highly recommend checking out Duncan Epping’s article on vCD networking; it is a must-read for all vCD newbies.

One more thing. I’d like to give some credit to my colleague at VMware, Massimo Re Ferre’, for showing me the way to understand this great networking topic. Massimo, along with Eddie Dinel, Mike D and Vishal Kumar, gave one of the most interesting presentations I’ve attended on vCD when it was still in Beta. I believe parts of this great presentation have been divided into more than one session at VMworld 2010, so I urge you to go and have a look at the recordings when the sessions are available online.

UPDATE (14-09-2010): The networking part of the presentation I mentioned above has been rewritten by the master at this link. Another MUST-READ.

vSphere In Motion: A Real-World Live Migration Scenario

Motivation

I was having a discussion with one of the large enterprises here in Qatar lately, and I was quite surprised to learn that they were hesitant to migrate their VI3.5 environment to vSphere because of the associated downtime. What surprised me was not the fact that they can’t afford downtime; I’ve spent 6 years of my career working in the Telecom sector and I know for a fact that 1 second of downtime could mean a disaster, or even translate to a loss of thousands of dollars. What surprised me was that they didn’t know that it is possible to do this migration without any downtime!

In this blog post, I will not only show you (and them) how I was able to perform my upgrade without even a single second of downtime, but I will also show how we were able to migrate our storage from one array to another without any service interruption whatsoever in our equally critical environment. To make things even more exciting, what I’m about to show you here is completely achievable using vSphere’s built-in features like VMware Converter, EVC, vMotion and Storage vMotion. There were no third-party tools used in this entire migration.

A brief environment overview

There is nothing better than a diagram to make this easier to follow. In the diagram below I’m illustrating a small portion of the environment, showing the main components of the old ESX 3.5 hosts as well as the ESX 4.0 hosts. In our case, we decided not to go with an in-place upgrade and preferred a fresh install of the ESX hosts in the new vSphere environment.

You might have noticed that I included a video inside the diagram, and you are probably wondering why on earth someone would do something like that. The answer is simple: I’m showing off! No, seriously, I know many people (from VMware and specific storage vendors) who use my diagrams in their internal meetings with customers (really, I’m not showing off), and I thought it would be nice to have a small clip in the diagram that shows the easy point-and-click approach of both vMotion & SvMotion.

Note: This is just an illustration not an S/vMotion architecture diagram! Wait for my A3 if you are interested to see the technology behind this…magic!

The Process

Step 1: Here we are running vCenter on a physical server, and we want to reuse the same hardware for the new upgrade. The easiest way to achieve that is to P2V the existing vCenter 2.5 to another standalone ESX host in our environment. After the VM is migrated successfully and all the clean-up is done, the switchover from physical to virtual can happen in a matter of seconds by disconnecting the physical server from the network and connecting the VM (which has the same IP address, of course) to the same subnet.

Step 2: Now that we have the vCenter 2.5 migrated, the next step is to perform a clean install on the freed physical server, starting with the OS deployment and going all the way to the vCenter 4.0 installation, initial configuration and licensing.

Step 3: The third step is to connect the new vCenter 4.0 to the old vCenter 2.5 licensing server. This part is important because the ESX 3.5 hosts do not leverage the new and improved licensing model that was introduced in the 4.0 release. This step is quite easy: you go to the “Administration” menu on your vSphere client, select the “vCenter Server Settings”, and then enter your old vCenter 2.5 hostname into the field as shown in the example below.

Step 4: Now we are ready to create a new cluster for the existing ESX 3.5 hosts on the left side of the diagram. The thing to note here is to create the cluster with EVC mode enabled, as shown below, because we will be migrating the VMs between two different hardware/CPU generations:

Step 5: Here we create a second cluster (EVC enabled as well) and add the new ESX 4.0 hosts to it, as shown on the right side of the diagram.
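
Why EVC matters for Steps 4 and 5 can be shown with a toy model: a running VM can only move to a host that exposes every CPU feature the VM already sees, so both generations have to present a common baseline. The sketch below is a deliberate simplification. Real EVC uses predefined baselines per CPU vendor rather than an arbitrary intersection, and the host names and feature sets are made up for illustration.

```python
# Simplified model; real EVC uses predefined per-vendor baselines.
# Host names and feature sets below are hypothetical.
HOST_CPU_FEATURES = {
    "esx35-old-01": {"sse2", "sse3"},
    "esx40-new-01": {"sse2", "sse3", "ssse3", "sse4.1"},
}


def common_baseline(hosts):
    """Features that every host in the mapping supports."""
    feature_sets = list(hosts.values())
    if not feature_sets:
        return set()
    return set.intersection(*feature_sets)


def can_vmotion(vm_features, target_host_features):
    """A running VM can move only to a host exposing every feature it already sees."""
    return vm_features <= target_host_features


if __name__ == "__main__":
    baseline = common_baseline(HOST_CPU_FEATURES)
    print("baseline:", sorted(baseline))
    # A VM limited to the baseline can move in either direction.
    print(all(can_vmotion(baseline, f) for f in HOST_CPU_FEATURES.values()))
```

In this model a VM that was powered on with the full feature set of the newer host could not move back to the older one, which is the situation EVC is there to prevent.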

Step 6: Now, the trick here is to have one ESX 4.0 host in this cluster connected to both arrays in the environment – the EVA and the V-Max. We achieve that by connecting one HBA to the HP SAN fabric and the second HBA to the EMC SAN fabric. Once this is done, and all the associated zoning and masking is configured, we can scan the HBAs and have all the datastores/LUNs available on this server, which we will call the “Gateway”.

Step 7: The fun begins. Since the gateway server shares the same storage with the ESX 3.5 hosts, all you need to do here is drag and drop your VMs from the old cluster to the new one. vMotion will kick in and do its magic to live migrate the VMs to the new gateway server. That’s right! We are live migrating virtual machines from ESX 3.5 to ESX 4.0 on the fly.
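
The role of the gateway host in Steps 6 and 7 comes down to datastore visibility: a VM can be vMotioned only to a host that sees all the datastores it lives on. Here is a small sketch of that rule as a pure-Python model. The host and LUN names are hypothetical and the check ignores everything else vMotion validates (networking, CPU compatibility and so on).

```python
# Hypothetical inventory: which datastores each host can see.
HOST_DATASTORES = {
    "esx35-01": {"eva-lun-01", "eva-lun-02"},
    "gateway": {"eva-lun-01", "eva-lun-02", "vmax-lun-01"},
    "esx40-02": {"vmax-lun-01"},
}


def vmotion_targets(vm_datastores, source_host, hosts):
    """Hosts (other than the source) that can see every datastore the VM uses."""
    return sorted(
        name
        for name, visible in hosts.items()
        if name != source_host and vm_datastores <= visible
    )


if __name__ == "__main__":
    # A VM still sitting on the old array can only land on the gateway.
    print(vmotion_targets({"eva-lun-01"}, "esx35-01", HOST_DATASTORES))
```

With this made-up inventory, the gateway is the only valid target for a VM on the old array, because it is the only ESX 4.0 host zoned to both fabrics.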

Step 8: Now to my favorite part of the whole migration process. Here we get to experience one of the most amazing features in vSphere – Storage vMotion. It has actually been rewritten with significant performance improvements that make it, in my opinion, one of the most powerful tools for any VMware administrator, and the best part is that it’s now done with a few mouse clicks through the GUI (check out the diagram video, or this detailed post). As I mentioned above, we were migrating our workloads from the HP EVA to the EMC V-Max, and we felt quite confident (after intensively testing this in the lab for a week) that SvMotion would be the best choice for our storage migration. The other reason for using SvMotion was the ability to thin-provision VMs on the fly. I’m not talking here about everything, of course, but rather the development VMs that are hardly ever touched. We had many VMs for our development department with quite huge space requirements, while in fact they were neither actively used all the time nor consuming the disk space allocated to them. Thin-provisioning these VMs saved us literally TBs of storage on the new, expensive V-Max SAN.
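
To see where thin-provisioning savings come from, here is a small worked calculation. The figures are invented purely for illustration and are not measurements from this migration; the point is only that a thin disk consumes what has actually been written rather than what was provisioned.

```python
# All figures are made-up examples, not measurements from this migration.
DEV_VMS_GB = [
    # (provisioned GB, actually used GB)
    (500, 60),
    (750, 120),
    (1000, 200),
]


def thin_savings_gb(vms):
    """Space saved if each disk only consumes what is actually written."""
    provisioned = sum(p for p, _ in vms)
    used = sum(u for _, u in vms)
    return provisioned - used


if __name__ == "__main__":
    saved = thin_savings_gb(DEV_VMS_GB)
    print(f"thin provisioning would save {saved} GB in this example")
```

Multiply a gap like that across a whole development department and it is easy to see how the total reaches terabytes.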

Things to note:

  • After you complete this migration you are not quite done yet. You should typically update your VMware Tools and also upgrade the VM hardware from v4 to v7. While you will still run fine without these upgrades, it’s always recommended to be up to date in that regard, and also to leverage many of the new vSphere features such as memory hot-add (my personal favorite!). The catch here is that you will need a VM reboot to perform that. In our case, for the less critical VMs we scheduled planned reboots on a weekly basis for the upgrades, and for the highly critical VMs we just waited for the first possible OS reboot and performed our upgrades along with it.
  • Any storage vendor will tell you to do the thin-provisioning on the array directly, and I kinda agree with them on that, but this is not an option for everyone. Not all arrays come with this feature, and even if they do, not everyone can afford the licensing part. In our case, I simply couldn’t rely on the SAN admins for monitoring and maintaining these thin-provisioned LUNs on the array side, and on the other hand, there were some technical limitations associated with that in terms of SRDF replication or FAST v1 (but that is something specific to EMC, and relevant only at the time of writing this post).

Conclusion:

I will finish this post where I started. VMware vSphere is a very powerful and truly enterprise-class virtualization platform. You’ve seen here how I was able to migrate the entire VI3.5 environment without a single second of downtime, and also how easy it was to migrate our complete storage from one array vendor to another without any interruption to the servers/services whatsoever. There is nothing extraordinary in this scenario (except maybe the embedded video in the diagram), and you’ve seen how easy the steps are and how everything we’ve done here is built into vSphere itself. Just know your requirements, plan your migration ahead, and you will be just fine!

Diagram: VMware High-Availability (UPDATE: v1.2)

I updated the diagram (v1.2) to fix a small typo and also adjust a couple of shapes. Thanks to Joshua Liebster & Bert Bouwhuis for drawing my attention to this.

I know everybody skips to the diagram so I’ll save you the introduction, just make sure to quickly go through the notes that follow it:

  • This is not an introduction to VMware HA, and it’s not a very advanced diagram for it either. I assume here that you have a general idea of the topic before looking into it, so that you can appreciate this incredible technology. If you are a VMware professional, you may also find this useful for keeping your knowledge of the topic sharp and at hand at any given time. You really don’t have to re-read the documentation every time you’d like to remember a small detail about the subject.
  • I’m introducing in this diagram the “Layers” feature in Visio for the first time. The diagram may look somewhat confusing at first glance, so I thought that it might be a good idea to use these layers so you can hide or show the topics that you are going through in the diagram. I can see some other use cases for the Layers in future diagrams, so I hope you will like it.
  • This is an A3 diagram, sorry I know most of you just love the traditional A4 from the feedback I get, but seriously, it’s just TMI to fit in A4.
  • Everything you see in this diagram, and specifically for the admission control, is *not* fictitious. This is a real cluster I built specifically before designing this diagram. I wanted everything to be 100% accurate and more importantly: realistic. If you zoom into the middle of the vCenter shape, you will be able to see the actual screenshot of the vCenter interface showing the HA cluster I used, and its runtime information window as well.
  • It’s worth mentioning that these are not all the “advanced options” that you can use for VMware HA. I just selected the ones I thought might be more frequently used. You can always go back to the official VMware documentation for the complete list.
  • The Admission Control was probably the hardest part, not just to visualize but also to understand in the first place! That being said, I do not expect anyone with no prior reading on this specific topic to just get it at first glance when looking at the diagram. Duncan Epping has an excellent article that I think everyone already knows about, but it’s worth mentioning that it’s the best place you will ever find for VMware HA in general. The diagram should still help you understand it faster and more easily. You can see all the numbers/calculations in front of you in one shot, and how all these numbers are related to each other.
  • This HA lab was built in nearly 5 minutes and is 100% virtual. Long live Lab Manager 4.0! (more details here)
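
Since the admission control numbers are the hardest part of the diagram, here is a rough sketch of the slot-based arithmetic behind them. It is a simplified model, not the exact HA algorithm: it assumes every VM listed is powered on, that one VM needs one slot, and that a VM with no CPU reservation counts as 256 MHz (an assumed default here; check the documentation for your version). The host and VM figures are invented.

```python
def slot_size(vms, default_cpu_mhz=256):
    """Slot = largest CPU reservation and largest (memory reservation + overhead)."""
    cpu = max((vm["cpu_res_mhz"] or default_cpu_mhz) for vm in vms)
    mem = max(vm["mem_res_mb"] + vm["mem_overhead_mb"] for vm in vms)
    return cpu, mem


def slots_per_host(host, slot):
    """A host offers as many slots as its scarcer resource allows."""
    cpu_slot, mem_slot = slot
    return min(host["cpu_mhz"] // cpu_slot, host["mem_mb"] // mem_slot)


def host_failures_tolerated(hosts, vms, default_cpu_mhz=256):
    """How many hosts can fail while the rest still hold one slot per VM."""
    slot = slot_size(vms, default_cpu_mhz)
    per_host = sorted((slots_per_host(h, slot) for h in hosts), reverse=True)
    needed = len(vms)
    tolerated = 0
    # Worst case: the hosts contributing the most slots fail first.
    for failed in range(1, len(per_host)):
        if sum(per_host[failed:]) >= needed:
            tolerated = failed
        else:
            break
    return tolerated


# Hypothetical three-host cluster with four powered-on VMs.
HOSTS = [{"cpu_mhz": 8000, "mem_mb": 16384} for _ in range(3)]
VMS = [
    {"cpu_res_mhz": 0, "mem_res_mb": 0, "mem_overhead_mb": 100},
    {"cpu_res_mhz": 500, "mem_res_mb": 1024, "mem_overhead_mb": 120},
    {"cpu_res_mhz": 0, "mem_res_mb": 0, "mem_overhead_mb": 100},
    {"cpu_res_mhz": 0, "mem_res_mb": 512, "mem_overhead_mb": 110},
]

if __name__ == "__main__":
    print("slot size (MHz, MB):", slot_size(VMS))
    print("slots per host:", slots_per_host(HOSTS[0], slot_size(VMS)))
    print("host failures tolerated:", host_failures_tolerated(HOSTS, VMS))
```

Notice how a single VM with a large reservation sets the slot size for the whole cluster; that is the relationship between the numbers that the diagram tries to make visible.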

That’s all folks! I hope you will find it useful!

vSphere 4.0 Fault Tolerance (Architecture Diagram, Video and Use Cases)

This is a response to the new vSphere Blogging contest that was announced in the middle of this month. I truly think that it’s a cool idea, and I believe that regardless of winning or losing, the excitement and fun a blogger would have during his/her participation is something awesome by itself.

The rules say that my post needs to be compact and straight to the point, so I won’t be able to cover all the aspects of something as huge as FT. If by any chance I fail to write such a short post, then here are some tips to avoid wasting your time:
1 – If you are one of those people who wear a tie at work, you should jump straight to the “Use Cases” section.
2 – If you are one of those people who use the words 10GbE, VMkernel and %RDY, then you probably don’t have time for this, but take my advice and have a look at the next two sections.

Fault Tolerance Architecture Diagram:

From the newly launched vSphere Blog on VMTN I quote this part: “[..] they do say a picture is worth a thousand words [..]”. Well, I believe I’ve also said that once before in a previous post, and in fact this is the whole concept my blog is built on. So here is a blueprint for FT with my own tweaks to save you (and me) a thousand words describing how FT works and how it’s architected (BTW, this is the first of the vSphere blueprints to come):

 

Fault Tolerance Video Demonstration:

Thank god the contest rules mentioned the possibility of republishing old content; otherwise I would have been re-recording and editing this video from scratch, which is kind of a nightmare. I published this video back in April this year when vSphere was just announced by VMware (the bits were not even available for download at that time), which means it’s one of the very first videos ever published about this cool new feature.

Before you hit the play button, let me tell you why this video is different from many of the other ones that I’ve seen later on:
1 – In my scenario, I have three ESX hosts in a cluster rather than two as you may see in most of the FT demonstrations. What is so special about that? Well, it clearly shows the true concept of “continuous availability”, where in case of a complete ESX host failure, FT will not just fail over to another host, but will also automatically assign a third host in the cluster to protect the VM against another host failure until the SysAdmin attends to the incident.

2 – In this video I’m using a continuous file copy to the protected VM throughout the host failure process. This is to show you a “real-life” scenario where your VM is busy doing something critical (a backup, for example). You really don’t play movies in your mission-critical VMs (I think Microsoft is the one who invented this idea in their Hyper-V live-migration demonstration, kind of weird!)

 

My Real-life FT use cases:

I’m now taking off my “VMware Evangelist” hat and putting on the “VMware Customer” hat. What you’ll read here are my real-life use cases for FT: no marketing talk, no political debates. This “is” the real deal:

1 – Blackberry Enterprise Server & RoveIT Mobile Admin:
BES is one of our most business-critical applications because it’s used by our higher management in their day-to-day communications. Initially we were depending on HA, since we didn’t think our luck would be bad enough to have an ESX host failure while one of the executives was sending an email.

This continued to be the case until we deployed the RoveIT Mobile Admin & vCenter Mobile Access (with BES/MDS in the backend). We basically wanted our SysAdmins to have 24/7 access to our entire IT environment (including VMware) while they are on the go, using their Blackberry smartphones (given by the corp for this specific purpose). This was mainly to improve our response time in emergency situations, and of course this service makes no sense unless it can tolerate the most severe hardware failures. Enabling FT on both the BES and the Mobile Admin VMs allows us, on the one hand, to ensure that our executives will never complain that they can’t use their Blackberry whenever they need it, and that “IT sucks”. On the other hand, we, the IT suckers..er..I mean SysAdmins & consultants, can have peace of mind that we will always be able to get to our backend systems whenever there is a problem that requires immediate attention.

2 – ManageEngine Application Manager:
We heavily depend on the ManageEngine Application Manager in our environment, where we get real-time email and SMS notifications for any issues happening either in the OS layer (e.g. disk usage, service status, etc.) or in the applications (e.g. Exchange high local queue, MS SQL DB issues, etc.). In order to maintain this level of real-time notification, we had to make this application highly available. Although the application comes with optional cluster capabilities, VMware HA was really doing the trick without costing extra money. In both cases (the cluster option or VMware HA), if an ESX host fails, we have to wait approximately 10 minutes for the application to be powered on and operational on another host in the cluster. This is not realistic for an application that is supposed to tell us that the ESX host has failed in the first place. With FT we are able to have the application up & running all the time with no interruption whatsoever, and consequently it keeps sending us notifications of any Host/OS/Application issues no matter what happens across the underlying infrastructure.

3 – Custom Application – Online payment gateway:
We have an online customer payment service consisting of a custom-written application integrated with IBM Websphere MQ and a backend Oracle DB. Everything is highly available, as you would expect, except for the custom application! I must also add that it is so poorly written that it needs human intervention every time the VM is rebooted in order to bring it up again. That being said, HA is not even an option in case of host failures. Unfortunately the application developer does not know how to address these issues in his application, and we are stuck with that fact since he’s working with the same backend payment gateway provider. We came up with two solutions for that:
a) The Long run: gradually migrate the online service to a new system with a new backend payment gateway. We are around 30% now on this new service.
b) The short run: put the custom application on FT enabled VM where we don’t have to suffer from any unplanned downtime associated with the VM and/or the host.

 

The conclusion:

FT is a “must have”, not a “nice to have”, feature in any environment. I don’t really understand the big debate around it from the so-called “experts” who have been flooding us on Twitter or the blogosphere with reasons why it’s “not enterprise ready yet”. Most of these debates really come from people who have not seen enough of the enterprise environments they are talking about and the challenges we face every day with scenarios like the ones I’ve listed above. Sure enough, FT has quite a long list of limitations that you can find in any of these blog posts (or on the VMware website itself), but you should also know that VMware is working on most of these limitations for future releases. The number one limitation that you will always hear about is the 1 vCPU restriction for FT-enabled VMs. Well, let me tell you two things about that to finish up my article:

1 – The vast majority of the applications running in any datacenter do not need, or even make use of SMP. My three use cases above are examples for that.
2 – VMware recently published this blog post showing how a 1-vCPU VM on the revolutionary Intel Nehalem processor can perform better than a 2-vCPU VM on older processor generations.

P.S. This is probably one of my largest blog posts. I’m disqualified from the contest.

Diagram: VMware vSphere 4.0 in The Enterprise

I’m a big believer in the saying “A picture is worth a thousand words”. If you don’t believe in that, then this blog will never be the right place for you. I think a fair number of my blog readers have actually visited me in my office, and they’ve seen how I have all sorts of diagrams covering the walls, starting from the infrastructure and solution architectures, all the way to detailed blueprints for the different technologies that I implement in my environment. Besides the extreme fun I have designing these diagrams, just looking at them on a daily basis helps me identify the areas of improvement and future developments quite easily. Why am I telling you this small story? Well, you are going to see more of this stuff on my blog than ever before, folks!

Introducing the “VMware vSphere In The Enterprise” diagram v1.0

Disclaimer: This is a very, very high level “visualization” of the “virtualization” architecture using VMware vSphere. Having said that, it should never be taken for granted or looked at as the perfect design for your vSphere environment. There is no such thing as a “perfect design” in the first place. There are always customer requirements, and best practices that we follow to achieve the “perfect solution” for the customer. I can’t stress this point enough, as I know there are many VMware-newbie visitors on my blog who might be caught in this trap.

A word of appreciation: I’d like to thank Duncan Epping for his great work choosing the Top 5 Planet V12n blog posts; even as a good follower of that RSS feed, I always miss quite a few great posts in there. In week 29, Duncan selected this post, which had an incredible list of network ports in the VMware environment, some of which I have used in my diagram above. It’s not the complete list, of course, since that is outside the scope of my diagram, although I do intend to do a complete “block diagram” in the future to visualize the entire list.

Printing Considerations:
In case you haven’t noticed, this is an A2 scale diagram. I initially tried to fit it into A3 while designing it, but I couldn’t. The amount of information and layouts was just too much to fit in the A3 scale. The diagram still prints well on A3, but you’ll have a hard time reading some parts like the port numbers. That said, I highly recommend that you print it on an A2 plotter, which will give you the real look and feel of the diagram. In my case, although our GIS department has many plotters for printing even larger scales like A1 and A0, I just went to the nearest Xerox center and printed it there to see what it would look like from a commercial printing center, and the printout was phenomenal.


My name is Hany Michael and I’m a Senior Consultant at VMware. I blog about various topics ranging from the core vSphere technologies all the way to the vCloud based products.
Disclaimer
Any views or opinions expressed on this blog are strictly my own and not the opinions and views of my employer.