VMware vSphere on IBM BladeCenter H – (Part 1 of 2)

Important: In case you haven’t done that already, please take a moment to read the first post of this series.

Due to the insane number of expansion modules and options available in the IBM BladeCenter H, I had to split this post into two parts. In fact, I was initially planning to have around 12 different designs for vSphere on BladeCenter H (yes, twelve), but then I started to shrink and skip some designs to fit as many scenarios as possible into a reasonable two-part article. With that said, the following is by no means a list of all the possible design scenarios you can achieve with this hardware platform. If you start the “mix and match” game, you may literally end up with countless possibilities!

The Diagram

Here are some important notes before using the diagram:

  • You will see different configurations in this post and the relevant architecture of each configuration in the diagram. This is done through PDF layers, which basically means that you should *not* activate more than one layer at the same time.
  • By default, “Configuration 1” is the first active layer when you open the PDF file. You can show/hide the other layers by simply clicking on them. Again, you should only show one layer/configuration at a time.
  • You will always see two boxes on the right side of the diagram: the upper one shows the current vSphere configuration, and the lower one shows the relevant hardware configuration. You should typically start by looking at those two boxes before scanning through the diagram to understand the “ingredients” of the design.
  • At the time of writing this post, the diagram contains only four configurations. When I publish the second part, I will add the additional configurations to the same diagram, so keep in mind that it will be updated later on.

The common design and configurations

Most of the configurations share a common design, unless I explicitly state otherwise. I will list its elements here in detail:

The Clusters:

You will see two types of clusters:

  • Management Cluster: this is typically a two-node cluster running the management and infrastructure services. For example, if you want to virtualize the vCenter Server, its VM should run on this cluster rather than on the actual production clusters. The same holds true for other vCenter products like AppSpeed, CapacityIQ, SRM and so forth. There are two reasons for doing that: first, we don’t want to run into the problem where vCenter Server is not accessible (there are some examples published in the community, but my favorites are Jason Boche’s Catch-22s!); second, we don’t want our management virtual appliances to affect our workloads’ performance, or vice versa.
  • Production Clusters: you can see two production clusters here (Cluster A and Cluster B). The takeaways are the following:
    • You don’t have to stick with that number of hosts per cluster; it depends on what you want to achieve, and also on the configuration maximums that may or may not limit you.
    • The nodes have to be spanned across the two chassis as numbered and illustrated in Config 1. There are two reasons for that: first, you don’t want your whole cluster to fail in the unlucky event that an entire chassis fails. Second, keep in mind that VMware HA selects the first five hosts in the cluster and promotes them to “primary” nodes; if all of them fail, your HA cluster fails. (See the short placement sketch right after this list.)
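If you script your builds, that placement rule is easy to encode. The following is only a toy Python sketch (the host and chassis names are made up) that deals a cluster’s nodes across the two chassis in round-robin fashion, so that no single enclosure holds the whole cluster, or all of the first five HA primary candidates:

```python
from itertools import cycle

# Hypothetical enclosures; in reality these come from your rack/bay plan.
chassis = {"chassis-1": [], "chassis-2": []}

def place_cluster(hosts, chassis):
    """Deal hosts across the chassis round-robin so a single enclosure
    failure never takes out a whole cluster (or all HA primary candidates)."""
    for host, name in zip(hosts, cycle(chassis)):
        chassis[name].append(host)
    return chassis

place_cluster([f"esx-prod-a{i}" for i in range(1, 7)], chassis)
for name, blades in chassis.items():
    print(name, blades)   # chassis-1 gets a1/a3/a5, chassis-2 gets a2/a4/a6
```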

The Blades:

You will see two blade models consistently throughout the first four configurations: the HS22 and the HS22V. Both blade servers share the same IO expansion capabilities; however, there are some differences between them. For example, the HS22V has no hot-swappable HDDs, but it is superior in memory capacity (144GB compared to 96GB in the HS22). In part two of this article, I’ll talk in detail about the new HX5 and what it can bring to the table in terms of scalability.

The Expansion cards:

Every HSxx blade comes with two onboard 1Gbps Ethernet ports for basic networking. They will always show up in vSphere as vmnic0 and vmnic1, and these ports are in turn mapped to Bay 1 and Bay 2 in the chassis. Of course, no one recommends implementing vSphere with only 2 x 1GbE ports in an enterprise environment (although it will technically work), so here we will use what we call expansion cards. There are two expansion card slots in any HS22/V blade: the first is called CIOv (for vertical expansion modules) and the second CFFh (for horizontal high-speed IO modules). The CIOv is usually used for FC HBAs (although we will see later how to utilize it for iSCSI connectivity), and its ports are mapped to Bay 3 and Bay 4 in the chassis. The CFFh, on the other hand, is mapped to the four high-speed expansion bays (7, 8, 9 and 10). I say high-speed because this is the only card slot that can leverage 10GbE connectivity (or InfiniBand, but that’s not relevant to our series). Depending on the configuration, you will see how we use different cards to support our designs; the onboard 2 x 1GbE ports, however, will always be there.
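To keep the port-to-bay relationships straight while reading the configurations below, here is a small Python lookup table. It is purely an illustrative model of the mapping just described, not something pulled from an IBM or VMware API:

```python
# Mapping of HS22/HS22V I/O options to BladeCenter H switch bays,
# as described above (illustrative lookup only).
BLADE_IO_MAP = {
    "onboard": {"description": "2 x 1GbE (vmnic0, vmnic1)",          "bays": [1, 2]},
    "CIOv":    {"description": "2 ports, e.g. FC HBA or 2 x 1GbE",   "bays": [3, 4]},
    "CFFh":    {"description": "up to 4 high-speed ports (10GbE capable)",
                "bays": [7, 8, 9, 10]},
}

def bays_for(slot):
    """Return the chassis bays a given blade I/O option connects to."""
    return BLADE_IO_MAP[slot]["bays"]

print(bays_for("CFFh"))  # -> [7, 8, 9, 10]
```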

Now that we’ve talked about the common stuff, let’s start talking about the unique configurations. Oh yes, we were just warming up!

CONFIGURATION (1):

We have in this configuration 6 x 1GbE pNICs per blade to support our MGMT, VMkernel and Virtual Machine networks. We teamed three pNICs here in a vNetwork Standard Switch (vSS) to serve the SC, vMotion and FT networks. The other three pNICs are teamed in a vNetwork Distributed Switch (vDS) to serve the VM networks. Let’s dig a little deeper into how this is done.

As mentioned earlier, we have three types of IO ports on the blades: the onboard ports, the CIOv, and the CFFh. To achieve maximum availability, we teamed one onboard port with a couple of ports from the CFFh card. This way, if any IO port fails (onboard or expansion card), we will be able to tolerate that failure.

The second consideration here is to distribute the load and bandwidth across our networks. For example, the SC network will be active on vmnic0 and standby on vmnic1, while vMotion will be active on vmnic1 and standby on vmnic0, and so forth.
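If you automate your host builds, this active/standby split can be expressed per port group through the vSphere API. Below is a minimal pyVmomi sketch; the vCenter address, credentials and host name are placeholders, and it assumes vSwitch0 already has vmnic0 and vmnic1 as uplinks:

```python
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

# Placeholder connection details; point these at your own vCenter/host.
si = SmartConnect(host="vcenter.example.local", user="administrator",
                  pwd="secret", sslContext=ssl._create_unverified_context())
esx = si.content.searchIndex.FindByDnsName(dnsName="esx-prod-a1.example.local",
                                           vmSearch=False)

def add_portgroup(esx_host, pg_name, vswitch, active, standby, vlan=0):
    """Create a standard-switch port group with an explicit
    active/standby NIC order."""
    policy = vim.host.NetworkPolicy(
        nicTeaming=vim.host.NetworkPolicy.NicTeamingPolicy(
            nicOrder=vim.host.NetworkPolicy.NicOrderPolicy(
                activeNic=active, standbyNic=standby)))
    spec = vim.host.PortGroup.Specification(
        name=pg_name, vlanId=vlan, vswitchName=vswitch, policy=policy)
    esx_host.configManager.networkSystem.AddPortGroup(portgrp=spec)

# SC prefers vmnic0 (standby vmnic1); vMotion prefers vmnic1 (standby vmnic0)
add_portgroup(esx, "Service Console", "vSwitch0", ["vmnic0"], ["vmnic1"])
add_portgroup(esx, "vMotion",         "vSwitch0", ["vmnic1"], ["vmnic0"])
```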

You may also have noticed that we grouped the SC + VMkernel networks on a vSS, while we put the VM networks on a vDS. The reason behind that is to ensure that you can still control your SC network even if your vCenter Server fails, while the VM networks still leverage the great enhancements and features of the vDS. This is *not* a VMware best practice, and as far as I know there is no documentation recommending it. It is up to you whether to go with that setup or simply have everything on a single vDS.

CONFIGURATION (2):

This is a nearly identical configuration except for the IP SAN. In Config 1 we were running on a Fibre Channel SAN, while in this configuration we have iSCSI. The thing to note here is that you will need to install Ethernet expansion modules in Bays 3 & 4, and we also swap the CIOv card from a FC HBA to a traditional 2 x 1GbE card. Of course, in this case you will use the vSphere software iSCSI initiator for your storage networking. This is fine in most cases, except the one where you actually need to boot your ESX server from SAN.
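For a rough idea of what the software-initiator side looks like when scripted, here is a pyVmomi sketch that enables the software iSCSI adapter and adds a send target. The target IP is a placeholder, `esx` is assumed to be an already-connected vim.HostSystem (as in the earlier sketch), and the vmkernel port binding itself remains a CLI task in this vSphere generation, so it is not shown:

```python
from pyVmomi import vim

def enable_sw_iscsi(esx_host, target_ip, target_port=3260):
    """Enable the vSphere software iSCSI initiator and point it at the array.
    The target address is a placeholder; substitute your own."""
    ss = esx_host.configManager.storageSystem
    ss.UpdateSoftwareInternetScsiEnabled(True)
    for hba in ss.storageDeviceInfo.hostBusAdapter:
        if isinstance(hba, vim.host.InternetScsiHba) and hba.isSoftwareBased:
            target = vim.host.InternetScsiHba.SendTarget(
                address=target_ip, port=target_port)
            ss.AddInternetScsiSendTargets(iScsiHbaDevice=hba.device,
                                          targets=[target])
    ss.RescanAllHba()

# Example: enable_sw_iscsi(esx, "192.168.50.10")   # 'esx' as in the earlier sketch
```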

Please also note that you can use NFS with the same layout. Your 2 x 1GbE onboard blade ports plus the two expansion modules (Bays 3 & 4) will serve your NFS requirements in a highly available design.
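Mounting the NFS export over those same links is just as easy to script. A minimal pyVmomi sketch, with the filer address, export path and datastore name as placeholders, would look like this:

```python
from pyVmomi import vim

def mount_nfs(esx_host, filer, export, ds_name):
    """Mount an NFS export as a datastore on an already-connected
    vim.HostSystem. Filer and export names are placeholders."""
    spec = vim.host.NasVolume.Specification(
        remoteHost=filer,        # e.g. the NFS filer's IP or hostname
        remotePath=export,       # e.g. "/vol/vmware_prod"
        localPath=ds_name,       # datastore name as seen in vSphere
        accessMode="readWrite")
    return esx_host.configManager.datastoreSystem.CreateNasDatastore(spec)

# Example: mount_nfs(esx, "10.0.0.50", "/vol/vmware_prod", "nfs_prod_01")
```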

CONFIGURATION (3):

What you will see in this configuration is something a bit different. Here we are using 2 x 10GbE ports through the CFFh expansion card to serve “all” our networks. This card is mapped to two 10GbE expansion modules sitting in Bay 7 and Bay 9.

The trick here is this: how can you have proper network segmentation if you are using only two pNICs? The answer, of course, is VLANs. As you see in the diagram, we have two production networks and one lab network. Each of these networks is tagged with a VLAN ID so that traffic flows through the vmnics and pNICs all the way to your enterprise/core switches. The ports on your core switches, of course, need to be in trunk mode.

Now, the second question would be this: how can you ensure that no network saturates the whole link and affects the performance of the others? The solution is vSphere traffic shaping. You can simply dedicate bandwidth to each port group per your requirements. For example, the SC normally doesn’t need more than 1Gbps, while vMotion and FT definitely require more bandwidth. To keep things simple, I illustrated in the diagram how the segmentation and bandwidth allocation can be distributed across the two links in an Active/Standby approach.
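To make the VLAN tagging and bandwidth allocation concrete, here is a pyVmomi sketch that creates a VLAN-tagged port group with an outbound traffic-shaping policy on a standard switch. The VLAN ID and bandwidth figures are examples only, `esx` is again assumed to be a connected vim.HostSystem as in the earlier sketch, and on a vDS the equivalent settings live in the distributed port group configuration instead:

```python
from pyVmomi import vim

def add_shaped_portgroup(esx_host, name, vswitch, vlan_id, avg_mbps, peak_mbps):
    """Create a VLAN-tagged port group whose outbound traffic is capped
    by a shaping policy. Bandwidth values are illustrative only."""
    shaping = vim.host.NetworkPolicy.TrafficShapingPolicy(
        enabled=True,
        averageBandwidth=avg_mbps * 1000 * 1000,   # bits per second
        peakBandwidth=peak_mbps * 1000 * 1000,     # bits per second
        burstSize=100 * 1024 * 1024)               # bytes
    policy = vim.host.NetworkPolicy(shapingPolicy=shaping)
    spec = vim.host.PortGroup.Specification(
        name=name, vlanId=vlan_id, vswitchName=vswitch, policy=policy)
    esx_host.configManager.networkSystem.AddPortGroup(portgrp=spec)

# Example: cap the Service Console port group (VLAN 10) at roughly 1Gbps
# add_shaped_portgroup(esx, "Service Console", "vSwitch1", 10, 1000, 1000)
```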

You will also notice here that we are utilizing the two onboard Ethernet ports to add an iSCSI SAN (for the lab environment, for example) alongside the FC SAN for your production workloads.

CONFIGURATION (4):

In the previous configuration we saw how we leveraged VLANs to do our network segmentation, and how easy and flexible that was. But what if the customer has a policy against using VLANs to consolidate networks (for security reasons, for example)? Easy; we can still comply with that. Basically, we need to swap the 2 x 10GbE CFFh card for a 4 x 10GbE card and, of course, add two additional 10GbE expansion modules in Bay 8 and Bay 10.

Now, what did we achieve by doing that? Two things:

1 – We are compliant with the customer requirement to have physical segmentation between the Management/FT/vMotion networks and the production networks.
2 – We are using the vSS for our management networks while leveraging the vDS for our Virtual Machine networks.

You also have another two options here that are not included in the diagram: you can use the two onboard ports for an additional iSCSI SAN as we did in the previous configuration, or you can use them as standby ports for your Management/VM networks in case of a CFFh card failure. Do you see now what I meant above by the “mix and match” game?

Coming Soon – Part 2:

I’ll talk about the new HX5 and how you can get a lot more memory or extended IO to support special workloads or strict design requirements. I will talk about FCoE and CNAs. I will also talk about the new and promising Virtual Fabric from IBM, and how you can basically slice your pNICs into almost any protocol or speed you want.

Stay tuned!
