Running scripts on the ESXi shell with correct time-stamps

While working on the ESXi shell you might notice that the “date” command returns the UTC time instead of displaying the correct time zone like ESX Classic does.

This is by design within ESXi and stated in this VMware KB article: ESXi uses UTC time and does not support changing time zones

Although changing the time zone isn’t supported, it can come in handy to get correct time-stamps for custom scripts within the ESXi Shell.
To get the correct time-stamps you only need to add the following line at the beginning of your script:

export TZ=MET<-/+><no of hours>

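In my case local time is one hour ahead of UTC, which means that I need: export TZ=MET-1. Note that the sign is inverted compared to the usual "UTC+1" notation: a negative offset means the time zone is ahead of (east of) UTC.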

Please note that this command only changes the time zone for the current Shell session as long as it’s active, so it doesn’t change the system time zone.
Also don’t forget to check the output with daylight saving time if applicable to your current time zone.
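As a minimal sketch of how this could look in a custom script, the snippet below sets TZ once and then stamps every log line with local time. The `log` helper and the "backup started" message are hypothetical names used only for illustration, and the offset assumes a location one hour ahead of UTC.

```shell
#!/bin/sh
# Assumption: local time is UTC+1. In the POSIX TZ format the sign is
# inverted, so "MET-1" means one hour ahead of UTC.
export TZ=MET-1

# Hypothetical helper: prefix a message with the local time-stamp.
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S %Z') $*"
}

log "backup started"
```

The POSIX TZ format also allows a daylight saving rule, for example TZ='MET-1MEST,M3.5.0,M10.5.0/3' for the European rules. Whether the ESXi shell honours that form is something to verify on your own host before relying on it.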


Dutch vBeers – 14 July 2011

Another vBeers gathering will be held on Thursday 14th of July starting from 6:00pm in ‘Cafe de Omval’, which is located near the Amsterdam Amstel station. This venue was selected since it’s easy to reach for people not coming from Amsterdam and serves a fine selection of beers along with soft drinks and bar food.

Note that Frank Denneman, one of the two great authors of the VMware vSphere Clustering Technical Deepdive (available now!), will also be attending. This is a great moment to meet up with him!

Drinks will not be paid for and there will be no tab. When you buy a drink, please pay for it yourself, as no one else will be paying for your drinks.

  • Location: ‘Cafe de Omval’ Amsterdam
  • Address: Weesperzijde 250 (hoek Omval), 1097 EB Amsterdam
  • Nearest Train Station: Amsterdam Amstel
  • Time: 6:00pm
  • Location: Map

VM Swapfile (.vswp) placement with SRM

VMware Site Recovery Manager (SRM) is being implemented more and more and, like vSphere, it needs a good architectural design before you start.

One of the design considerations is around the placement of the Virtual Machine Swap File (.vswp) which I want to give some more information about in this article.

Let’s first take a look at the VM Swap File (.vswp). By default this file is placed in the VM “working directory”, which also contains all the other VM files. The .vswp is created every time the VM is started and its size equals the unreserved memory configured on the VM. If the VM is configured with 2 GB of memory and the memory reservation is set to 0 MB (default), the VM Swap File will be 2 GB. If the memory reservation in this example were 1 GB, then the .vswp file would be 2 GB - 1 GB = 1 GB in total.
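The sizing rule above can be sketched as a small shell function. The function name `vswp_size_mb` is hypothetical and the values are the two examples from the paragraph, expressed in MB.

```shell
#!/bin/sh
# Hypothetical helper: .vswp size = configured memory - memory reservation.
# Usage: vswp_size_mb CONFIGURED_MB RESERVED_MB
vswp_size_mb() {
    echo $(( $1 - $2 ))
}

vswp_size_mb 2048 0      # 2 GB configured, no reservation
vswp_size_mb 2048 1024   # 2 GB configured, 1 GB reserved
```

Summing this value over all VMs on a replicated datastore gives a rough idea of how much swap space would be replicated if the default placement is kept.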

The design considerations are about:

  1. Keeping the .vswp file in its default “working directory”;
  2. Placing the .vswp on a separate non-replicated datastore.

Keeping the .vswp file in its default “working directory” means that the .vswp file will be replicated to the recovery site as indicated in the next overview:

Pros:

  • Ease of manageability: all VM files are together and it’s the default.

Cons:

  • More replication bandwidth is needed for files (.vswp) that aren’t used at the recovery site;
  • Cost is higher since more replicated storage space is used;
  • Increases the recovery time of both the test and real failover. This is because SRM explicitly deletes the useless .vswp files on the recovery site before starting the VMs.

Placing the .vswp on a separate non-replicated datastore involves some manual work, possibly even reconfiguration of all the currently available Virtual Machines. The following overview shows the configuration:

Pros:

  • Does not consume replication bandwidth for data that is never used;
  • Uses less replicated storage, which could be more expensive than non-replicated storage.

Cons:

  • Can be more difficult to manage because some parts of a virtual machine reside on a separate datastore;
  • Requires additional configuration and management processes within SRM, since the VM would be detected with a non-replicated datastore, which consequently causes SRM to remove the VM from its protection group.

An important note needs to be made about NFS storage. As indicated, one of the drawbacks of keeping the .vswp file in the “working directory” is that this increases the recovery time, since SRM deletes the replicated, useless .vswp file before starting the VM.

Deleting the .vswp file from a newly recovered NFS datastore can take some time since ESX needs to wait for the replicated file lock to expire (default 35 seconds). A quote from the Best Practices on NAS Whitepaper:

Once a lock file is created, VMware periodically (every NFS.DiskFileLockUpdateFreq seconds) send updates to the lock file to let other ESX hosts know that the lock is still active. Changing any of the NFS locking parameters will change how long it takes to recover stale locks. The following formula can be used to calculate how long it takes to recover a stale NFS lock:

(NFS.DiskFileLockUpdateFreq * NFS.LockRenewMaxFailureNumber) + NFS.LockUpdateTimeout

If any of these parameters are modified, it’s very important that all ESX hosts in the cluster use identical settings. Having inconsistent NFS lock settings across ESX hosts can result in data corruption!
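To make the formula concrete, here is a minimal sketch that evaluates it in the shell. The function name is hypothetical, and the values 10, 3 and 5 are assumed defaults for the three parameters; they are chosen because they reproduce the 35-second default mentioned above, so check the actual settings on your own hosts.

```shell
#!/bin/sh
# Hypothetical helper implementing the quoted formula:
# (NFS.DiskFileLockUpdateFreq * NFS.LockRenewMaxFailureNumber) + NFS.LockUpdateTimeout
# Usage: nfs_lock_recovery_seconds UPDATE_FREQ MAX_FAILURES UPDATE_TIMEOUT
nfs_lock_recovery_seconds() {
    echo $(( $1 * $2 + $3 ))
}

# Assumed default values: 10, 3 and 5 seconds.
nfs_lock_recovery_seconds 10 3 5
```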

This timeout isn’t applicable to VMFS datastores because the auto-resignaturing process drops the file locks automatically.

As with every design decision, it’s all about knowing the pros and cons of the available options and selecting the best option for your environment.

Self-employed, delivering Virtualization Consultancy

After working for VMware for over a year I decided to take the next step in my career: to start Van Ditmarsch Consultancy.

As a contractor I will be available to do Virtualization Consultancy, primarily focused on designing, implementing and/or validating VMware Virtualization solutions with a strong focus on business continuity, high availability and disaster recovery.

I want to thank VMware for their understanding and I want to thank all the great people I met while working for VMware since they definitely have the best people around!

Thanks guys, all the best and we will certainly stay in touch!

 

How to get Device Identifiers for I/O Devices

Recently I got a few questions about the VMware HCL and how to verify I/O devices without knowing all the key details of the device. Let’s start off by looking at the VMware HCL; note that there are additional fields on which a search can be done for I/O Devices.

The Device Identifiers are separated into the following items:

  • VID: Vendor ID
  • DID: Device ID
  • SVID: Subsystem Vendor ID
  • SSID: Subsystem ID

Getting the Device Identifiers on your ESX(i) Host is done by looking at the currently running PCI configuration with one of the following commands:

For an ESX Host: cat /proc/vmware/pci

For an ESXi Host: lspci -p

The output will display all the Device Identifiers needed to verify the I/O device against the VMware HCL (in this example the details of an HBA).

More information on Device Identifiers in combination with PowerShell can be found in this excellent article by Luc Dekens.

VMFS3 Heap Size (MaxHeapSizeMB)

This article gives you more information on the VMFS3.MaxHeapSizeMB advanced parameter. Let’s start off with VMware KB article 1004424 describing the method to increase the VMFS3 Heap Size in case you see the following error message in /var/log/vmkernel or /var/log/messages log:

vmkernel: 8:18:59:58.640 cpu2:1410)WARNING: Heap: 1370: Heap_Align(vmfs3, 4096/4096 bytes, 4 align) failed. caller: 0x8fdbd0
vmkernel: 8:18:59:58.640 cpu2:1410)WARNING: Heap: 1266: Heap vmfs3: Maximum allowed growth (24) too small for size (8192)

Running out of VMFS3 Heap Space can occur when a large quantity of virtual disk space (.vmdk files) is open on a single ESX Host.

Read the full post »

vCenter Performance Tab – Real-time vs. Historical Data

Recently I received a question about the Performance tab within vCenter which interested me. Looking at the available Chart Options we can see that there are two distinct performance streams:

  • Real-time data
  • Historical data

Read the full post »

Design VMware FT Network, active/active or active/passive

I recently had a design discussion about the best way to configure the VMware Fault Tolerance Logging network. During the discussion we quickly established that you want to ensure redundancy for the VMware FT Logging network. The most interesting part of the discussion was how to configure the vSwitch dedicated to FT traffic.

Read the full post »

VMware vCenter Port 80 invalid or already in use

While installing VMware vCenter 4.x with Microsoft SQL 2008 (R2) installed on the same system you might get the following error message: The following port numbers are either invalid or already in use. VMware VirtualCenter HTTP Port: 80

By default vCenter wants to use port 80 for the HTTP service, and with the netstat command you can verify which process is listening on port 80. In my case Windows wasn’t able to show the actual process name, only Process ID 4, which basically tells you that the “system” is using the port.

The netstat -abo output displays: “Cannot obtain ownership information” and only shows Process ID 4

Read the full post »

BIOS settings in a VMware ESX(i)/vSphere Environment

Recently I’ve noticed that a lot of customers aren’t aware of BIOS settings that can be changed as a best practice for VMware ESX(i)/vSphere environments. In this article I want to outline some of these best practices, specifically based on HP hardware, but similar options are available from other vendors as well.

ASR: The Automatic Server Recovery (ASR) feature is a hardware-based timer. If a true hardware failure occurs, the Health Monitor might not be called, but the server will be reset as if the power switch is pressed. The ProLiant ROM code may log an event to the Integrated Management Log (IML) when the server reboots.

Personally I prefer to disable ASR because of the following possible scenario: an ESX Host crashes (Purple Screen of Death – PSOD), VMware HA starts the VMs on the other hosts, ASR restarts the server (default timeout: 10 minutes), the ESX Host gets reconnected to the cluster, VMware HA is automatically reconfigured and, based on the load, DRS could move VMs back to the host, waiting for another crash to occur…

Read the full post »