Recently I have been having trouble with VMware Host Profiles, Auto Deploy and the stateful install of ESXi 5.1 (Update 1). After the Host Profile is applied, the operation eventually times out with the message “The request failed because the remote server took too long to respond”, as shown in the screenshot below.
I will walk through the deployment process step by step to give some more insight into it:
1. ESXi Host gets added to vCenter Server as configured by the Auto Deploy DeployRule
2. The specified Host Profile gets attached to the ESXi Host (also as defined in the DeployRule)
3. The Host Profile is configured with the following settings to ensure a stateful install:
- System image cache profile settings = “enable stateful installs on the host”
- Arguments for first disk = “local” (or “megaraid_sas”; both values point to the local disk in our hardware setup)
4. After pressing “Apply Host Profile”, the system starts applying the Host Profile, which eventually times out as shown in the screenshot above.
So what’s going on?
Troubleshooting led to /var/log/syslog.log, which shows these entries while the Host Profile is being applied:
2013-09-11T06:23:10Z HostProfileManager: [2013-09-11 06:23:10,619 root INFO] Discovered lun -- naa.600507680181058710000000000000cf (console /vmfs/devices/disks/naa.600507680181058710000000000000cf) -- IBM 2145 (2094080 MiB, qla2xxx)
2013-09-11T06:23:10Z HostProfileManager: [2013-09-11 06:23:10,670 root INFO] Discovered lun -- naa.600605b001ed1280194c9cc156f4c3be (console /vmfs/devices/disks/naa.600605b001ed1280194c9cc156f4c3be) -- IBM ServeRAID M5015 (68664 MiB, megaraid_sas)
2013-09-11T06:23:10Z HostProfileManager: [2013-09-11 06:23:10,671 root INFO] Scanning naa.600605b001ed1280194c9cc156f4c3be for any installs …
2013-09-11T06:23:13Z HostProfileManager: [2013-09-11 06:23:13,419 root INFO]
2013-09-11T06:23:19Z HostProfileManager: [2013-09-11 06:23:19,421 root INFO] Found nothing on naa.600605b001ed1280194c9cc156f4c3be.
2013-09-11T06:23:19Z HostProfileManager: [2013-09-11 06:23:19,422 root INFO] Scanning naa.600507680181058710000000000000c8 for any installs …
2013-09-11T06:23:20Z HostProfileManager: [2013-09-11 06:23:20,230 root INFO]
2013-09-11T06:23:26Z HostProfileManager: [2013-09-11 06:23:26,233 root INFO] Found nothing on naa.600507680181058710000000000000c8.
As shown, the HostProfileManager detects the local disk and then starts scanning every attached LUN “for any installs …”. In the excerpt above each scan takes roughly seven to nine seconds. Since our ESXi hosts have many LUNs connected, the scanning takes so long that the Apply Host Profile operation eventually times out.
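To put numbers on the scan time, the log can be parsed for matching “Scanning …” and “Found nothing on …” entries. The following is only a rough sketch in Python: it assumes the log lines look exactly like the excerpt above, the function name scan_durations is my own, and the sample lines are copied from that excerpt. It is not a VMware tool.

```python
import re
from datetime import datetime

# Bracketed HostProfileManager timestamp, then the message after "INFO]".
# Assumption: every relevant line follows the format seen in the excerpt.
LINE_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) root INFO\]\s*(.*)"
)
SCAN_RE = re.compile(r"Scanning (\S+) for any installs")
DONE_RE = re.compile(r"Found nothing on (\S+?)\.?$")


def scan_durations(lines):
    """Return {device: seconds} for each Scanning / Found nothing pair."""
    started = {}
    durations = {}
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        stamp = stamp.replace(microsecond=int(m.group(2)) * 1000)
        message = m.group(3).strip()
        scan = SCAN_RE.search(message)
        done = DONE_RE.search(message)
        if scan:
            # Remember when the scan of this device began.
            started[scan.group(1)] = stamp
        elif done and done.group(1) in started:
            delta = stamp - started.pop(done.group(1))
            durations[done.group(1)] = delta.total_seconds()
    return durations


# Sample lines taken from the syslog excerpt above.
SAMPLE = [
    "2013-09-11T06:23:10Z HostProfileManager: [2013-09-11 06:23:10,671 root INFO] Scanning naa.600605b001ed1280194c9cc156f4c3be for any installs ...",
    "2013-09-11T06:23:19Z HostProfileManager: [2013-09-11 06:23:19,421 root INFO] Found nothing on naa.600605b001ed1280194c9cc156f4c3be.",
    "2013-09-11T06:23:19Z HostProfileManager: [2013-09-11 06:23:19,422 root INFO] Scanning naa.600507680181058710000000000000c8 for any installs ...",
    "2013-09-11T06:23:26Z HostProfileManager: [2013-09-11 06:23:26,233 root INFO] Found nothing on naa.600507680181058710000000000000c8.",
]

for device, seconds in scan_durations(SAMPLE).items():
    print(f"{device}: {seconds:.3f} s")
```

For the two devices in the excerpt this reports 8.750 s and 6.811 s. To run it against a real log, copy /var/log/syslog.log off the host and pass the opened file to scan_durations instead of SAMPLE; multiplying the typical per-LUN time by the number of attached LUNs gives a rough idea of how long the whole scan would take.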
VMware Support has acknowledged this as a bug and is currently working on it. To be continued…