Auto Deploy: ESXi Stateful Install with multiple LUNs connected fails

Recently I have been running into trouble with VMware Host Profiles, Auto Deploy and the stateful install of ESXi 5.1 (Update 1). After the Host Profile is applied, the operation eventually times out with the message “The request failed because the remote server took too long to respond”, as shown in the screenshot below.

[Screenshot: Host Profile Timeout]

Here is the deployment process, step by step, to give some more insight into what happens:

1. The ESXi host is added to vCenter Server, as configured by the Auto Deploy DeployRule
2. The specified Host Profile is attached to the ESXi host (also as defined in the DeployRule)
3. The Host Profile is configured with the following settings to ensure a stateful install:

  • System image cache profile settings = “enable stateful installs on the host”
  • Arguments for first disk = “local” (or “megaraid_sas”; both values point to the local disk in our hardware setup)

4. After pressing “Apply Host Profile”, the system starts applying the Host Profile, which eventually times out as shown in the screenshot above.

So what’s going on?

Troubleshooting led to /var/log/syslog.log, which shows these entries while the Host Profile is being applied:

2013-09-11T06:23:10Z HostProfileManager: [2013-09-11 06:23:10,619 root     INFO] Discovered lun — naa.600507680181058710000000000000cf (console /vmfs/devices/disks/naa.600507680181058710000000000000cf) — IBM      2145             (2094080 MiB, qla2xxx)
2013-09-11T06:23:10Z HostProfileManager: [2013-09-11 06:23:10,670 root     INFO] Discovered lun — naa.600605b001ed1280194c9cc156f4c3be (console /vmfs/devices/disks/naa.600605b001ed1280194c9cc156f4c3be) — IBM ServeRAID M5015 (68664 MiB, megaraid_sas)
2013-09-11T06:23:10Z HostProfileManager: [2013-09-11 06:23:10,671 root     INFO] Scanning naa.600605b001ed1280194c9cc156f4c3be for any installs …
2013-09-11T06:23:13Z HostProfileManager: [2013-09-11 06:23:13,419 root     INFO]
2013-09-11T06:23:19Z HostProfileManager: [2013-09-11 06:23:19,421 root     INFO]   Found nothing on naa.600605b001ed1280194c9cc156f4c3be.
2013-09-11T06:23:19Z HostProfileManager: [2013-09-11 06:23:19,422 root     INFO] Scanning naa.600507680181058710000000000000c8 for any installs …
2013-09-11T06:23:20Z HostProfileManager: [2013-09-11 06:23:20,230 root     INFO]
2013-09-11T06:23:26Z HostProfileManager: [2013-09-11 06:23:26,233 root     INFO]   Found nothing on naa.600507680181058710000000000000c8.

As shown, the HostProfileManager detects the local disk and then starts scanning every attached LUN “for any installs”. Since our ESXi hosts have many LUNs connected, this scan takes so long that the Apply Host Profile operation eventually times out.
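To get a feel for how much time each LUN costs, the scan entries can be timed from the log itself. The following is a minimal Python sketch, not a VMware tool: the names scan_durations, LINE_RE and SAMPLE are made up for illustration, and the regular expression is based only on the log excerpt above, so it may need adjusting for other log formats.

```python
import re
from datetime import datetime

# Pattern derived from the excerpt above (an assumption, not an official format):
# "[<timestamp> root INFO] Scanning <lun> for any installs" or
# "[<timestamp> root INFO]   Found nothing on <lun>."
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) root\s+INFO\]\s+"
    r"(?:Scanning (?P<start>\S+) for any installs"
    r"|Found nothing on (?P<end>\S+))"
)

# Four of the log lines quoted in this post, used as sample input.
SAMPLE = [
    "2013-09-11T06:23:10Z HostProfileManager: [2013-09-11 06:23:10,671 root     INFO] Scanning naa.600605b001ed1280194c9cc156f4c3be for any installs …",
    "2013-09-11T06:23:19Z HostProfileManager: [2013-09-11 06:23:19,421 root     INFO]   Found nothing on naa.600605b001ed1280194c9cc156f4c3be.",
    "2013-09-11T06:23:19Z HostProfileManager: [2013-09-11 06:23:19,422 root     INFO] Scanning naa.600507680181058710000000000000c8 for any installs …",
    "2013-09-11T06:23:26Z HostProfileManager: [2013-09-11 06:23:26,233 root     INFO]   Found nothing on naa.600507680181058710000000000000c8.",
]


def scan_durations(lines):
    """Return {lun: seconds} between a LUN's 'Scanning' and 'Found nothing' entries."""
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # e.g. the empty INFO lines or the 'Discovered lun' lines
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        if match.group("start"):
            started[match.group("start")] = ts
        else:
            lun = match.group("end").rstrip(".")  # drop the sentence-ending dot
            if lun in started:
                durations[lun] = (ts - started.pop(lun)).total_seconds()
    return durations


if __name__ == "__main__":
    # To analyse a real host, pass the lines of a copy of /var/log/syslog.log instead.
    for lun, seconds in scan_durations(SAMPLE).items():
        print(f"{lun}: {seconds:.3f} s")
```

For the two LUNs in the excerpt this works out to roughly 7 to 9 seconds each. Since the scan appears to run one LUN after another, the total grows with the number of attached LUNs.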

VMware Support has acknowledged this as a bug and is currently working on it. To be continued…


7 Comments

  1. Mike P

     /  May 1, 2014

    Did you ever get a fix to this?

  2. Kakoon

     /  May 15, 2014

    Hi, we have the same problem, have you resolved this? and how?

    Thank you for your reply!

  3. still same problem 🙁

  4. gans

     /  July 12, 2014

    Any luck..?Pls update if you have any fix on this issue…

  5. still open with vmware :/
