After rebooting several Virtual Connect modules to test failover behavior, I ended up in a situation where the Virtual Connect Manager (VCM) became completely unresponsive. HP’s vcutil eventually solved my problem, so I want to share some more information on this tool, which I previously knew only as a Virtual Connect firmware update tool.
In my case the following statements were true:
- When logging on through the web interface, the “Loading, please wait…” message never disappeared.
- When logging on with SSH, I could enter credentials, but the CLI never appeared and the session eventually timed out.
Please note that I only lost management since the Virtual Connect Manager wasn’t functioning. All the network traffic in the enclosure was working fine.
Since I couldn’t get in with SSH, the “reset vcm -failover” option didn’t help me either.
On HP’s instructions (credits to the HP Technical Consultant who helped me out on this one) I downloaded the Virtual Connect Support Utility, which HP describes as follows:
“This utility allows users to remotely upgrade the HP BladeSystem c-Class Virtual Connect module firmware.”
But in fact it’s much more than just an upgrade tool for the VC firmware. This tool currently can handle the following actions:
- Version List
- Report
- Update Firmware
- Discover
- Collect Diagnostic Information
- Create a Config Backup
- Create a Support Dump
- Display Package Info
- Reset VCM
- Health Check
To start off, I used the Health Check action by issuing the command: vcutil -a healthcheck -i <primary OA IP> -u <OA Admin Account> -p <OA Password>
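If you run several vcutil actions in a row, it can help to build the command line in one place. The following is a minimal Python sketch, not something HP ships. The function name build_vcutil_args, the IP address and the credentials are made up for illustration. It uses only the -a, -i, -u and -p options from the command above (written with plain hyphens and a lowercase -i, which is an assumption about how the command was originally typed) plus the -b <bay> option quoted in the comments. It only builds the argument list; nothing is executed.

```python
from typing import List, Optional


def build_vcutil_args(action: str, oa_ip: str, user: str, password: str,
                      bay: Optional[int] = None) -> List[str]:
    """Build the argument list for one vcutil call (hypothetical helper)."""
    # Options taken from the post: -a <action> -i <OA IP> -u <user> -p <password>
    args = ["vcutil", "-a", action, "-i", oa_ip, "-u", user, "-p", password]
    if bay is not None:
        # -b <bay> is the per-bay form mentioned in the comments
        args += ["-b", str(bay)]
    return args


# Placeholder values only; replace them with your own OA address and account.
print(" ".join(build_vcutil_args("healthcheck", "192.0.2.10", "Administrator", "secret")))
```

On a machine where vcutil is installed, the returned list could be passed to subprocess.run(). Keeping the arguments as a list avoids quoting problems with passwords that contain spaces.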
In my case the output stated:
WARNING: No Primary VC Enet module found.
Module Configuration may be inaccurate.
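------------------------------------------------------------------------------
Bay 1: HP VC Flex-10 Enet Module
------------------------------------------------------------------------------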
Power: On
Health: Ok
IP Address: x.x.x.x
IP Connectivity: Passed
Mode: Primary
Domain Configuration: In Sync
Module Configuration: Not In Sync
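When an enclosure has several modules, the "Not In Sync" line is easy to miss. The sketch below parses health check output of the shape shown above and lists the bays whose domain or module configuration is not reported as "In Sync". It assumes the output always looks like this sample ("Bay N: <model>" followed by "Key: Value" lines). The function names parse_healthcheck and out_of_sync are hypothetical, and the sample text is copied from this post, not from HP documentation.

```python
from typing import Dict, List

SAMPLE = """WARNING: No Primary VC Enet module found.
Module Configuration may be inaccurate.
------------------------------------------------------------------------------
Bay 1: HP VC Flex-10 Enet Module
------------------------------------------------------------------------------
Power: On
Health: Ok
IP Address: x.x.x.x
IP Connectivity: Passed
Mode: Primary
Domain Configuration: In Sync
Module Configuration: Not In Sync
"""


def parse_healthcheck(text: str) -> List[Dict[str, str]]:
    """Split health check output into one dict per 'Bay N: ...' block."""
    modules: List[Dict[str, str]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Bay ") and ":" in line:
            # A new module block starts, e.g. "Bay 1: HP VC Flex-10 Enet Module"
            bay, _, model = line.partition(":")
            current = {"Bay": bay[4:].strip(), "Model": model.strip()}
            modules.append(current)
        elif current is not None and ":" in line:
            # "Key: Value" lines belong to the most recent bay
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return modules


def out_of_sync(modules: List[Dict[str, str]]) -> List[str]:
    """Return bay numbers whose domain or module configuration is not 'In Sync'."""
    return [m["Bay"] for m in modules
            if m.get("Domain Configuration", "In Sync") != "In Sync"
            or m.get("Module Configuration", "In Sync") != "In Sync"]


print(out_of_sync(parse_healthcheck(SAMPLE)))
```

Lines that appear before the first "Bay" header, such as the warning at the top, are ignored by the parser, so they would need separate handling if you want to alert on them.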
My first instruction was to use the vcutil -a resetvcm command, which didn’t solve the issue.
Next I tried creating a support dump (vcutil -a supportdump), which gave me: “Saving Virtual Connect Support Information…Error Could not get Primary VCM IP Address from Bay 1”.
My last, and successful, instruction involved some physical handling of the enclosure:
- Remove the module from Bay 2
- Reset the VCM using vcutil -a resetvcm
- Plug the module that was in Bay 2 into any free interconnect bay except Bay 2; this resets the module configuration
- Wait until the LEDs are on; this can take a couple of minutes
- Connect to the VCM to check whether the connection is back
- Return the module to Bay 2 and check the connection to the VCM again
I walked through this last instruction without downtime for the production environment, since it was designed to be redundant and had been tested in advance 🙂
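The "check whether the connection is back" steps in the procedure above could be partly scripted. The sketch below simply retries a TCP connection until the port answers. It assumes the VCM web interface listens on the default HTTPS port 443 (SSH would be 22). The function names and the retry values are hypothetical. Note that an open port is only a first signal: in the situation described in this post the login page still loaded while the VCM itself hung, so a real login test is still needed afterwards.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 10.0) -> bool:
    """Retry port_open up to `attempts` times, sleeping `delay` seconds in between."""
    for attempt in range(attempts):
        if port_open(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, wait_for_port("192.0.2.20", 443) would poll a placeholder VCM address for up to roughly five minutes with the default values.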
Stefan Jagger
/ January 24, 2010
Some useful information in case I ever get caught in that situation. Cheers!
Gunnar Andersson
/ January 27, 2010
Very useful information. I had exactly the same issue, but in my case it worked with vcutil -a resetvcm.
Bill
/ February 9, 2010
When you ran vcutil -a resetvcm, did you experience any drop in network connectivity to the blades?
Kenneth van Ditmarsch
/ February 10, 2010
Hi Bill,
Actually, the resetvcm from vcutil didn’t work in my environment. What I do know is that resetting the VCM from the interconnect module (via SSH) doesn’t drop connections.
I can’t imagine that it would drop connections, since it’s only the “management process” that you are resetting, but unfortunately I’m not 100% sure.
Kenneth
Gunnar Andersson
/ April 7, 2010
I have done this several times with success. I run vcutil -a resetvcm -b 1 and, after completion, wait a few minutes and run vcutil -a resetvcm -b 2. After that completes and a few more minutes of waiting, the VCM works OK.
LG
/ April 19, 2010
Hi Kenneth,
I’m facing more or less the same problem: after a failover, as when doing a reset through the virtual button in the OA interface, I still have my Ethernet connections, but the VCM web interface is only available with the factory credentials, and after logging in I get the configuration wizard, as if it were the first time.
What are your firmware versions?
Kenneth van Ditmarsch
/ April 19, 2010
Hi LG,
We were using 2.10 at that time.
Cheers,
Kenneth
Suhail
/ May 9, 2010
This information helped me resolve my VC issues. I followed Gunnar’s suggestion of resetting the bays one at a time. Thanks, Kenneth, for sharing the information.
Shadders
/ May 28, 2010
Hi, all this information is really helpful, but I can’t find a 100% answer to my question: if I use vcutil -a resetvcm -b 1 and then vcutil -a resetvcm -b 2 (I have 2x Flex-10 modules) to regain access to the VC modules, will the blades lose network connectivity? Some people say yes, others say no. I just need to be clear. Does it restart the internal web service on the VC module, or reset the whole module?
Any help would be great!
Thanks
Steve.
Kenneth van Ditmarsch
/ May 28, 2010
Unfortunately I can’t answer that with 100% certainty (I’m not at that site anymore). However, I did send an e-mail to the HP technical guy who helped me out last year. As soon as I get a response I’ll let you know.
Kenneth van Ditmarsch
/ May 28, 2010
Hi Steve,
I’ve contacted the HP guy again and this is his answer. Hopefully you are satisfied with it 🙂
This command only changes which Virtual Connect Ethernet module is hosting the Virtual Connect Manager. The feature can also force the Virtual Connect Manager to restart without switching to the alternate Virtual Connect Ethernet module (in a configuration with only one VC Ethernet module). This feature can be useful when troubleshooting the Virtual Connect Manager.
The network and FC processing of the Virtual Connect subsystem is not disturbed during the restart or failover of the Virtual Connect Manager.
Gareth
/ August 11, 2010
Hi Kenneth,
Slightly off-topic, but I saw your excellent posts on Flex-10, and the experiences you document are very much in line with my own. I am interested to know whether you have tested iSCSI performance with Flex-10. I have been doing some very extensive testing and am running into significant performance problems with iSCSI traffic, while other TCP traffic seems fine.
I won’t go into detail here, but am keen to compare notes if this is an area you have looked at. You have my email.
All the best!
Gareth
Kenneth van Ditmarsch
/ August 11, 2010
Hi Gareth,
We did some iSCSI performance testing, but the bottleneck in this environment was the LeftHand nodes. Everything was designed for full 10 Gb throughput except for the LeftHand nodes, since they only had 2x 1 Gb NICs, which were configured for NLB.
Our IOmeter tests showed that we were able to write to iSCSI as fast as the network allowed.
We also tested with Jumbo Frames enabled, but we didn’t notice any difference in performance. There are two possible reasons for this:
1) The HP Broadcom mezzanine cards didn’t support Jumbo Frames (only up to ~4196, if I recall correctly); the HP customer advisory documents on that were released after our tests took place
2) We were limited by the LeftHand NICs and weren’t fully pushing the 10 Gb
What are your specific details on the performance problem?
Gareth
/ August 12, 2010
Hi Kenneth,
Our main problem is with Linux performance. For example, we have found that running Windows 2003 on a blade with Flex-10 yields OK performance, though still less than we get from a stand-alone server outside the blade environment connected to the same switches and SAN (Cisco Nexus 5020/Compellent), whereas Linux-based systems yield almost unusable performance.
Some example IOmeter results:
Win2003 (physical host)
4k random reads – single path- 1 worker thread
4551 IOPS
RedHat Ent 5.5 (physical)
4k random reads – single path- 1 worker thread
265 IOPS
As you can see, the IOPS on Linux is bad, even by 1Gb standards. No combination of drivers/firmware has improved on this so far.
Did you do your IOmeter test from inside a guest on ESX? If so, was the iSCSI LUN mounted directly to the guest (as opposed to a vmdk)?
jim
/ March 2, 2011
I just ran into a similar problem that HP has an advisory for. The fix is to remove any configured DNS servers in the IP config of the VC modules.
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c02720395
Dan
/ September 21, 2015
I ran into a similar issue while attempting to update the FW on the OA, VC, and FC modules. The FW updated well using the HP Update Utility, but for some reason the primary VC module FW was NOT updated. So I attempted multiple things, including resetting, powering down and up, etc. But each time I ran the vcsu.exe command to update, it said the domain configuration was not in sync. Okay, so I ran a health check command, and the output told me that the domain configuration was in sync. Scratch head... So I powered the primary Virtual Connect module down, waited about 5 minutes, and logged out of the VC as well as the OA. I then logged back into the OA and noticed that, as I had hoped and as should have happened, the VC had failed over to the secondary VC module. I then ran the vcsu command to update the version, and this time it found the only module (primary VC) that had not been updated originally. The HP utility prompted me whether I would like to update the module. I chose YES, of course. Note that I am working with the HP 1/10Gb VC-Enet Module and only updating to the supported version of 3.60. Just FYI.