I recently migrated a few vSphere environments with external PSCs to vSphere 7 and noticed some PSC leftovers afterwards. A lot has already been written about the deprecation of the external Platform Services Controller (PSC) deployment model, so I will not go into that here.
What I do want to highlight is the importance of the decommission order for the PSCs and how you can check whether stale records remain under the hood.
Let's start with some key commands:
Which PSC is vCenter pointing to:
/usr/lib/vmware-vmafd/bin/vmafd-cli get-ls-location --server-name localhost
Show PSC replication partners (showpartners parameter):
/usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartners -h localhost -u administrator -w <password>
Show all PSC servers (showservers parameter):
/usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers -h localhost -u administrator -w <password>
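To make the showservers check repeatable, a small shell helper can compare the returned list against the nodes you expect to remain. This is only a sketch: the function name unexpected_servers is made up for illustration, and it assumes each line of showservers output starts with cn=<hostname>,cn=Servers,... which you should verify on your own build before relying on it.

```shell
#!/bin/sh
# Sketch: report servers from `vdcrepadmin -f showservers` output that are
# NOT in the list of nodes you expect to remain.
# Assumption: each server line looks like "cn=<fqdn>,cn=Servers,cn=...".

# Reads showservers output on stdin; arguments are the expected hostnames.
unexpected_servers() {
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip anything that does not look like a server entry.
        case "$line" in
            cn=*,cn=Servers,*) ;;
            *) continue ;;
        esac
        host=${line#cn=}     # drop the leading "cn="
        host=${host%%,*}     # keep everything before the first comma
        host_lc=$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')
        found=0
        for expected in "$@"; do
            expected_lc=$(printf '%s' "$expected" | tr '[:upper:]' '[:lower:]')
            if [ "$host_lc" = "$expected_lc" ]; then
                found=1
                break
            fi
        done
        # Print only the hosts that were not expected.
        if [ "$found" -eq 0 ]; then
            printf '%s\n' "$host"
        fi
    done
}

# Example on a VCSA (hostnames are placeholders):
# /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers -h localhost \
#     -u administrator -w '<password>' \
#     | unexpected_servers vcsa7-01.domain.lan vcsa7-02.domain.lan
```

An empty result means only the expected nodes are listed; any hostname printed is a candidate leftover to investigate.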
Now let me elaborate a bit more on one of my recent migrations.
The environment consisted of:
- 2x VCSA 6.7 U3 (Enhanced Linked Mode)
- 4x External PSC
- 2x External Load Balancer
Steps taken in the process:
- Upgrade the first VCSA6.7 to VCSA7 and within Phase 2 you select “This is the first vCenter Server in the topology that I want to converge”.
- After successful migration and logon* to the new VCSA7 you will notice that Linked Mode is still active between VCSA7 and VCSA6.7.
(*) In my case the previous PSCs were joined to the AD domain (Integrated Windows Authentication) and were facilitating SSO. However, before the migration the VCSA6.7 was not joined to AD, so I had to manually join the upgraded VCSA7 to AD to re-enable SSO:
/opt/likewise/bin/domainjoin-cli join <domain> <username>
- Upgrade the second VCSA6.7 to VCSA7 and within Phase 2 you select “Subsequent vCenter Server” and point it to the first VCSA7.
- After successful migration (needed to add this VCSA7 to AD as well) you will notice the following components within the System Configuration:
- Now it is important to decommission the remaining PSCs in the correct order using cmsso-util. I will explain this step in more detail below, so continue reading.
What I initially did was check which PSCs were replicating with both VCSA7 appliances, using the showpartners parameter (described above).
The output was as follows:
VCSA7-01 connected to:
VCSA7-02
PSC67-01
VCSA7-02 connected to:
VCSA7-01
So it is clear that my new VCSA7s are replicating with each other and that one of them is also linked to one of the old PSCs. Following the cmsso-util instructions, I powered down PSC67-01 and ran the unregister command:
cmsso-util unregister --node-pnid PSC67-01 --username administrator@vsphere.local --passwd 'password'
Now the showpartners parameter only shows replication between the VCSA7s. The showservers parameter, however, still listed 3 of the 4 old PSCs.
I contacted VMware Support about this, as I could not get them removed with cmsso-util. Apparently this happens quite often and, long story short, they pointed me to the following command to remove the remaining PSCs from the list:
vdcleavefed -h <psc.domain.lan> -u Administrator -w <password>
This worked for me, and the showservers parameter now only shows two servers (the VCSA7s). All good, right? Well, not completely.
The following commands can be executed to find out how many component registrations exist under the hood.
vCenter 7:
/usr/lib/vmware-lookupsvc/tools/lstool.py list --url https://localhost/lookupservice/sdk --no-check-cert >/tmp/psc.txt
grep -i 'service type:' /tmp/psc.txt |sort |uniq -c
vCenter 6.x:
/usr/lib/vmidentity/tools/scripts/lstool.py list --url https://localhost/lookupservice/sdk --no-check-cert >/tmp/psc.txt
grep -i 'service type:' /tmp/psc.txt |sort |uniq -c
The output looks similar to this:
5 Service Type: applmgmt
2 Service Type: certificateauthority
5 Service Type: certificatemanagement
etc..
This basically tells me that I have 5 registrations (for most services).
In my case, post-migration, these were the 2 VCSA7s and the 3 remaining external PSCs, despite the vdcleavefed command that ran earlier.
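If you want to spot the suspicious counts without reading the whole list, the same grep/sort/uniq pipeline can be wrapped in a small filter. The sketch below is an illustration only: the function name overregistered_services is made up, and it assumes that no service type should be registered more often than the number of nodes you expect to remain (2 in my case). Some services may legitimately register a different number of times, so treat the output as a hint, not a verdict.

```shell
#!/bin/sh
# Sketch: flag service types that are registered more often than expected.
# Reads lstool.py output on stdin; $1 is the number of nodes you expect.
# Prints "<count> <service type>" for every service type above that number.
overregistered_services() {
    expected_nodes=$1
    grep -i 'service type:' | sort | uniq -c |
        awk -v max="$expected_nodes" '$1 > max { print $1, $NF }'
}

# Example, using the file created by the lstool.py command above:
# overregistered_services 2 < /tmp/psc.txt
```

With the sample output shown above and an expected count of 2, this would list applmgmt and certificatemanagement with 5 registrations each, and leave certificateauthority out.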
Now the good news is that these stale records can be corrected by VMware Support; the bad news is that this takes some time, as you need to go through each service.
Let's rewind, as I wanted to know how to get things migrated without the stale records appearing. What I noticed after the upgrade of both VCSAs to 7.0 is that the PSC replication somehow got disturbed.
Looking at the topology I found this to be the case after migration:
So I reran the complete migration with one difference: the order in which I used cmsso-util.
Steps taken:
- Power off PSC67-04
- From PSC67-03: cmsso-util unregister PSC67-04
- Power off PSC67-03
- From PSC67-01: cmsso-util unregister PSC67-03
- From PSC67-01: cmsso-util unregister PSC67-02
- Power off PSC67-01
- From VCSA7-01: cmsso-util unregister PSC67-01
Going through the process in this order, everything was cleaned up successfully and no stale records appeared.
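Because the order matters, it can help to write the plan down before touching anything. The sketch below only prints the steps and executes nothing. The function name print_decommission_plan and the "from:target" argument format are made up for illustration; the cmsso-util options are the ones used in the unregister command earlier in this post, with the password left as a placeholder.

```shell
#!/bin/sh
# Sketch: print a decommission plan as a dry run. Nothing is executed.
# Each argument is "<node-to-run-from>:<psc-to-remove>", in processing order.
print_decommission_plan() {
    step=1
    for pair in "$@"; do
        from=${pair%%:*}     # node on which cmsso-util is run
        target=${pair#*:}    # PSC that gets unregistered
        printf "%d. Power off %s (if still running), then on %s run: cmsso-util unregister --node-pnid %s --username administrator@vsphere.local --passwd '<password>'\n" \
            "$step" "$target" "$from" "$target"
        step=$((step + 1))
    done
}

# The order that worked in my second attempt:
# print_decommission_plan PSC67-03:PSC67-04 PSC67-01:PSC67-03 \
#     PSC67-01:PSC67-02 VCSA7-01:PSC67-01
```

Checking the printed plan against the output of showpartners before you start makes it easier to confirm that each PSC is unregistered from a node that is still online at that point.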
So what about these stale records? It may well be that during normal operations you never notice them and everything works correctly. However, you will run into issues with future upgrades.