I ran into an unusual incident while troubleshooting an MS DTC communication issue between SQL Server instances. A client of ours requested assistance fixing a distributed query that uses MS DTC. Apparently, the two SQL Server instances were not communicating. My usual round of troubleshooting started with a series of network connectivity tests, from PING to TELNET to NETSTAT to whatever else was necessary to make sure that communication between the servers was working fine. That led me to look for ways to check connectivity specifically for MS DTC. One tool from Microsoft is DTCPing, a utility that helps troubleshoot MS DTC firewall issues. Although I knew for a fact that the firewall was not the issue in this particular case, I decided to give it a shot. Running the DTCPing utility on both servers gave me this error message in the log:
WARNING:the CID values for both test machines are the same
A quick Google search led me to this blog post and made me think that the servers might have been cloned. Sure enough, when I asked the customer about the history of the servers, it turned out they were indeed cloned VMware images. Nobody had run Sysprep to prepare the images after the cloning process, which is why both machines ended up with the same CID values. There's nothing wrong with VMware here. It's just the process that was pretty screwed up. What are the chances of two machines having the same GUID values, which are supposed to be globally unique across the enterprise? Very slim, unless the machines were inappropriately cloned.
I followed the steps outlined in the blog post to fix the CID values:
- Use Add/Remove Windows Components to remove Network DTC.
- Run MSDTC -uninstall from the command line
- Delete the MSDTC keys in the registry
  HKLM\System\CurrentControlSet\Services\MSDTC
  HKEY_CLASSES_ROOT\CID
- Reboot the server
- Run MSDTC -install from the command line
- Use Add/Remove Windows Components to add the Network DTC back.
- Restart the Distributed Transaction Coordinator service
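
The command-line parts of the steps above could be scripted. What follows is only a minimal sketch in Python, not something taken from the blog post I followed. The function names and phase names are made up for illustration, and I am assuming that `reg delete ... /f` and `net stop` / `net start msdtc` are acceptable stand-ins for deleting the keys by hand and restarting the service from the Services console. The Add/Remove Windows Components steps and the reboot still have to be done manually between phases. By default the script only prints the commands (a dry run), since deleting registry keys is destructive.

```python
import subprocess

# Registry keys named in the steps above. HKCR is reg.exe's abbreviation
# for HKEY_CLASSES_ROOT.
MSDTC_KEYS = [
    r"HKLM\System\CurrentControlSet\Services\MSDTC",
    r"HKCR\CID",
]


def build_plan(phase):
    """Return the commands for one phase, each as a list of arguments.

    The phase names are hypothetical labels for this sketch:
      "before_reboot"   - uninstall MSDTC and delete its registry keys
      "after_reboot"    - reinstall MSDTC
      "restart_service" - restart the Distributed Transaction Coordinator
    """
    if phase == "before_reboot":
        plan = [["msdtc", "-uninstall"]]
        plan += [["reg", "delete", key, "/f"] for key in MSDTC_KEYS]
        return plan
    if phase == "after_reboot":
        return [["msdtc", "-install"]]
    if phase == "restart_service":
        return [["net", "stop", "msdtc"], ["net", "start", "msdtc"]]
    raise ValueError(f"unknown phase: {phase!r}")


def run_plan(phase, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_plan(phase):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: show what would be executed before the reboot.
    run_plan("before_reboot")
```

To actually apply a phase, call `run_plan("before_reboot", dry_run=False)` from an elevated prompt on the affected server, and back up the registry first. Note that `net stop` may return a non-zero exit code if the service is not running, in which case the script stops and only the `net start` command is needed.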