Friday, September 30, 2011

Outlook and Global Catalogs

We had an interesting issue recently where Windows clients at a branch office site were getting global catalog services from a domain controller in a remote site. In our environment, we're running Active Directory and at the moment are still at forest level 2003 R2, although over 90% of our DCs are Windows 2008 R2. We have close to 300 sites arranged in your traditional hub-and-spoke replication topology, and we've put two domain controllers at 80% of the sites to ensure that if one DC goes down there is still a DC/GC available locally.

Anyway, at this one site, the users sporadically went for GC services to a hub DC instead of one of their two local ones.

We checked all the usual suspects, including:
  • Was the client in a subnet that was inadvertently mapped to the wrong site, overlapped other mappings or not mapped at all in Active Directory Sites and Services? (That's the most common answer.)
  • Had an over-zealous admin hacked the registry and hardcoded the closest GC setting? (See Microsoft KB 319206.)
  • Were over-zealous network admins blocking ports again? (GC services answer on port 3268, or on 3269 for secure LDAP.)
All the normal answers were "no," and it was working most of the time, just not all the time.
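
For anyone who wants to script that first check, here is a rough sketch of the subnet-to-site lookup. The subnets and site names are made up for illustration (they are not from our environment), and I'm assuming the client is matched to the most specific subnet that contains its address.

```python
import ipaddress

# Hypothetical subnet-to-site mappings, as you might copy them out of
# Active Directory Sites and Services. Names and ranges are invented.
SUBNET_TO_SITE = {
    "10.1.0.0/16": "Hub",
    "10.1.20.0/24": "Branch-A",
    "10.2.30.0/24": "Branch-B",
}


def site_for_client(ip, mappings=SUBNET_TO_SITE):
    """Return the site of the most specific subnet containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None  # (network, site) of the longest-prefix match so far
    for cidr, site in mappings.items():
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, site)
    return best[1] if best else None


def overlapping_subnets(mappings=SUBNET_TO_SITE):
    """List every pair of mapped subnets that overlap each other."""
    nets = [ipaddress.ip_network(cidr) for cidr in mappings]
    return [
        (str(a), str(b))
        for i, a in enumerate(nets)
        for b in nets[i + 1:]
        if a.overlaps(b)
    ]


print(site_for_client("10.1.20.5"))    # mapped to a branch subnet
print(site_for_client("192.168.5.5"))  # None means "not mapped at all"
print(overlapping_subnets())           # pairs worth a second look
```

A result of None is the "not mapped at all" case, and any pair returned by overlapping_subnets() is a candidate for the "overlapped other mappings" case.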

So I asked the site to do a network trace from a client the next time they had an issue when they opened Outlook. Bingo. You gotta love traces.

Looking at the trace, it took all of one minute to find the problem. Another minute to fix it.

I filtered the trace for DNS queries, and the first one I saw was a query for LDAP. The query was sent to the local DC.
Good so far.
Except that it returned information about the two local DC/GCs plus the remote DC/GC. Hmmm.

I opened DNS and drilled down to _tcp.Sitename._sites.gc._msdcs.parent.domain for the site having the issues. Sure enough, there was an _ldap record registered for the remote DC there. That's why, when clients asked for LDAP servers in the site, they had a 1 in 3 chance of getting that remote DC.

So I deleted the _ldap._tcp.Sitename._sites.gc._msdcs.parent.domain record from DNS so the remote DC/GC would no longer be offered as a good alternative.
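
The check I did by eye can be sketched in a few lines. This is only an illustration: the DC host names are hypothetical, the list of targets is typed in by hand rather than pulled from DNS, and the "1 in 3" figure assumes all the SRV records carry the same priority and weight.

```python
def gc_srv_name(site, forest_root):
    """Build the site-specific GC SRV name that clients look up."""
    return f"_ldap._tcp.{site}._sites.gc._msdcs.{forest_root}"


def _normalize(host):
    # DNS names are case-insensitive and may carry a trailing dot.
    return host.lower().rstrip(".")


def unexpected_targets(srv_targets, local_dcs):
    """Return SRV targets that are not among the site's own DCs."""
    local = {_normalize(name) for name in local_dcs}
    return [t for t in srv_targets if _normalize(t) not in local]


def chance_of_remote(srv_targets, local_dcs):
    """Fraction of answers that point off-site, assuming equal weighting."""
    if not srv_targets:
        return 0.0
    return len(unexpected_targets(srv_targets, local_dcs)) / len(srv_targets)


# Hypothetical host names; "Sitename" and "parent.domain" are placeholders.
name = gc_srv_name("Sitename", "parent.domain")
targets = ["dc1.parent.domain.", "dc2.parent.domain.", "hubdc.parent.domain."]
local = ["DC1.parent.domain", "DC2.parent.domain"]

print(name)
print(unexpected_targets(targets, local))
print(chance_of_remote(targets, local))
```

Anything returned by unexpected_targets() is a record to investigate before deleting it, since a DC that is legitimately covering the site would show up the same way.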

Why did the remote DC/GC get in there in the first place?

We have speculated that it's due to automatic site coverage and that on 6/6/11 (the date when the remote DC created its _ldap record in that zone) the two DC/GCs in that specific site were having network issues, were down, or were otherwise unable to cover their site. So the remote DC/GC stepped up to the plate and registered to cover the site since there were no other DC/GCs to cover it.

Please note that this is probably not a common issue, but it does happen. It's happened maybe 3 or 4 times to us in about 10 years.

Hope this helps someone....

Wednesday, April 6, 2011

Stuff I Should Know

This is more for home/small office folks who don't have separate IT staff. It's something I should have recognized much earlier in the game, but just didn't.

I've got quite a large network at home that required 8-10 CAT 5 cable connections and three wireless. Today, there are some wireless routers that incorporate 8 (possibly more) ports for wired connections, but I don't have one of those. I have the standard wireless router with 4 wired connection ports and had to supplement that with a second networking device (either another router or a switch--it doesn't matter, I've used both).

Now, being the idiot that I am, I just plugged everything up, did some elementary poking around the cute web pages to set the admin passwords to something private and set up WEP for security on the wireless. Then I thought I was done.

Silly me.

I forgot to look at the DHCP settings--you know, the service that hands out the IP addresses to the devices on your network so everything can talk to everything else. Actually, I did look at the DHCP and it looked good to me and I didn't think another thing of it.

All the devices got IP addresses. Everything could get to the Internet. I figured, well, good. Done.

Except, some devices could get to the wireless printer and others couldn't. My husband bugged the heck out of me, asking why he couldn't print. Some computers would show up in the HomeGroup and others wouldn't. If I shifted around the connections and wired up some of the computers to use the ports on the wireless router, they could use the wireless printer, but other devices on the network using the other switch (or router--my switch just burned up and I had to replace it temporarily with another router until my new switch arrives) couldn't print.

It was very aggravating, especially since I knew "the workaround" was to plug everything that needed access to the wireless devices to the ports on the wireless router. That should have clued me in sooner to the problem.

I looked again at DHCP, first on the switch and then on the wireless router.
The switch was using 192.168.0.1 to 192.168.0.254 with a mask of 255.255.255.0.
The wireless router was using 192.168.1.1 to 192.168.1.254 with a mask of 255.255.0.0.

So the devices getting IP addresses in the 192.168.0.0 network were essentially under the impression that devices in the 192.168.1.0 network were in a different network. An unroutable network.
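
Here's a quick sketch of why the problem only showed up in one direction. The two host addresses are ones I picked for illustration; only the ranges and masks come from my actual settings.

```python
import ipaddress


def sees_as_local(own_ip, own_mask, other_ip):
    """True if a host with own_ip/own_mask treats other_ip as on its own subnet."""
    iface = ipaddress.ip_interface(f"{own_ip}/{own_mask}")
    return ipaddress.ip_address(other_ip) in iface.network


# Hypothetical addresses: a PC that got its lease from the switch,
# and the wireless printer that got its lease from the wireless router.
pc_ip, pc_mask = "192.168.0.50", "255.255.255.0"
printer_ip, printer_mask = "192.168.1.60", "255.255.0.0"

# The PC's /24 mask puts the printer outside its network.
print(sees_as_local(pc_ip, pc_mask, printer_ip))

# The printer's /16 mask covers the PC's address.
print(sees_as_local(printer_ip, printer_mask, pc_ip))
```

The first check comes back False and the second True, which matches what I saw: devices with a lease from the switch couldn't reach the printer, while everything plugged into the wireless router worked.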

I changed the switch's IP to 192.168.1.x and corrected the mask, and then I split up the IPs to give DHCP on the switch the lower range to hand out while the wireless router got to keep the upper range of IP addresses to hand out so there would be no IP address conflicts.
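
If you want to work out the two ranges rather than eyeball them, something like this does it. The /24 network and the split point of 128 are assumptions for the example, not necessarily the exact values I used.

```python
import ipaddress


def split_dhcp_pool(cidr, split_at):
    """Split a network's usable hosts into two non-overlapping DHCP ranges.

    Hosts whose offset from the network address is below split_at go in the
    lower range; the rest go in the upper range. Returns two (first, last)
    tuples of address strings.
    """
    net = ipaddress.ip_network(cidr)
    base = int(net.network_address)
    hosts = list(net.hosts())
    lower = [h for h in hosts if int(h) - base < split_at]
    upper = [h for h in hosts if int(h) - base >= split_at]
    if not lower or not upper:
        raise ValueError("split_at leaves one of the ranges empty")
    return (str(lower[0]), str(lower[-1])), (str(upper[0]), str(upper[-1]))


# Assumed values: one shared /24, split down the middle.
switch_range, router_range = split_dhcp_pool("192.168.1.0/24", 128)
print("switch hands out:", switch_range)
print("wireless router hands out:", router_range)
```

Remember to keep each device's own address out of whichever range it falls in, or the DHCP server may hand out an address that's already taken.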

Now, we can all print and we all show up in the HomeGroup network.
This is...like...networking 101 but it shows how easy it is to overlook the basics.