Tuesday, September 29, 2009

RPC Server Unavailable and DNS

Resolving One Weird Case of RPC Server Unavailable

I've been battling other problems (like lingering objects when we've never done a restore from a backup tape or had any of our DCs offline for more than a couple of days) when I ran across one of those annoying "RPC Server Unavailable" messages from our replication monitoring.

I was already working with Microsoft on the whole "where are these lingering objects coming from" thing, so I dragged them into the muck with me.

I thought our DNS architecture was beautiful and bullet-proof. I was wrong.
Note: I've very much simplified our "architecture" in this example. In reality, we have 36 domains and around 582 domain controllers. (You don't want to know the real and extremely complex design.)

Symptoms
So we got the message "RPC Server unavailable" when DC2 tried to suck data from DC3. DC3 could suck data just fine from DC2.

Both DC2 and DC3 are members of an AD domain called weird.test.org. The DNS zone for weird.test.org is AD-integrated. There is one tree in the forest of test.org. All the DNS zones, in fact, are AD-integrated, and all the DCs also run DNS. All DNS servers are DCs.

DC1 is in test.org, the root domain of the AD tree. DC1 carries the test.org DNS zone and the DNS delegation for the weird.test.org DNS zone, which is delegated down to DC2 and DC3.

So you have:
DC1.test.org, carries the test.org DNS zone and delegation for weird.test.org
  • The _msdcs.test.org DNS zone also lives on DC1.
  • DC1 points to itself for DNS resolution.
  • DNS is configured to forward to some Internet gateway DNS servers for any zones it does not carry.
  • weird.test.org is delegated to DC2.weird.test.org and DC3.weird.test.org.
DC2.weird.test.org, carries weird.test.org DNS zone, and also carries the root domain zone: test.org
  • DC2 points to DC1 for DNS resolution.
  • DNS is configured to forward to DC1 for any zones it does not carry.
DC3.weird.test.org, carries weird.test.org DNS zone, and also carries the root domain zone: test.org
  • DC3 points to DC2 for DNS resolution.
  • DNS is configured to forward to DC2 for any zones it does not carry.

At one point, DC3 had been on old hardware in a different site. We demoted and booted out the old DC3 and brought up a new DC3 on new hardware with a different IP address. This was several months before this incident.

Testing
When we pinged DC3 from DC1, it pinged fine using the fully qualified domain name, i.e. DC3.weird.test.org. But when we pinged the GUID _msdcs record, e.g. abe33cdf-edd2-3456-abc1-acbd12398745._msdcs.test.org, we got the wrong IP. WTF?

Troubleshooting
I went through every DNS server listed on the delegation for the weird.test.org zone, plus every other DNS server with a copy of the zone, and checked the host A records for DC3 on each one.

All the DNS servers had the right data. The Host A records were correct. The PTR records were correct. There were no duplicate Host A records. There were no duplicate PTR records.

We cleared all caches (cleared the DNS server cache and did an ipconfig /flushDNS).

We did a network trace to track who was the culprit with the bad IP information for DC3.weird.test.org. The trace showed that the name resolution process never left DC1, even though DC1 was not authoritative for the weird.test.org zone and therefore did not hold the DC3.weird.test.org Host A record it would need to come up with an IP for DC3.

Weird Discovery

Because DC2 uses DC1 for DNS resolution, it asked DC1 for resolution of abe33cdf-edd2-3456-abc1-acbd12398745._msdcs.test.org.

The _msdcs.test.org DNS zone contains CNAME records for the domain controllers (DCs), so abe33cdf-edd2-3456-abc1-acbd12398745._msdcs.test.org is a CNAME that resolves to: DC3.weird.test.org.

DC1 therefore looked for a delegation to weird.test.org.
DC1 found the delegation to weird.test.org. But it also found the name it was looking for on the delegation.

"A-ha!" DC1 said. "I don't need to actually query any of the authoritative DNS servers listed on the delegation. I have the IP right here on the delegation itself."

So DC1 returned the IP listed on the delegation rather than asking DC2 or DC3 (the two DCs authoritative for the zone).

And unfortunately, because we had replaced DC3 and changed the IP at that time, this information on the delegation never got updated.
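
The lookup path is easier to see as a toy model. This is only a sketch in Python of the behavior we observed, not how the Windows DNS service is implemented. The dictionaries, the IP addresses, and the function name resolve_from_dc1 are all made up for illustration.

```python
# Toy model of the lookup DC1 performed. All names and IPs are illustrative.

# CNAME records in the _msdcs.test.org zone hosted on DC1.
MSDCS_CNAMES = {
    "abe33cdf-edd2-3456-abc1-acbd12398745._msdcs.test.org": "dc3.weird.test.org",
}

# Name server addresses stored on DC1's delegation for weird.test.org.
# The entry for DC3 still holds the address of the retired hardware.
DELEGATION_GLUE = {
    "dc2.weird.test.org": "10.1.1.2",
    "dc3.weird.test.org": "10.9.9.3",  # stale (hypothetical old IP)
}

# What the servers authoritative for weird.test.org would have answered.
AUTHORITATIVE_A = {
    "dc2.weird.test.org": "10.1.1.2",
    "dc3.weird.test.org": "10.1.1.3",
}


def resolve_from_dc1(name):
    """Return (ip, source) the way DC1 behaved in this incident."""
    name = name.lower()
    # Follow the CNAME in _msdcs, if there is one.
    target = MSDCS_CNAMES.get(name, name)
    # If the target is a name server listed on the delegation, answer from
    # the address stored there and never ask the authoritative servers.
    if target in DELEGATION_GLUE:
        return DELEGATION_GLUE[target], "delegation"
    return AUTHORITATIVE_A.get(target), "authoritative"
```

Calling resolve_from_dc1 with the GUID name returns the stale address and the source "delegation", which is exactly the trap we fell into.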

Moral of the Story
It should have occurred to me that if DC1 needs to resolve the IP for a DC/DNS server listed on a delegation, it will just use the IP on the delegation and not actually send a request for resolution to the DC/DNS servers authoritative for the zone. It makes sense that it does that--why forward a query to the DC when you already have the IP? It was just unexpected.

I never thought to look at the delegation to find the bad IP DC1 was handing out.
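
If I had to hunt for this again, I would compare the addresses on each delegation against the real host A records. Here is a minimal Python sketch of that comparison. It assumes you have already collected both sets of data by hand (or from whatever export you trust) into plain dictionaries; find_stale_glue and the sample addresses are hypothetical.

```python
def find_stale_glue(delegation_glue, authoritative_a):
    """Return {name: (glue_ip, real_ip)} for every delegation entry whose
    stored address disagrees with the authoritative host A record."""
    stale = {}
    for name, glue_ip in delegation_glue.items():
        real_ip = authoritative_a.get(name)
        # Skip names we have no authoritative answer for.
        if real_ip is not None and real_ip != glue_ip:
            stale[name] = (glue_ip, real_ip)
    return stale


# Hypothetical sample data shaped like the incident in this post.
glue = {"dc2.weird.test.org": "10.1.1.2", "dc3.weird.test.org": "10.9.9.3"}
real = {"dc2.weird.test.org": "10.1.1.2", "dc3.weird.test.org": "10.1.1.3"}
stale_entries = find_stale_glue(glue, real)
```

With the sample data, only dc3.weird.test.org is reported, along with the old and new addresses.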

And it also embarrassed me, because I thought we had corrected that IP when we replaced DC3.

Would-a; could-a; should-a.
Oh, well. One more learning experience.

Tuesday, September 1, 2009

Tricked-out eeePC




Tricked-out eeePC 901

Insane or not, I recently spent a fun-filled weekend maximizing the potential of my ASUS eeePC 901 (or netbook, if you like that term). What did I do?
First, I bought some really cool skins (aka decals) as you see. They seemed so apropos since in my admittedly scarce spare time, I also write mysteries. My eeePC is resting on top of my "normal" Dell laptop so you can see the size comparison.
Well, let's see, what else did I do:

Hardware Upgrades
  • Memory upgrade: sTs Electronics ASUS eeePC 901 2GB RAM memory upgrade replaced the wimpy 1GB the system came with.
  • Solid-State Drive Upgrade: 32GB SaberTooth-SS SATA Mini PCIe SSD for 901 eeePC upgrade to replace the tiny 16GB SSD installed in the system (the system says it has 20GB, but it's divided between a 4GB and 16GB SSD, so I replaced the 16GB with the 32GB). I would have gone for 64GB, but it was beyond my means at the moment.

  • Additional Storage: Transcend 16GB SDHC Class 6 Flash Memory card with card Reader. (I had evil designs for this extra bit of storage).
  • Additional Storage #2: I already had an extra 1GB USB stick lying around...
Method (to my madness)
Once I got all my bits and pieces (or is that bytes and pieces? ;-) ) I...
  1. Updated the BIOS--this is a really important step because the system wouldn't recognize the new SSD until I did this.
  2. Installed all the new hardware.
  3. Working from my main desktop PC, followed the directions at the blog Install Windows XP on the Asus EEEPC to put Windows XP on the 1GB USB stick and make the stick bootable.
  4. Turned on the system and went into the BIOS (press F2 as the system boots) and told it to boot to the 1GB USB stick.
  5. Booted to the 1GB USB stick and initiated the installation of Windows XP
  6. Installed XP on the 4GB partition, overlaying the old OS
  7. Created a D: drive out of the 32GB SSD and formatted it for NTFS
  8. Copied the installation files for Windows 7 from my desktop PC to the Transcend 16GB Flash memory card (formatted for NTFS).
  9. Stuck the 16GB Flash memory card into the eeePC and initiated the Windows 7 install.
  10. Installed Windows 7 to the D: drive, leaving XP on the 4GB partition (C:) as a failsafe in case I ever get a bunch of money and can replace the 32GB SSD with a 64GB SSD :-). This makes it easier for future upgrades.
  11. Whoopee! I now have Windows 7 (with Aero glass) running on my eeePC and I'm happy as a clam.


Windows 7 Running on an eeePC 901


And the lovely, Aero-Glass Windows 7 desktop. Yep. It works.

Wednesday, August 26, 2009

Overcoming the Stupidity of Others

Here’s one for you. I spent a good twenty minutes on this.

Setting the Scene
We’ve been trying to prepare our AD forest for the deployment of Windows 2008, so of course, we’ve run the schema extensions, etc. We got to the domain prep steps and lo and behold, /gpprep failed in 10 of our domains because some admins, in their infinite wisdom, did wild and crazy things with permissions.
So I’m having to fix the permissions on 240 GPOs in order to complete our prep work.

GPO Clunker
One domain admin, in a fit of brilliance, set DENY ACLS on some GPOs for NT AUTHORITY\Authenticated Users. Isn’t that nice? The net effect is a clunker GPO that shows up in adsiedit.msc as a little notepad icon instead of the normal folder icon.
And I couldn’t get access to it to fix it through normal means. Fun, fun, fun.
The Fix
The real fix is to remove this joker admin from Domain Admins. However, that isn’t always practical. So all I could do was this:

1) Find the Problem: First, find the problem. Because it isn’t always a DENY for NT AUTHORITY\Authenticated Users. Sometimes other default permissions are diddled and/or removed, which prevents things like replication from happening. I had another joker remove the SYSTEM and Enterprise Domain Controllers permissions, causing inconsistencies in SYSVOL, etc, because the GPO couldn’t be replicated. There are about 5 default permissions that I hate to see people diddle with, including those for Domain Admins, SYSTEM, Enterprise Domain Controllers, etc.

Anyway, to find the problem, dump the ACLS using dsacls and piping to a file, for example:

dsacls "CN={12345678-1234-1234-123456789012},CN=Policies,CN=System,DC=mydom,DC=myorg,DC=top" > badgpo.txt

(Of course, you will substitute your troublesome GPO's GUID in place of my phoney-baloney one, above.)

2) Review badgpo.txt: Review the contents of the output file to look for any DENY and/or missing ACLS that prevent you from managing the GPO.
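
If you have a pile of these dumps to wade through, a few lines of Python can pull out the suspicious entries for you. This is a rough sketch, and it assumes each permission line in the dsacls output starts with the word Allow or Deny. Check that against your own badgpo.txt before trusting it. The function name find_deny_entries and the sample text are mine, not something dsacls provides.

```python
def find_deny_entries(dsacls_output):
    """Return the lines of a dsacls dump that look like Deny entries.

    Assumption: permission lines begin with 'Allow' or 'Deny'.
    """
    hits = []
    for line in dsacls_output.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("deny "):
            hits.append(stripped)
    return hits


# Hypothetical excerpt shaped like a dsacls dump.
sample = """\
Access list:
Allow NT AUTHORITY\\SYSTEM    FULL CONTROL
Deny  NT AUTHORITY\\Authenticated Users    SPECIAL ACCESS
Allow MYDOM\\Domain Admins    SPECIAL ACCESS
"""
deny_lines = find_deny_entries(sample)
```

To run it against a real dump, read the file first, for example find_deny_entries(open("badgpo.txt").read()). It will not spot missing permissions; you still have to eyeball those.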


3) Reset the ACLS: Once you figure out what’s wrong, then you’ll have to craft a dsacls to put back what is missing/wrong. In my case, where the admin had put in a DENY for NT AUTHORITY\Authenticated Users, I had to remove the DENY. All the other permissions were ok. But because DENY permissions are always evaluated first, even an enterprise admin like moi could not do squat with the GPO.

Here are the steps to fix this particular situation:
a. Log on to the PDC-emulator role holder using an account in Domain Admins
b. Run the following DSACLS (inserting your own GPO GUID info, of course):

dsacls "CN={12345678-1234-1234-123456789012},CN=Policies,CN=System,DC=mydom,DC=myorg,DC=top" /R "NT AUTHORITY\Authenticated Users"

This command will remove the permissions for Authenticated Users.

But wait! You might say, won’t you then lose the ability to mess with the GPO? No. Because I checked the DSACLS output before I did this and saw that Domain Admins still had permissions—it’s just that the permissions were unusable because of the DENY on Authenticated Users.
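
Here is a tiny Python model of why the DENY locked out even Domain Admins, and why removing the Authenticated Users entries gave control back. It is deliberately simplified: real Windows access checks deal with access masks, inheritance, and entry ordering, none of which is modeled here. The function is_allowed and the sample entries are hypothetical.

```python
def is_allowed(acl, user_groups, right):
    """acl is a list of (kind, trustee, right) tuples; kind is 'deny' or 'allow'.

    Simplified rule: a matching deny always beats any matching allow.
    """
    # Pass 1: any matching deny wins outright.
    for kind, trustee, ace_right in acl:
        if kind == "deny" and trustee in user_groups and ace_right == right:
            return False
    # Pass 2: otherwise look for a matching allow.
    for kind, trustee, ace_right in acl:
        if kind == "allow" and trustee in user_groups and ace_right == right:
            return True
    return False


# An admin is a member of both groups, like every logged-on account.
admin_groups = {"Authenticated Users", "Domain Admins"}

broken_acl = [
    ("deny", "Authenticated Users", "write"),
    ("allow", "Domain Admins", "write"),
]
# Equivalent of removing every entry for Authenticated Users.
fixed_acl = [ace for ace in broken_acl if ace[1] != "Authenticated Users"]

before = is_allowed(broken_acl, admin_groups, "write")
after = is_allowed(fixed_acl, admin_groups, "write")
```

In this model, before is False and after is True: the Domain Admins permission was there all along, it just could not be reached past the DENY.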

So much for that problem.

Oops--maybe not.

There are some things you should know about this, because the situation is still fraught with gotchas.

You may or may not know that you should never set permissions outside of the Group Policy Management console (GPMC) on group policy objects because GPOs have several components split between the file system under SYSVOL (replicated by NTFRS) and those maintained in the System container in AD. Those two sets of permissions must be kept in sync. The GPMC does that.

But wait! you might say, didn't you just set them in AD only? Yes--but I had to. And now comes the worst task. You have to go back into the GPMC, select the GPO you just mangled and select Edit. If luck favors you, the GPMC will spy the mangling and pop up a box indicating that it found different sets of permissions on the pieces in SYSVOL versus those in AD. It will ask Do you want me to correct this?

The appropriate response is: Yes. We very much want GPMC to fix this. Thank you very much.

Then, you can take any final steps necessary to ensure the permissions on the policy are correct. And if you take my advice, you will ensure all your GPOs have the following set of default permissions always left intact. It will reduce your future pain and suffering 10-fold.

Not only is this the default permission set, but I view the permissions as a failsafe set that will (hopefully) let me bludgeon my way into a GPO if I have to, after an anal admin has messed it up. Yes, there are tons of arguments about removing these permissions, but frankly, I'm not interested. These are a failsafe and the defaults.

Authenticated Users -- Read, Apply (You can remove the Apply if you want to control who/what gets the GPO via granting another security group the Apply permission. But please leave Read. This is a failsafe and for your sanity and protection.)

SYSTEM -- Read, Write, Create all child objects, Delete all child objects

Enterprise Domain Controllers -- Read

Domain Admins -- Read, Write, Create all child objects, Delete all child objects

Enterprise Admins -- Read, Write, Create all child objects, Delete all child objects
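
If you want to audit a batch of GPOs against this list, the check itself is simple set arithmetic. Below is a minimal Python sketch. It assumes you have already gathered each GPO's permissions into a dictionary of trustee name to a set of rights (how you gather them is up to you), and the names DEFAULT_GPO_PERMISSIONS and missing_defaults are just illustrative.

```python
FULL = {"Read", "Write", "Create all child objects", "Delete all child objects"}

# The failsafe set listed above. Apply is left out for Authenticated Users
# because the post treats it as optional.
DEFAULT_GPO_PERMISSIONS = {
    "Authenticated Users": {"Read"},
    "SYSTEM": FULL,
    "Enterprise Domain Controllers": {"Read"},
    "Domain Admins": FULL,
    "Enterprise Admins": FULL,
}


def missing_defaults(actual):
    """Return {trustee: missing_rights} for every default that is absent."""
    missing = {}
    for trustee, needed in DEFAULT_GPO_PERMISSIONS.items():
        gap = needed - actual.get(trustee, set())
        if gap:
            missing[trustee] = gap
    return missing


# Hypothetical GPO where only one admin's ID has any permissions left.
wrecked = {"MYDOM\\jdoe": FULL}
report = missing_defaults(wrecked)
```

For the wrecked example, all five default trustees come back as missing. A GPO that still has the full default set returns an empty dictionary.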

Believe it or not, I actually had one yahoo remove System and Enterprise Domain Controllers, Domain Admins and Enterprise Admins permissions. It wreaked havoc with NTFRS replication and was nearly impossible to fix. The only thing left with any permissions was his ID. All I could do was reset the password on his ID to hijack it and then go in and reset these defaults.

Bad thing to do, security-wise? Yes. And it was fortunate (in one respect) that he was still with our organization. If he had been gone and his ID deleted already...well...that would have been mighty interesting.

That's why you need the default permissions--they are a failsafe. You don't want to get caught with "junk" on your domain controllers and in AD that can't replicate and causes inconsistencies in SYSVOL (among other things).

So...that's the end of this post!

Wednesday, June 17, 2009

VBScripts to Know and Love

I get asked for one thing at least three or four times a week: Can you give me a report on my users' last logon?

Why yes, I can, if you don't mind ±14 days accuracy. Generally, that's good enough, so you can run a report using the user's lastlogontimestamp. That value is replicated, so we can use it. For more details about this attribute, see the Directory Services Team blog.

Given this bright, shiny lastlogontimestamp, it is relatively easy to write a script to extract the information and squirt it out into a text file (tab delimited so you can feed it into Excel nicely). There are some weird things you have to do to the attribute, though, to get nice output. That's why I'm including the vbscript.

There is also a weird thing that you have to do if you want to include the user's Description attribute in your report.

So here it is.
You run it from the command prompt, i.e. cscript userlastlogon.vbs
It pops up three dialog boxes.

BOX 1 - Enter the OU/domain info, as you would for any LDAP query, e.g.:
OU=users,DC=mydomain,DC=com

BOX 2 - Enter the FQDN for the domain controller you want to use, e.g.:
DomControl1.mydomain.com

BOX 3 - Enter the name of the output file, e.g.:
userlogons.txt

It will then spew out the sAMAccountName of the accounts as it processes them, to give you warm fuzzies that it is doing something while it creates the output file you asked for, e.g. userlogons.txt.

The output file will include: name, sAMAccountName, mail, description, and the last logon date (calculated from lastLogonTimeStamp).

Note two things:
  1. the Description attribute is a multi-valued attribute, so it's handled as an array.
  2. the LastLogonTimeStamp has to be manipulated to give you a human-readable date.
Enough said. Here is the code for your viewing pleasure:

'ADSI search scope values: 0 = base, 1 = one level, 2 = subtree
Const ADS_SCOPE_SUBTREE = 2

'Get domain and DC info
strdomain = INPUTBOX("Please enter the domain context, e.g. DC=mydomain,DC=com: ")
strDC = INPUTBOX("Please enter the FQDN for the DC to connect to: ")
strFile = INPUTBOX("Please enter the output file name: ")

If strdomain="" THEN wscript.quit
If strDC="" THEN Wscript.quit

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.CreateTextFile(strFile, True)
Set objConnection = CreateObject("ADODB.Connection")
Set objCommand = CreateObject("ADODB.Command")
objConnection.Provider = "ADsDSOObject"
objConnection.Open "Active Directory Provider"
Set objCOmmand.ActiveConnection = objConnection

strCommand = _
"Select name, sAMAccountName, lastLogonTimeStamp, userAccountControl, mail, description from 'LDAP://" & strDC & "/" & strdomain &"' where userAccountcontrol = '512'"

objCommand.CommandText = strCommand
objCommand.Properties("Page Size") = 1000
objCommand.Properties("Timeout") = 100
objCommand.Properties("Searchscope") = ADS_SCOPE_SUBTREE
objCommand.Properties("Cache Results") = False
Set objRecordSet = objCommand.Execute
objRecordSet.MoveFirst

Do Until objRecordSet.EOF
usrSam = objRecordSet.Fields("sAMAccountName").Value

usrn = objRecordSet.Fields("Name").Value

'Comparing against NULL with <> never evaluates to True, so test with IsNull
If IsNull(objRecordSet.Fields("mail").Value) THEN
usrmail = "No Mail"
ELSE
usrmail = objRecordSet.Fields("mail").Value
END IF

strDesc = objRecordSet.Fields("description").Value
If IsArray(strDesc) then
for i=0 to Ubound(strDesc)
usrdesc = usrdesc & " " & strDesc(i)
next
ELSE
usrdesc = "No Description"
END IF

wscript.echo usrSam

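On Error Resume Next
Err.Clear

'lastLogonTimeStamp comes back as a large-integer object; Set raises an error if the attribute is empty
set usrLastLogonTS = objRecordSet.Fields("lastLogonTimeStamp").Value
If Err.Number <> 0 THEN
objTextFile.WriteLine usrn & vbtab & usrsam & vbtab & usrmail & vbtab & usrdesc & vbtab & "No Date"
Err.Clear
ELSE
'LowPart is signed, so a negative value means HighPart must be bumped by one
lngHigh = usrLastLogonTS.HighPart
lngLow = usrLastLogonTS.LowPart
If lngLow < 0 THEN lngHigh = lngHigh + 1
'100-nanosecond ticks since 1/1/1601 (UTC) -> minutes -> days
intLastLogonTime = lngHigh * (2^32) + lngLow
intLastLogonTime = intLastLogonTime / (60 * 10000000)
intLastLogonTime = intLastLogonTime / 1440

objTextFile.WriteLine usrn & vbtab & usrsam & vbtab & usrmail & vbtab & usrdesc & vbtab & (intLastLogonTime + #1/1/1601#)
END IF
On Error GoTo 0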
usrdesc = ""
objRecordSet.MoveNext
Loop
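
For anyone who wants to sanity-check the date math outside of VBScript, here is the same conversion as a short Python sketch. The value is a count of 100-nanosecond ticks since January 1, 1601 (UTC), delivered as a signed high part and a signed low part. The function names are mine.

```python
from datetime import datetime, timedelta

# lastLogonTimeStamp counts 100-nanosecond ticks from this moment (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1)


def combine_parts(high_part, low_part):
    """Combine the two signed 32-bit halves into one tick count.

    Masking the low half treats it as unsigned, which has the same effect
    as adding one to the high half whenever the low half is negative.
    """
    return (high_part << 32) + (low_part & 0xFFFFFFFF)


def filetime_to_datetime(ticks):
    """Convert a tick count to a naive UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
```

A handy check value: 116444736000000000 ticks is midnight on January 1, 1970. The result is in UTC, so adjust for your time zone if your report needs local time.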

Wednesday, June 3, 2009

Zombie DNS PTR Records

Zombie DNS PTR Records: Or, The PTR Records That Will Not Die
Do you ever have this problem in your reverse, AD-integrated DNS zones?
  • You discover an old manual PTR record that conflicts with a new dynamic registration and you delete it
  • It comes back

Sure, you can also have that problem even if it doesn't conflict with a new dynamic registration--it's just that you don't normally find this issue until you're looking to see why you're having issues in the first place. And I won't go into all the problems you may have from duplicate PTR registrations.

Anyway...what do you do?

Every time you delete it, the stupid PTR just comes back.

Here's the trick I use. Let's say the problem record is PTR 10.20.30.40;
i.e. 40.30.20.10.in-addr.arpa.

  • Go into the DNS MMC and drill down to the PTR record.
  • Select the offending record, 40.30.20.10.in-addr.arpa and right-click, delete it
  • Select the parent zone, in this case 30; as in 30.20.10.in-addr.arpa
  • Right click on the parent zone and select to create a New Domain... from the pop-up menu
  • The name of the new domain is the name of the offending record, i.e. 40
  • Let this change replicate
  • Delete the new domain, i.e. right-click on 40 and delete it

That will permanently delete the offending PTR record and keep it from appearing again.
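
If you have a list of zombie IPs to work through, this little Python sketch spits out the pieces you need for the steps above: the record name, the parent to select, and the full PTR name. It assumes the reverse zone is split one octet per level, as in my example. The function name ptr_parts is made up.

```python
import ipaddress


def ptr_parts(ip):
    """Return (record_label, parent, full_ptr_name) for an IPv4 address.

    Assumes the record sits directly under a parent named for the first
    three octets, as in the example above.
    """
    full = ipaddress.ip_address(ip).reverse_pointer
    label, parent = full.split(".", 1)
    return label, parent, full


label, parent, full = ptr_parts("10.20.30.40")
```

For 10.20.30.40 this gives the label 40, the parent 30.20.10.in-addr.arpa, and the full name 40.30.20.10.in-addr.arpa.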

Hope this helps someone!

Amy

Thursday, May 28, 2009

AD-Integrated DNS Reverse Zones

I recently had reason to try to identify manual PTR records in DNS because of conflicts with new, dynamic registrations. While I try to discourage folks from manually "helping" (i.e. tinkering with) DNS, they often do it anyway, and it can cause problems. If you enter manual records, they are just that: manual. The system won't overwrite or scavenge them, so you are signing up to manage DNS, at least for those records, instead of letting DNS manage itself.

Anyway, when I started poking around, I realized the data was stored a little differently than I expected.

So let's take one example. I noticed we had four records for the same IP, 10.10.20.51. Three of the records were manual and 1 was dynamic. The dynamic one was the only "accurate" one. The rest were human-generated mistakes.

Note: For folks wondering about this, we had some inexperienced (?incompetent?) DNS admins who instantiated DNS zones without the proper delegations. I had to come back later and find all the duplicate DNS zones, ram them together because folks couldn't identify "good" records they needed to keep, create the proper delegations, and then shove the existing records back in. Unfortunately, this process creates a bunch of manual records. Sigh. But it was either that or lose all the records and no one would allow me to do that.

So, how do DNS zones look?

In DNS MMC, I see 4 records for 10.10.20.51, see below (sorry about the fuzziness).
If I use DNSCMD and /zoneexport I get the one record as shown below, with four values. Only one of them is dynamic (the one marked [AGE:]). The dynamic record is the only good one:
51.20 900 PTR test1.test.com.
      [AGE:3579987] 900 PTR test2.test.com.
      900 PTR test3.
      900 PTR test4.
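
If you need to sort through a whole export this way, the [AGE:] marker makes it easy to script. The Python sketch below splits PTR lines into dynamic and manual ones. It assumes the AGE number is a count of hours since January 1, 1601 (UTC); decoding the sample value that way lands on late May 2009, which matches when I pulled this export. The function name classify_ptr_lines is hypothetical.

```python
import re
from datetime import datetime, timedelta

AGE_RE = re.compile(r"\[AGE:(\d+)\]", re.IGNORECASE)


def classify_ptr_lines(lines):
    """Split PTR lines into (dynamic, manual) lists of (target, timestamp).

    Dynamic records carry an [AGE:n] marker; manual ones do not.
    Assumption: n is hours since 1601-01-01 UTC.
    """
    dynamic, manual = [], []
    for line in lines:
        fields = line.split()
        if "PTR" not in fields:
            continue  # not a PTR line
        target = fields[-1]
        match = AGE_RE.search(line)
        if match:
            stamp = datetime(1601, 1, 1) + timedelta(hours=int(match.group(1)))
            dynamic.append((target, stamp))
        else:
            manual.append((target, None))
    return dynamic, manual


export = [
    "51.20 900 PTR test1.test.com.",
    "[AGE:3579987] 900 PTR test2.test.com.",
    "900 PTR test3.",
    "900 PTR test4.",
]
dynamic, manual = classify_ptr_lines(export)
```

For the export above, only test2.test.com. ends up in the dynamic list, and the other three are flagged as manual.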

In ADSIEDIT.MSC, it stores the records similar to what you see in the zone export, where there is one record for DC=51.20 in DC=10.10.in-addr.arpa, where the 4 records are the values for the dnsRecord attribute:




The octet-string data stored in the dnsRecord attribute is the “name” you see on the DNS PTR “record”, but it is stored as a multi-valued attribute. The creation date is the date the *first* record was stored/created. So it depends on whether the dynamic record was first, or one of the manual records was first.
This information is more in the form of "interesting trivia", but it does help to understand what is going on.
Sincerely,
Amy

Monday, May 11, 2009

That admincount, AdminSdHolder and SDProp thing

I've posted this before, but....

For all you desperate admins out there searching the Internet for help on why some of your users’ Blackberrys aren’t working…or why some of your users can’t publish their certificates to the GAL…or why some of the permissions you’ve delegated in Active Directory don’t seem to work all the time…or other anomalies like that…

This blog is for you.

Why Things Just Don’t Work As Expected
Having problems with permissions in AD? Problems with Blackberries? Weird anomalies when trying to reset passwords?

I thought so.

Sometimes being an enterprise administrator for a very large organization is just…depressing.

Since we’ve upgraded to Windows 2003, I’ve learned that Microsoft is serious about best practices such as not using your standard, mail-enabled/mailbox-owning user account for administrative tasks. And I’ve grown serious about it, too since I’ve begun to appreciate security. Now, I’m so serious that I’m thinking fondly of starting the Guido School of Admin Training (with sincere apologies to any existing Guidos or schools already named thusly).

Our Guido School of Admin Training is a very informal school held outside in the nice, fresh air in the alley between two office buildings. There are no formal registration procedures, however you do have to be nominated to attend.

What to expect: The instructor, Guido, will shake your hand and then gently haul you over to the nearest wall by your collar. Then it becomes really exciting and lots of fun. With a jaunty smile, Guido grabs you securely by the back of the neck and smacks your face against the wall while saying in a firm tone of voice:

Do {smack} not {smack} nest {smack} security {smack} groups {smack} into {smack} protected {smack} groups.

And then, after a slight pause for refreshments:

Do {smack} not {smack} place {smack} standard {smack} mail-enabled/mail-box associated {smack} user {smack} accounts {smack} into {smack} protected {smack} groups.

Now, if you can repeat what you just learned, you receive a diploma and a short ride to the nearest hospital.

Our school guarantees success. For the rest of their life, students will remember what they’ve been taught, even if they imprinted on Windows NT and never learned any other operating system and refuse to use an administrative account because it’s just too much trouble. Even if they repeatedly put a group containing all 8,000 users in their domain (within your AD tree, mind you) into Domain Admins because they thought that would be the simplest way to deploy something, say, a patch. And after they remove the group containing all the users from Domain Admins and they suddenly get a rash of calls about how all the managers’ Blackberrys are all broken, they call you to fix it.

If this happens, it’s time to send the admin to Guido’s school, so just take the bull by the horns and nominate them.

Because now you have to deal with the admincount attribute.

You see, protected groups are special. They need to be special because in the past, fanatical admins have removed critical permissions from key groups instead of simply not putting unnecessary admin accounts into those groups to begin with. After removing critical permissions, these fanatics have lost control of their domain, or worse, their AD Tree or even Forest. Okay, maybe that’s not the entire reason for protecting these key groups, but it is certainly one good reason. There are a lot of others and this blog isn’t long enough for all of them.

So in Windows 2003, Microsoft helped admins avoid such idiocy by developing a mechanism to put back critical permissions on certain key groups called protected groups. Organizations which follow best practices for administration and security are most likely completely unaware of this secret mechanism and don’t need to worry about it. If you follow best practices, it will have no impact on you and never will. You deserve to be worshiped as the deity you obviously are.

For the rest of us, we need to learn this rule: You don’t put standard user accounts associated with a mailbox into any protected group; and you don’t nest groups into protected groups (because you lose track of what you nested and could potentially nest groups containing standard user accounts into protected groups and elevate permissions that should not be elevated).

If you violate this rule, the admincount attribute will afflict you mightily. (And yes, I used “afflict” on purpose so don’t leave me a lot of obnoxious comments about that. It’s funny. Laugh, darn you.)

Protected Groups
What are the protected groups? With Windows 2003, they include:
Schema Admins
Enterprise Admins
Cert Publishers
Domain Admins
Account Operators
Print Operators
Administrators (domain local)
Server Operators
Backup Operators

How are they protected?
One piece of the mechanism protecting these special groups is an attribute called the admincount.
  • The protected groups have the admincount attribute set to 1.
  • Any group or user account nested into a protected group gets the admincount set to 1.
  • Any user nested into a group nested into a protected group gets the admincount set to 1.
The only exception is if there is a user account or group from another domain in your AD tree nested into a protected group, like the domain local Administrators group. The group/accounts from the other domain will not be affected. (But that does not mean you should be a knucklehead and use your standard, mailbox-associated user account for administrative purposes in other domains. Come on, grow up.)
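
To see how far the contamination spreads through nesting, here is a small Python sketch that walks group membership from the protected groups downward. It works on a plain dictionary of group name to direct members, which you would have to build yourself, and it ignores the cross-domain exception just described. The function name and the sample groups are hypothetical.

```python
def accounts_marked_by_admincount(protected_groups, members_of):
    """Return every group and user that ends up with admincount set.

    members_of maps a group name to a list of its direct members.
    Nested groups are followed to any depth.
    """
    marked = set()
    stack = list(protected_groups)
    while stack:
        name = stack.pop()
        if name in marked:
            continue  # already handled; also guards against circular nesting
        marked.add(name)
        stack.extend(members_of.get(name, []))
    return marked


# Hypothetical example: a deployment group nested into Domain Admins.
membership = {
    "Domain Admins": ["adm-amy", "Patch Deployers"],
    "Patch Deployers": ["jsmith", "mjones"],
    "Help Desk": ["bwayne"],
}
marked = accounts_marked_by_admincount(["Domain Admins"], membership)
```

In the example, jsmith and mjones get marked even though nobody put them into Domain Admins directly, while bwayne in Help Desk is left alone.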

Normally, the admincount is not set at all (it is null).

Every hour or so (60 minutes is the default interval), the operating system looks for the admincount attribute. If it finds it, it does some interesting things. It removes inheritance from the object so it will not inherit the permissions it might once have inherited from its parent OU or OU structure. This prevents unfortunate things from happening if you move one of the protected groups out of the Builtin container to a different OU or container, where the permissions it would inherit (which you might have diddled with) could cause a breach of security or worse. Like a Help Desk person with full control over a user OU suddenly gaining full control over Domain Admins because some knuckle-dragging yahoo put Domain Admins into that OU. Stuff like that.

In addition, if an object like a user account has the admincount attribute set, the system strips certain key permissions granted to that account or group by the schema definition for that object type at the moment of creation. Specifically, most of the SELF permissions. These are the permissions that let a user account, for example, publish a certificate to AD (and hence, to the global address list or GAL if you are running a product like Exchange 2003).

So this could have the impact of preventing a user from publishing a cert.

Or it may break a user’s Blackberry if the user happens to be put into a group like Domain Admins, because generally the Blackberry’s service account is granted permissions through the OU hierarchy. These permissions for the BB service account are necessary for the service account to send/receive mail to/from the user’s Blackberry to their Exchange mailbox. So if the user’s account is affected by the admincount attribute, it will stop inheriting permissions from the OU, the BB service account will not get the permissions it needs, and this user will have a brick instead of a Blackberry.

Just a few examples. I’ve also seen it cause apparent permission anomalies where an admin grumbles that AD is broken or doesn’t work well because sometimes they can’t manage an object they supposedly have permissions to manage. When I get that type of call, the first thing I check is if the admincount attribute is set on either the admin’s account or on the object they are trying to manage. Most of the time, one or the other (or both) has the attribute set and it’s time to nominate the admin for my special school.

Because if a user object can’t inherit permissions and has some self-permissions removed, how well do you think you’re going to be able to manage an account with delegated permissions? How well will an account thus damaged work with inherited and/or delegated permissions? Not well, my friend. Not well at all.

The Kicker
Once a user account (or group) is “contaminated” by the admincount attribute, it doesn’t come clean just by removing the user account or group from the protected group. Oh, no. You have to fix it.

The nice thing about this is that you can always tell when some admin really needs to meet Guido out in the alley. You can just run ldifde and export a file containing all the users and groups with the admincount attribute set to 1. And of course the event logs on your domain controllers will show membership changes to protected groups so if you catch the problem soon after it occurs, you can see exactly who needs that training.

Here is an example of the ldifde command that will provide you with a text file called ac.txt of all users and groups with the admincount set in a domain (in the example, the domain is example.com):

ldifde -f ac.txt -d dc=example,dc=com -l samaccountname -r "(admincount=1)"

Note: I like to include the samaccountname attribute in the output (it will by default include the distinguished name) because it helps me in other processes—but this is entirely optional.
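
Once you have ac.txt, you will probably want the names in a friendlier form. Here is a minimal Python sketch that pulls the distinguished name and samaccountname out of LDIF text. It is a simplification: it assumes one attribute per line and does not handle wrapped lines or base64-encoded values (the ones written as dn:: or attribute::). The function name parse_ldif_accounts and the sample entries are made up.

```python
def parse_ldif_accounts(text):
    """Return a list of (dn, samaccountname) tuples from simple LDIF text."""
    entries = []
    dn = sam = None
    # The extra empty string flushes the last entry if the text does not
    # end with a blank line.
    for line in text.splitlines() + [""]:
        lower = line.lower()
        if lower.startswith("dn: "):
            dn = line[4:].strip()
        elif lower.startswith("samaccountname: "):
            sam = line.split(":", 1)[1].strip()
        elif not line.strip() and dn is not None:
            entries.append((dn, sam))
            dn = sam = None
    return entries


# Hypothetical export shaped like ldifde output.
sample = """\
dn: CN=Jane Smith,OU=Users,DC=example,DC=com
changetype: add
sAMAccountName: jsmith

dn: CN=Patch Deployers,OU=Groups,DC=example,DC=com
changetype: add
sAMAccountName: PatchDeployers
"""
accounts = parse_ldif_accounts(sample)
```

The result is one tuple per entry, which is easy to sort, diff against last month's list, or hand to whoever is doing the nominating.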

You might think that the admincount is a bad thing. It is not. It is your friend because it shows when you are not following best practices and are, in fact, endangering the security of your enterprise. So I’m not in favor of turning off this functionality.

I’m in favor of training and implementing best practices that create a reasonable security model based on the use of a separate account for administration. This secondary account should not be mailbox enabled or associated with a mailbox.

Fixing It
Here is how to fix it if things weren’t done exactly right in the past. I will warn you, however, that if you have a lot of user accounts affected (as I have had to fix multiple times for multiple domain admins) this can be a somewhat time consuming process, even if you script it up.

Fix Protected Group Membership
First, you must make sure you undo the cause, which is to say, make sure you don’t have any groups nested into protected groups. If you do, remove them. There is no point in trying to cleanse things if you leave the source of contamination. Leaving groups nested into the protected groups means at some future date, you will be addressing this same problem again because people forget and put the wrong accounts into “innocent looking” groups.

So remove all nested groups and remove all standard user accounts that are associated with a mailbox. Just leave your agreed-upon administrative accounts which can and should have the admincount attribute set. Naturally, you won’t be an idiot and remove all the accounts before you add the new administrative accounts, because you might find yourself suddenly unable to actually add the new admin accounts, particularly if you empty out Domain Admins without putting any new accounts into it, first. (Just an FYI. I know you know these things but I like to state the obvious. It’s so satisfyingly…obvious.)

Clean the Groups and Users Removed
Once you have removed any nested groups and innocent user accounts, you can clean them. You have to clean both the user accounts and the nested groups, if any.

First, reset the admincount attribute to 0 (or null) on the nested groups and users. Null is best, but 0 will work and sometimes it is nice to set it to 0 because you then have a historical artifact you can search for later if you have other issues. A 0 will tell you that this user or group was afflicted by the admincount at one point. (Just as you are afflicted by whichever admin did this to the user or group.)

For one or two users and groups, you can simply edit them in adsiedit.msc, which will allow you to reset the admincount attribute. You can also script it up if you wish. (I use a script for bulk cleaning.)
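For the bulk case, one possible approach (not necessarily the script mentioned above) is to generate an LDIF change file and import it with ldifde -i -f. The Python sketch below only writes the file contents. It assumes you already have a list of plain ASCII distinguished names, and the function name and sample DN are hypothetical. It clears admincount by default, or sets it to 0 if you want the historical artifact. It does not touch inheritance or permissions, which still have to be restored separately.

```python
def build_admincount_ldif(dns, keep_zero=False):
    """Build LDIF 'modify' records that clear adminCount (or set it to 0).

    Assumes every DN is plain ASCII; DNs with other characters would
    need base64 encoding, which this sketch does not handle.
    """
    chunks = []
    for dn in dns:
        lines = ["dn: " + dn, "changetype: modify"]
        if keep_zero:
            # Leave a searchable marker that this object was once protected.
            lines += ["replace: adminCount", "adminCount: 0"]
        else:
            # Remove the attribute entirely (null).
            lines += ["delete: adminCount"]
        lines.append("-")  # terminates the modification
        chunks.append("\n".join(lines))
    # Records are separated by a blank line.
    return "\n\n".join(chunks) + "\n"


# Hypothetical DN for illustration only.
print(build_admincount_ldif(["CN=Joe User,OU=Staff,DC=example,DC=com"]))
```

Save the output to a file (say, clear_admincount.ldf, a name picked here for the example) and test the import against a single throwaway account before pointing it at a long list.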

If you are using adsiedit.msc, you should take the following steps:

  • Right click the user (or group) and select Properties.
  • On the Attribute Editor tab, find the admincount attribute. Select it and click the [Edit] button. Click on the [Clear] button (or set the value to 0 if you want the historical artifact). Click [OK].
  • Select the Security tab.
  • Click on the [Advanced] button. Click on the [Default] button. This will restore the removed permissions PLUS it will put a check mark next to the “Allow inheritable permissions…” box, which you want.
  • Click on [OK] until you close out that user’s properties.

Unfortunately, as you see, in addition to clearing the admincount, you have to reset (turn on) inheritance for that object (group or user). Finally, you must give it back the permissions that object normally gets when it is first created. These permissions are not inherited. They are defined in the schema for that object class and are granted to the object when it is created. If you use certificates, you’re going to want these permissions and that’s why the [Default] button is so handy. It restores all those things for you.


DSACLS can also be used to restore inheritance and reset an object back to its default “state” by using the /P:N /S switches.
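As a rough illustration of the DSACLS route, the Python sketch below turns a list of distinguished names into one dsacls command line per object, using the /P:N (do not protect the object from inheritance) and /S (restore the schema default security) switches named above. The function names and the sample DN are hypothetical, and the sketch only builds the text of the commands. It does not run them.

```python
def dsacls_reset_command(dn):
    """Return a dsacls command line that re-enables inheritance (/P:N)
    and restores the schema default security (/S) on one object."""
    if '"' in dn:
        # Keep the quoting simple: refuse DNs this sketch cannot quote safely.
        raise ValueError("DN contains a double quote: " + dn)
    return 'dsacls "{}" /P:N /S'.format(dn)


def build_reset_script(dns):
    """Join one command per DN into the body of a .cmd file."""
    return "\n".join(dsacls_reset_command(dn) for dn in dns) + "\n"


# Hypothetical DN for illustration only.
print(build_reset_script(["CN=Joe User,OU=Staff,DC=example,DC=com"]))
```

Run the generated line by hand on one cleaned account and compare its Security tab against a freshly created user before trusting it on the rest.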

There is obviously a lot more to be said about this, including administrative practices, best practices, and security whys & wherefores, but I’ll be here all day if I don’t stop somewhere.

There is a relevant KB article, “Delegated permissions are not available and inheritance is automatically disabled” (KB817433), but I don’t recommend doing the workaround. In fact, I generally don’t like referencing that KB article because it includes that workaround and some knuckleheads always want to do the workaround instead of just tackling the problem and fixing it properly.

I recommend doing things the right way so you don’t have to deactivate something that is there to help you and prevent you from doing something really egregiously stupid that could cost you the control of your domain or AD forest.

So…enough already.

Good night and sweet dreams.
--Amy

Wednesday, January 14, 2009

Netdom /renamecomputer returns an RPC Error

Short blog on an annoying issue.

I don't know about you, but I keep forgetting that error messages are not always indicative of the real problem.

Here's an issue one of our site admins had recently.  He was trying to use netdom /renamecomputer to rename a bunch of workstations in his domain.  It worked on a few computers, but not all of them.  They had recently started deploying a bunch of newly ghosted workstations.

From that description, you can guess about all the rabbit-trails I followed before I realized what was happening.  The solution was too simple.

Before I figured it out, I did the usual testing:

  • Checked the secure channels between the workstations and the domain.
  • Verified forward and reverse (A and PTR) DNS records for the workstation and made sure there were no other duplicate registrations hanging around with the wrong IP (or old PTR records, for that matter).
  • Even checked for duplicate SPNs, since we do have the occasional KDC 11 event.
  • Even checked WINS.
  • Looked at the computer objects in AD and verified they were present and had the expected information such as good modified dates, etc.
  • Checked for duplicate SIDs, thinking some systems may not have firmly joined the domain (which would correct the SIDs on ghosted systems that were not properly sysprepped, first).

Everything looked good.  In all other ways, the systems were on the domain and completely functional.  Their event logs were clean.  The logs on the DCs were clean.  There were logon events for the computers, as appropriate.  But when you tried to rename them with netdom, you got an RPC error.

So I compared the services on a workstation that was okay with netdom and a workstation that was not.

The culprit?  BlackIce.

Should have looked there first.  Don't know why I didn't.  Guess I was looking for a network/configuration/AD issue, since this was presented to me as an issue with the domain controllers in their domain.

Anyway, as soon as they disabled BlackIce, they could do the netdom /renamecomputer and all was well.  I had them modify their script to use SC to shut down and disable BlackIce before running netdom, and then use SC to enable and start BlackIce again.  No problem.
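The order of operations in that modified script might look something like the Python sketch below. This is not the site admin's actual script. It only builds the five command lines as argument lists, and it leans on several assumptions: the service name ("BlackICE" in the example) has to be checked on a real workstation, the startup type is assumed to have been automatic before the change, and the machine is assumed to still answer to its old name when the service is re-enabled (that is, it has not rebooted yet). Note that on the real command line the netdom subcommand is written renamecomputer, without a leading slash.

```python
def rename_with_service_paused(old_name, new_name, service):
    """Return the commands, in order, to rename a workstation while a
    blocking service is stopped and disabled.

    Each command is a list of arguments suitable for subprocess.run().
    """
    target = "\\\\" + old_name  # sc addresses a remote machine as \\NAME
    return [
        ["sc", target, "stop", service],
        # sc config expects the option and its value as "start= disabled"
        ["sc", target, "config", service, "start=", "disabled"],
        ["netdom", "renamecomputer", old_name, "/newname:" + new_name, "/force"],
        # Assumption: the service was set to start automatically before.
        ["sc", target, "config", service, "start=", "auto"],
        ["sc", target, "start", service],
    ]


# Hypothetical computer names; the service name is an assumption to verify.
for step in rename_with_service_paused("WS-OLD01", "WS-NEW01", "BlackICE"):
    print(" ".join(step))
```

A real wrapper would also want to check each command's exit code and wait for the service to actually stop before calling netdom, since sc stop returns as soon as the stop request has been sent.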

Just another case where a simple answer is the right answer.  And I thought I should blog about this since I googled the topic earlier and found a lot of people asking, but no one got any sort of a helpful answer.

Not that this is all that helpful, but I do my best.

Sincerely,
Amy