This is quite an interesting post, because on occasion you get a problem where the investigation has to go quite broad but also quite deep. These are the investigations I find the most satisfying, when you find not only the workaround but also the longer-term fix.
Background situational awareness
Firstly, we need a bit of background. Obviously, on a public forum I can't go into too much detail, but on this particular day the following needed to be considered:
- A CRL for a certificate authority had expired
- The Domain controllers received their January and February patches
- The application on the server uses a service account
- The majority of the servers were offline, but one of them was still working
Situational awareness summary
Those key facts lead you down particular avenues. One: could this have something to do with certificates and an invalid or expired CRL? Two: could this have something to do with the Domain Controller patching, in a way that wasn't immediately obvious from the Windows Update article describing what that hotfix addresses? (This point is only valid if you read the update notes before installing them.)
The last point is an interesting one. Obviously, if you have some servers operational and others offline, your first port of call is to compare the working server with the non-working servers and see what discrepancies you can find. This is usually a good place to start, but not in this particular situation: it meant I did additional troubleshooting steps that really did not need to be completed, which delayed the point at which we got to a workaround to get the service back online.
Workaround and root cause
No, these are absolutely not the same thing. A workaround is something you do to restore service, which may or may not be temporary depending on the sticking plaster that's been used.
The root cause analysis actually fixes the problem so it will never come back. The problem occurs when a workaround is sold as a root cause. That is very dangerous, because what you think are permanent fixes are really lots of little temporary fixes, all being mis-sold as permanent.
What behavior did we see?
The IIS server that hosted the application was unable to authenticate clients (except, obviously, for the elusive working one, but we will cover that later). When users tried to use the application they got an error. The error you get at the user side will not help you figure out what's going on. It will be some vague reference to something that is not applicable to the problem.
The takeaway is that the error reported by the application is useless, so we now need to focus our efforts on the IIS server.
Server Side : The Error
This was the error driving the whole problem. It was logged in the System event log with Event ID 7 and the source Security-Kerberos:
The digitally signed Privilege Attribute Certificate (PAC) that contains the authorization information for client app.user in realm bear.local could not be validated. - This error is usually caused by domain trust failures; Contact your system administrator
Server Side Operations : Troubleshooting not fixing....
When I started this I had the known issues in the back of my mind, so first I thought: can I clear the CRL cache, and would that make a difference? Well, this is the command to do that:
certutil -urlcache * delete
During Kerberos Network Ticket Logon, the service ticket for Account srv_iis_app1 from Domain bear.local could not be forwarded to a Domain Controller to service the request. For more information, please visit https://go.microsoft.com/fwlink/?linkid=2262558.
Note : Microsoft did start this process in April 2024, and you are in this situation as you have probably chosen the "take no action" option, even though the article states action is required.
Updates released in or after January 2025 will move all Windows domain controllers and clients in the environment to Enforced mode. This mode will enforce secure behavior by default. Existing registry key settings that have been previously set will override this default behavior change, The default Enforced mode settings can be overridden by an Administrator to revert to Compatibility mode
Well, this is very relevant, as we are now in February 2025 (at the time of this article) and the January 2025 patches have indeed been installed on the Domain Controllers this week. That would indicate that the application server is not at fault; rather, the Domain Controller is rejecting the requests due to this enforcement.
The same article then goes on to say:
The Windows security updates released in or after April 2025, will remove support for the registry subkeys PacSignatureValidationLevel and CrossDomainFilteringLevel and enforce the new secure behavior. There will be no support for Compatibility mode after installing the April 2025 update.
This means that if nothing is done, this will fail permanently after the April 2025 hotfixes are installed. A workaround alone is therefore not enough, and we need to find a fix.
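The timeline in the quoted article text can be summarised as a small lookup. This is only an illustrative Python sketch: the function name pac_validation_phase is hypothetical, and using the first day of each month as the cut-off is an assumption, since the quoted text only names the month of each update.

```python
from datetime import date

# Cut-offs taken from the months named in the quoted article text.
# The day of the month is an assumption; only the month is stated.
COMPATIBILITY_START = date(2024, 4, 1)
ENFORCED_BY_DEFAULT = date(2025, 1, 1)
COMPATIBILITY_REMOVED = date(2025, 4, 1)


def pac_validation_phase(update_released: date) -> str:
    """Return the rollout phase that applies to an update released on this date."""
    if update_released >= COMPATIBILITY_REMOVED:
        return "enforced, registry override removed"
    if update_released >= ENFORCED_BY_DEFAULT:
        return "enforced by default, registry override still honoured"
    if update_released >= COMPATIBILITY_START:
        return "compatibility by default"
    return "before the change"


# Example: the patches installed in this story were the January 2025 ones.
print(pac_validation_phase(date(2025, 1, 14)))
```

The middle phase is the one this post is living in: the registry workaround still works, but only until the April 2025 updates arrive.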
Domain Controller Patches
This does not mean you should remove this hotfix. That is a bad idea: you need your Domain Controllers to be patched, and having unpatched Domain Controllers is a very silly idea. It is also not a logical action to take, as it would mean never patching your Domain Controllers again.
Event Log Errors:
The events are also graded as "Warning" or "Error". For example, this event logged as a warning means that the server (not the Domain Controller) is still not compliant, but requests are being accepted:
Whereas the same event logged as an "Error" means these requests are being declined. This is the event you get when the application fails:
That is also outlined in the Microsoft article here:
This event is shown as a warning if PacSignatureValidationLevel AND CrossDomainFilteringLevel are not set to Enforce or stricter. When logged as a warning, the event indicates that the Network Ticket Logon flows contacted a domain controller or equivalent device that did not understand the new mechanism. The authentication was allowed to fallback to previous behavior. This event shows as an error if PacSignatureValidationLevel OR CrossDomainFilteringLevel is set to Enforce or stricter. This event as “error” indicates that the Network Ticket Logon flow contacted a domain controller or equivalent device that did not understand the new mechanism. The authentication was denied, and could not fallback to previous behavior.
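The warning/error rule in that quote boils down to a single OR condition. Here is a rough Python sketch of it. The function name event_level is hypothetical, and the numeric values used for "Enforce" (3 for PacSignatureValidationLevel, 4 for CrossDomainFilteringLevel) are my assumption and should be checked against the Microsoft article before you rely on them.

```python
# Assumed numeric values for "Enforce"; verify these against the Microsoft article.
PAC_ENFORCE = 3            # PacSignatureValidationLevel
CROSS_DOMAIN_ENFORCE = 4   # CrossDomainFilteringLevel


def event_level(pac_level: int, cross_domain_level: int) -> str:
    """Return how the event is logged for the given registry values.

    Error if EITHER value is Enforce or stricter, warning only if BOTH are below it.
    """
    enforced = pac_level >= PAC_ENFORCE or cross_domain_level >= CROSS_DOMAIN_ENFORCE
    return "error" if enforced else "warning"


# Example: both values at 2, one step below the assumed Enforce value.
print(event_level(2, 2))
```

The practical reading: a warning means authentication still fell back to the old behaviour, while an error means the request was denied.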
Workaround : Registry Key to the "temporary" rescue
We need to get service back online so from an Administrator command prompt run these commands:
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters" /v PacSignatureValidationLevel /t REG_DWORD /d 2 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters" /v PacSignatureValidationPolicy /t REG_DWORD /d 2 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters" /v PacSignatureValidationAudit /t REG_DWORD /d 1 /f
SPN - What is that ?
The SPN, or Service Principal Name, is a unique identifier for a service instance in Active Directory. It is like a "trusted" map that tells the domain:
- Which service account is allowed to run a specific service
- On which specific computers that service can run
SPN - How can I check?
Well, if you wish to check the SPN you can use this command:
setspn -L <serviceaccount>
The SPN follows the principle of:
serviceclass/hostname
Therefore as an example if this applies to a web server this should look something like this:
http/webserver1.bear.local
http/webserver2.bear.local
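To make the serviceclass/hostname layout concrete, here is a small Python sketch that splits an SPN into those two parts. The names Spn and parse_spn are hypothetical, and the sketch only handles the two-part form shown above.

```python
from typing import NamedTuple


class Spn(NamedTuple):
    service_class: str
    hostname: str


def parse_spn(spn: str) -> Spn:
    """Split an SPN of the form serviceclass/hostname into its two parts."""
    service_class, sep, hostname = spn.strip().partition("/")
    if not sep or not service_class or not hostname:
        raise ValueError(f"not in serviceclass/hostname form: {spn!r}")
    return Spn(service_class, hostname)


# Example using one of the SPNs shown above.
print(parse_spn("http/webserver1.bear.local"))
```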
SPN - Let's check our "live" one....
This is the result of the command:
setspn -L srv_iis_app1
This is not good, nothing is registered:
Registered ServicePrincipalNames for CN=srv_iis_app1,OU=ServiceAccounts,DC=bear,DC=local:
<empty>
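If you have a lot of service accounts to check, that output is easy to scan with a script. This Python sketch assumes the output has the shape shown above (a header line followed by one SPN per line); the function name registered_spns is hypothetical.

```python
def registered_spns(setspn_output: str) -> list[str]:
    """Extract the SPN entries from text in the shape printed by setspn -L."""
    spns = []
    for line in setspn_output.splitlines():
        entry = line.strip()
        # Skip blank lines and the header line.
        if not entry or entry.startswith("Registered ServicePrincipalNames"):
            continue
        # Every SPN contains a '/' between the service class and the host.
        if "/" in entry:
            spns.append(entry)
    return spns


# Example using the empty result from this post.
sample = """Registered ServicePrincipalNames for CN=srv_iis_app1,OU=ServiceAccounts,DC=bear,DC=local:
<empty>"""
print(registered_spns(sample))
```

An empty list back from this function is the condition that caused all the trouble here.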
SPN - What if it is missing?
If you have missing or invalid SPNs, the service account can't properly authenticate for delegation when enforcement is enabled. SPNs act as identifiers that allow the service account to request and forward Kerberos tickets. This can exhibit behaviours including but not limited to:
SPN - Missing or not "allowed" to register?
Regular users cannot modify SPNs. Only "Domain Admins" and "Account Operators" have permission to register SPNs.
This is controlled by the "Write servicePrincipalName" permission in the account's ACL, which you can view in Active Directory Users and Computers.
This is a security feature to prevent unauthorized users from registering services that could be used for impersonation or privilege escalation attacks.
SPN - How can I add that servicePrincipalName write permission?
That is actually very simple. You can use this command; just replace the <Service_Account_DN_Path> placeholder with the correct DN of that account.
dsacls "<Service_Account_DN_Path>" /G SELF:RPWP;"servicePrincipalName"
This will then grant that account access to write its own SPN, which is critical for this enforcement policy when it comes into effect in April 2025.
Note : Domain Admins and Account Operators can always write SPNs manually without setting that permission.
SPN - Registration (with command)
Once the permission to write the SPN has been assigned, many services will automatically register their SPN on the next application start-up.
However if you need to manually register this you can do so with a command for this example like this:
setspn -s HTTP/webserver1.bear.local srv_iis_app1
setspn -s HTTP/webserver2.bear.local srv_iis_app1
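If there are several hosts involved, you can work out which registrations are missing before running anything. The Python sketch below only builds the setspn -s command lines as text; it does not run them. The function name missing_spn_commands is hypothetical, and comparing SPNs case-insensitively is an assumption on my part.

```python
def missing_spn_commands(expected: list[str], registered: list[str], account: str) -> list[str]:
    """Return a 'setspn -s' command line for each expected SPN not yet registered."""
    # Assumption: SPNs are compared case-insensitively (HTTP/... equals http/...).
    have = {spn.lower() for spn in registered}
    return [f"setspn -s {spn} {account}" for spn in expected if spn.lower() not in have]


# Example: one SPN already registered, one still missing.
expected = ["HTTP/webserver1.bear.local", "HTTP/webserver2.bear.local"]
registered = ["http/webserver1.bear.local"]
for command in missing_spn_commands(expected, registered, "srv_iis_app1"):
    print(command)
```

Review the generated lines before running them from an account that is allowed to register SPNs.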