Well, I got a notification of high CPU. Usually these are useless, but this time the alert was slightly helpful, so I logged on to the server to take a peek and noticed this:
A full system scan in "core" hours? Yes, this is Defender, but why was it running at 10:30 in the middle of the day? That makes no sense; no one would schedule a full system scan for business hours on a critical server, would they?
I have a policy that does a full system scan on a Saturday out of hours, but not on a working day in the middle of the day - so what is going on here?
Check Event Log
We need to look in the event log location below to find out what is going on:
Microsoft-Windows-Windows Defender/Operational
Event Log: Confirms scans are not running, but are they?
Once here, look for Event ID 1001, which lists every scan that has completed. As you can see below, the first quick scan completed in 48 minutes and 54 seconds:
Microsoft Defender Antivirus scan has finished.
Scan ID: {545EE7BC-8706-4513-B05B-4DB8FDA70AB6}
Scan Type: Antimalware
Scan Parameters: Quick Scan
User: NT AUTHORITY\NETWORK SERVICE
Scan Time: 0:48:54
The last full system scan ran on the Saturday as per the policy and completed in 9 hours and 5 minutes, which is slightly longer than normal:
Microsoft Defender Antivirus scan has finished.
Scan ID: {D02F8AA3-39A2-4274-9690-42F17EB4380C}
Scan Type: Antimalware
Scan Parameters: Full Scan
User: NT AUTHORITY\SYSTEM
Scan Time: 9:05:18
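If you export these events as text, the fields are easy to pull apart with a few lines of Python. This is only a rough sketch: the function names are my own, and it assumes the Scan Time field is laid out as hours:minutes:seconds, which is how the 9 hours and 5 minutes above reads.

```python
from datetime import timedelta


def parse_event_message(text: str) -> dict:
    """Split a Defender event message into a dict of 'Key: Value' fields."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values like 9:05:18 stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def parse_scan_time(value: str) -> timedelta:
    """Convert 'H:MM:SS' (assumed layout) into a timedelta."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# The full scan event from above, pasted in as text.
message = """Microsoft Defender Antivirus scan has finished.
Scan ID: {D02F8AA3-39A2-4274-9690-42F17EB4380C}
Scan Type: Antimalware
Scan Parameters: Full Scan
User: NT AUTHORITY\\SYSTEM
Scan Time: 9:05:18"""

fields = parse_event_message(message)
duration = parse_scan_time(fields["Scan Time"])
print(fields["Scan Parameters"], "took", duration)
```

Comparing the resulting durations over a few weeks makes it easier to see whether a scan really is running "slightly longer than normal".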
I have confirmed that the exclusions are set up in the policy and that they are enforced and working, so why is Defender taking the majority of the server's resources? This server also runs another application that can be CPU intensive.
Check the Disk Usage on the "sluggish" server
If we look at the disk usage, the performance is not massively awful, but the active time is elevated and the read and write rates are above average as well:
That may have been other services or applications on the server, as the usual trend looks like this, which is normal for any server:
Notice the sluggish and slow performance
However, the key here is that the server is sluggish to respond and certain actions take a couple of seconds, which is usually indicative of a full system scan going on, even though the event log tells us that the last one completed at the weekend.
Can we temporarily reproduce the slow/sluggish behaviour on another normal server (with Defender)?
Let's check this out. Take a normal server that is responsive, then from a command prompt (elevation is not required) run these commands, which will start a full system scan on that server:
cd C:\Program Files\Windows Defender
mpcmdrun -Scan -ScanType 2
That will be recorded in the event log like this:
Microsoft Defender Antivirus scan has started.
Scan ID: {92820493-4078-4BA8-B964-EBE39FDB8EDC}
Scan Type: Antimalware
Scan Parameters: Full Scan
Scan Resources:
User: BEAR\Lee
Confirm you get the same Slow/Sluggish performance
Once you issue this command it will remain interactive with "Scan starting..." on the screen. Do you notice that, after doing this, the laggy and unresponsive behaviour is now present on this server that was responsive a moment ago?
Does this server have a scan running now, then?
So this is reproducible with a full system scan running. Back on the server that should not be running a full system scan, enter this from a command prompt:
cd C:\Program Files\Windows Defender
mpcmdrun -scan -cancel
This will then tell you that a scan HAS been detected and it will cancel it:
Argument -cancel detected. Trying to cancel any quick/full scan in progress...
Scan cancelled successfully.
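If you need to do this on more than one server, the two MpCmdRun calls used in this post can be wrapped in a small Python helper. Treat it as a sketch: the function names are made up for illustration, the install path is the one from the commands above (adjust it if yours differs), and the only switches used are the ones already shown in this post.

```python
import subprocess
from pathlib import PureWindowsPath

# Install path taken from the commands above; adjust if yours differs.
MPCMDRUN = PureWindowsPath(r"C:\Program Files\Windows Defender") / "MpCmdRun.exe"


def scan_command(action: str) -> list:
    """Build the MpCmdRun argument list for 'full' or 'cancel'."""
    options = {
        "full": ["-Scan", "-ScanType", "2"],  # start a full system scan
        "cancel": ["-Scan", "-Cancel"],       # cancel any scan in progress
    }
    return [str(MPCMDRUN)] + options[action]


def run_scan_command(action: str) -> str:
    """Run the command on a Windows server and return what it printed."""
    result = subprocess.run(scan_command(action), capture_output=True, text=True)
    return result.stdout
```

Calling run_scan_command("cancel") on the affected server should hand back the same "Scan cancelled successfully." text shown above if a scan was in progress.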
You will notice in the event log that you get this event, which confirms that the full system scan that should not be running has stopped:
Microsoft Defender Antivirus scan has been stopped before completion.
Scan ID: {F43A68C2-F767-4187-A15E-FCD1B2EE85DD}
Scan Type: Antimalware
Scan Parameters: Full Scan
User: NT AUTHORITY\SYSTEM
The moment you issue that command, look what happens in the CPU department: everything magically returns to "normal" for the server:
Why did this scan happen outside of the policy window?
Well, for that we need to navigate on that server to this directory:
C:\ProgramData\Microsoft\Windows Defender\Support
When here you will notice a couple of files that will be of interest to this investigation, these are shown below:
First, let's start with the MPLog-20240826-033502.log file, which is the runtime log of Defender. Around mid-morning I noticed this issue with scanning a file called Utils.ps1 in the OMS agent directory (Microsoft Dependency Agent), which should really be excluded from the scan targets:
2024-08-28T10:53:31.239 ReportLowfi(\Device\HarddiskVolume3\Program Files\Microsoft Dependency Agent\plugins\lib\Utils.ps1->(UTF-8), 0x4baeea3a) from 0x0004fcbda1bc0c5c
2024-08-28T10:53:31.239 Lua SetAttribute:Filter caching disabled for \Device\HarddiskVolume3\Program Files\Microsoft Dependency Agent\plugins\lib\Utils.ps1->(UTF-8) (runtime MpDisableCaching from 0x0004fcbda1bc0c5c)
2024-08-28T10:53:31.254 ReportLowfi(\Device\HarddiskVolume3\Program Files\Microsoft Dependency Agent\plugins\lib\Utils.ps1, 0x4baeea3a) from 0x0004fcbda1bc0c5c
2024-08-28T10:53:31.254 Lua SetAttribute:Filter caching disabled for \Device\HarddiskVolume3\Program Files\Microsoft Dependency Agent\plugins\lib\Utils.ps1 (runtime MpDisableCaching from 0x0004fcbda1bc0c5c)
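MPLog files get big, so rather than scrolling, you can count which files keep turning up in the ReportLowfi entries. A minimal sketch, assuming the line layout matches the sample above; the regular expression and function name are mine, not anything Defender ships.

```python
import re
from collections import Counter

# Matches "ReportLowfi(<path>, 0x<hex>)" and captures the path.
LOWFI = re.compile(r"ReportLowfi\((?P<path>.+), 0x[0-9a-fA-F]+\)")


def lowfi_paths(lines):
    """Count how often each file shows up in ReportLowfi entries."""
    counts = Counter()
    for line in lines:
        match = LOWFI.search(line)
        if match:
            # Drop the "->(UTF-8)" suffix so both entries map to one file.
            path = match.group("path").split("->", 1)[0]
            counts[path] += 1
    return counts


# Two of the lines from the log excerpt above.
sample = [
    r"2024-08-28T10:53:31.239 ReportLowfi(\Device\HarddiskVolume3\Program Files\Microsoft Dependency Agent\plugins\lib\Utils.ps1->(UTF-8), 0x4baeea3a) from 0x0004fcbda1bc0c5c",
    r"2024-08-28T10:53:31.254 ReportLowfi(\Device\HarddiskVolume3\Program Files\Microsoft Dependency Agent\plugins\lib\Utils.ps1, 0x4baeea3a) from 0x0004fcbda1bc0c5c",
]
for path, count in lowfi_paths(sample).items():
    print(count, path)
```

Any path that shows up here repeatedly and belongs to an agent you trust is a candidate to check against the exclusion list.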
This then ends with an unsuccessful scan status, which means the scan is not technically complete; it failed internally:
2024-08-28T10:54:11.206 [RTP] [Mini-filter] Unsuccessful scan status(#19): \Device\HarddiskVolume3\Windows\Logs\CBS\CBS.log. Process: (unknown), Status: 0xc000004b, State: 0, ScanRequest #7987376, FileId: 0x3d700000001fe83, Reason: OnClose, IoStatusBlockForNewFile: 0xffffffff, DesiredAccess:0x0, FileAttributes:0x820, ScanAttributes:0x10, AccessStateFlags:0x1, BackingFileInfo: 0x0, 0x0, 0x0:0\0x0:0
If we now look at the MPScanSkip-20231106-235420.log file, we notice that the scan was also marked as skipped:
2024-08-28T10:27:58.006 OnDemandScan skipped or partial scan for [pid:30896]. Reason[Scan Error]
2024-08-28T10:51:55.783 OnDemandScan skipped or partial scan for [pid:15736]. Reason[Scan Error]
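To spot these entries without reading the file by eye, the lines can be parsed and checked against the policy day. This is a sketch under assumptions: the pattern is built from the two lines above only, and the check uses just the Saturday from my policy, not the exact out-of-hours window.

```python
import re
from datetime import datetime

SKIP = re.compile(
    r"^(?P<ts>\S+) OnDemandScan skipped or partial scan for "
    r"\[pid:(?P<pid>\d+)\]\. Reason\[(?P<reason>[^\]]*)\]"
)

SATURDAY = 5  # datetime.weekday(): Monday is 0, Saturday is 5


def parse_skip_line(line):
    """Return (timestamp, pid, reason) for one MPScanSkip line, or None."""
    match = SKIP.match(line)
    if not match:
        return None
    stamp = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S.%f")
    return stamp, int(match.group("pid")), match.group("reason")


# The two lines from the MPScanSkip log above.
sample = [
    "2024-08-28T10:27:58.006 OnDemandScan skipped or partial scan for [pid:30896]. Reason[Scan Error]",
    "2024-08-28T10:51:55.783 OnDemandScan skipped or partial scan for [pid:15736]. Reason[Scan Error]",
]
for line in sample:
    stamp, pid, reason = parse_skip_line(line)
    # Flag anything that did not happen on the scheduled scan day.
    outside = stamp.weekday() != SATURDAY
    print(stamp.strftime("%A %H:%M"), pid, reason,
          "outside policy day" if outside else "on policy day")
```

Both entries land on a weekday mid-morning, which lines up with when the server went sluggish.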
This could be the reason that Defender started the scan again. The failure on the OMS files was probably the cause of the failed scan; however, they should have been excluded by path, which they now are.