Paging can cause bad interactive performance. This happens quite often but very little content exists how you can diagnose and fix paging issues. It is time to change that (a bit). I present you here a deep dive into how paging really works for some workloads and how that caused a severe interactive performance issue.
The Observation
Compress files in Windows XP can be removed from the registry, there’s skipping it when you run cleanmgr, I hated that option so much. Omnitrader crack. Like to point out in the rundown above on how to get there “Clean up system files” button at the bottom of the window will only be there when you have User Account Control enabled, something that gets disabled before I even install the first driver. To still disable HTTP/2, see below. (OP gave a helpful answer, but in the question. I moved the answer): I did work around my HTTP/2 problem by configuring Windows 10 HTTP.SYS in the registry to disable HTTP/2. Given that I didn't find info anywhere, I thought I'd share my solution to that problem here too. How To Enable/ Disable NTFS Compression In Windows To Improve Performance:- NTFS file compression is a feature available in Windows operating systems. The NTFS file compression feature plays a great role in compressing files to save space on NTFS hard drives. This feature works perfectly in some situations and with some file types.
It was reported that a software version did significantly worse than its predecessors when the system was under heavy load. Measurements did prove that hard page faults were the issue.
The easiest way to switch off automatic file compression in Windows 10 is through File Explorer. File compression can apply to single files, folders, or an entire drive. When you apply or remove file compression from a folder or drive, you can push the change throughout the entire directory, compressing (or uncompressing) every file in each.
A more detailed analysis showed that the big page out rate of nearly 500MB/s was caused by explicit working set trims. After removing these explicit EmptyWorkingSet calls performance went back to normal. The problem remained though why EmptyWorkingSet was suddenly a problem because these calls were there since a long time.
How Paging Really Works
To make such measurements on Windows 10 I give away some secrets. Some of you might have found the MM-Agent powershell cmdlet already. If you look e.g. at Windows Server 2012 Memory Management improvements then you will find some interesting switches which are also present in Windows 10 Anniversary. You can configure the Windows Memory Management via powershell.
On Windows 10 Anniversary the switches MemoryCompression and PageCombining are enabled by default. If you are blaming memory compression for bad performance you you can switch off MemoryCompression by calling
Disable File Compression Windows 10 Download
To enable it again you can call
The MemoryCompression switch seems not to be documented so far. If MemoryCompression is already enabled and you want to turn it off you need to reboot. If you have disabled memory compression successfully you should see in task manager as Compressed value zero all the time. If you enable memory compression the setting will take immediate effect (no reboot necessary).
Now lets perform a little experiment what happens when we trim the working set of two GB process with my CppEater when memory compression is disabled.
C:>CppEater.exe 2000
Below is the screenshot when CppEater did flush its working set. This caused some memory to be left over in the Modified list because the memory manager did not see the need to completely flush all of the 2 GB out into the page file.
Plist editor mac. When the memory was flushed you see an IO spike on disk D where my page file resides.
So far so expected. Below is a diagram which shows order of operations happening there.
In ETW traces this looks like this:
File Compression online, free
From that picture it is clear that when we then touch the paged out memory again we will need to read two GB of data from the page file. Now lets perform the experiment and see what happens:
Our active memory usage rises again by two GB but what is strange that there is zero disk activity when we access the paged out memory. From our mental model we should see two GB of disk IO on the D drive. Lets have a look at ETW traces while page file writing is happening.
While we write two GB of data to the page file the Standby list increases during that time by the same amount. That means that although the data is persistently written out to the page file we still have all of the written data in the file system cache in the form of the standby list. Since the contents of the page file are still in the file system cache (Standby List) we see no hard page faults when we try to access our paged out data again.
That is a good working hypothesis which we can test now. We only need to access files via buffered IO reads and do not close the file handles in between which will flush the file system cache. Then we should see our CppEater process hitting the hard disk to read its page file contents.
Below is a small application that reads the windows installer cache which is ca. 30 GB in size which is more than enough to flush any existing file system cache contents:
With a flushed file system cache CppEater.exe has no secret hiding place for its paged out memory anymore. Now we see the expected two GB of hard disk reads minus the still not flushed out modified memory. Cia tradecraft manual.
The picture what happens when data is written to the page file is lacking an important detail: It misses out the fact that the Modified list transitions to the Standby list which is just another name for the file system cache.
The Explanation For Bad Performance
Now we have all the missing pieces together. The initial assumption that flushing the working set cause the OS to write the process memory into the page file is correct, but only half of the story. When the page file data is written the memory from the modified list becomes part of the file system cache. When the pages are later accessed again it depends on the current state of the file system cache if we see soft or hard faults with dramatic effects for the observed performance. The bad performing software version did cause more buffered reads than before. That reads did push the cached page file data out of the file system cache. The still happening page faults were no longer cheap soft page faults but hard page faults. That explains the dramatic effects on interactive performance. The added buffered IO reads did surface the misconception that flushing the working set is a cheap operation. Flushing the working set and soft faulting it back again is only cheap if the machine is not under memory pressure. If the memory condition becomes tight or the file system cache gets flushed you will see the real costs of hard page faults. If you still need to access that memory in a fast way the best thing to do is to not flush it. Otherwise you might be seeing random hard page faults even if the machine has still plenty of free memory due to completely unrelated file system activity! This is true for Windows Server 2008 and 2012. With Windows Server 2016 which will also employ memory compression just like Windows 10 things change a bit.
Windows 10 Paging
With memory compression we need to change our picture again. The modified list is not flushed out to disk but compressed and then added to the working set of the Memory Compression process which now acts as cache. Since the page file contents are no longer shared with the standby list we will not see this behavior on Windows 10 or Server 2016 machines with enabled memory compression.
When we execute the same use case under Windows Server 2016 where we flush the file system cache with enabled memory compression we will see
that the memory from the MemCompression process stays cached and it is semi hard faulted back into the CppEater process in 3s which is much faster than the previous 10s when the page faults were hitting the hard disk. It is therefore a good idea for most workloads to keep memory compression enabled. It not only compresses the memory but the cached pagefile contents are no longer subject to standby list pollution which should make the system performance much more predictable than it was before.
Conclusions
Windows has many hidden caches which make slow operations (like hard page faults) fast again. But at the worst point in time these caches are no longer there and you will experience the uncached bad performance. It is interesting that RamMap does not show the page file as biggest standby list consumer on machines with no memory compression enabled. To prevent such hard to find errors in the first place you should measure what things cost with detailed (for me it is ETW) profiling and then act on the measured data. If someone has a great idea to make things faster you should always ask him for the detailed profiling data. Pure timing measurements can be misleading. If a use case has become 30% faster but you use x3 more memory and x2 CPU is this optimization still a great idea?
Comments are closed.
|
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |