
Windows 7 Memory Management
Landy Wang
Distinguished Engineer
Microsoft Corporation

Topics
> Working set management
> Fine-grained page locking
> Security
> NUMA
> Non-volatile (flash) memory
> Handling of contiguous/large page memory requests
> High-end servers
> Footprint and performance

Working Set Background


> Optimal usage of system memory - a constant area of investment!
> Working set: comprises all the potentially trimmable virtual addresses for a given process, session or system resource.
> Resources like nonpaged pool, kernel stacks, large pages & AWE regions are excluded (because they are not trimmable).
> Working sets provide an efficient way for the system to make memory available under pressure ... but maintaining them is not free, and care must be exercised during trim candidate selection ... and the subsequent writing of those pages!
> Trimmed pages go to the standby (clean), modified or zero page lists. The modified/mapped writer threads write them in a timely fashion.
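
As a point of reference, a user-mode process can observe how big its own working set currently is through the documented process-status API. The following is a minimal sketch (error handling trimmed); it only reads counters and does not influence trimming:

    /* Minimal sketch: read the current and peak working set of this process.
     * Uses the documented psapi interface; link with psapi.lib. */
    #include <windows.h>
    #include <psapi.h>
    #include <stdio.h>

    int main(void)
    {
        PROCESS_MEMORY_COUNTERS pmc;
        pmc.cb = sizeof(pmc);

        if (GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc))) {
            printf("Working set:      %lu KB\n", (unsigned long)(pmc.WorkingSetSize / 1024));
            printf("Peak working set: %lu KB\n", (unsigned long)(pmc.PeakWorkingSetSize / 1024));
            printf("Page faults:      %lu\n",    (unsigned long)pmc.PageFaultCount);
        }
        return 0;
    }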

Working Set Aging/Trimming


> Working sets are periodically aged to improve trim decisions
> Which sets and which virtual addresses to trim?
> How much to trim?
> Memory events so applications can (optionally) participate ...
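
The memory events mentioned above are exposed to user mode through the memory resource notification API. A hedged sketch of an application waiting on the low-memory event (the cache-release step is left as a comment, since it is application specific):

    /* Sketch: participate in trimming by listening for the system's
     * low-memory notification and releasing discardable caches. */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        /* The kernel signals this object when available memory runs low. */
        HANDLE lowMem = CreateMemoryResourceNotification(LowMemoryResourceNotification);
        if (lowMem == NULL)
            return 1;

        /* Wait up to 60 seconds for the low-memory condition. */
        if (WaitForSingleObject(lowMem, 60 * 1000) == WAIT_OBJECT_0) {
            printf("Low memory signaled - release application caches here.\n");
        }

        CloseHandle(lowMem);
        return 0;
    }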

Working Set General Policies


> When memory is low, how are working sets managed equitably and efficiently so optimal usage is achieved?
> Working sets are ordered based on their age distribution.
> Trim goal is set higher to avoid subsequent additional trimming.
> After the goal is met, other sets continue to be trimmed, but just for their very old pages. This provides fairness, so that one process does not surrender pages while the others surrender none.
> Up to 4 passes may be performed; later passes consider higher percentages of each working set and lower ages (more recently accessed pages) as well.
> When trimming occurs, all sets are also aged so future trims will have optimal (and fair!) candidates.
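
The multi-pass policy above can be pictured roughly as follows. This is purely illustrative pseudocode in C form: the types, helper functions, percentages and age thresholds are all invented for the illustration and are not the kernel's actual code or values.

    /* Illustrative only: invented types, helpers and thresholds that sketch the
     * multi-pass trim policy described above. Not the real implementation. */
    #include <stddef.h>

    typedef struct WORKING_SET WORKING_SET;           /* hypothetical */

    /* Hypothetical helper: trim up to 'percent' of W, touching only pages whose
     * age is at least 'minAge' (0 = youngest bucket, 7 = oldest bucket). */
    extern size_t TrimOldPages(WORKING_SET *W, unsigned percent, unsigned minAge);
    extern void   AgeAllWorkingSets(void);

    size_t TrimUntilGoalMet(WORKING_SET **sets, size_t count, size_t pagesNeeded)
    {
        size_t trimmed = 0;

        /* Up to 4 passes; later passes take a larger share of each set and
         * accept younger (more recently accessed) pages. */
        for (unsigned pass = 0; pass < 4 && trimmed < pagesNeeded; pass++) {
            unsigned percent = 10u * (pass + 1);      /* invented: 10%..40%    */
            unsigned minAge  = 7u - pass;             /* invented: 7 down to 4 */

            for (size_t i = 0; i < count; i++) {
                if (trimmed < pagesNeeded)
                    trimmed += TrimOldPages(sets[i], percent, minAge);
                else
                    /* Goal already met: keep taking only the very oldest pages
                     * so no single process is the only one giving up memory. */
                    trimmed += TrimOldPages(sets[i], percent, 7u);
            }
        }

        AgeAllWorkingSets();   /* keep future trim candidates optimal and fair */
        return trimmed;
    }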

Working Set Improvements


> Expansion to 8 aging values (up from 4)
> Keep exact age distribution counts instead of estimates
> Force self-aging and trimming during rapid expansion
> Don't skip processes due to lock contention, and ensure fair aging by removing pass limits
> Don't ravage small sets, since subsequent hard faults penalize all sets
> Separation of the system cache working set into 3 distinct working sets (system cache, paged pool and driver images) to prevent individual expansion from trimming the others
> Factor in standby list repurposing when making age/trim decisions
> Improved inpage clustering of system addresses
> Result: doubling of performance in memory-constrained systems!

Task Manager's Main Screen

Task Manager Working Set Display

PFN Lock Background


> The PFN (page frame number) array is a virtually contiguous (but can be physically sparse) data structure where each PFN entry describes the state of a physical page of memory.
> Information includes:
  - State (zero, free, standby, modified, modified-no-write, bad, active, etc.)
  - How many page table entries are mapping it
  - How many I/Os are currently in progress
  - The containing frame/PTE
  - The PTE value to restore when the page leaves its last working set or is repurposed
  - NUMA node
  - etc.
> Size is critical ... and how to best manage the information?
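
Purely for illustration, the per-page bookkeeping listed above can be pictured as a structure along the following lines. This is an invented, simplified layout; the real MMPFN entry is internal, heavily bit-packed and laid out quite differently.

    /* Invented, simplified picture of what a PFN entry tracks per physical page. */
    #include <stdint.h>

    typedef enum PAGE_STATE {
        PageZeroed, PageFree, PageStandby, PageModified,
        PageModifiedNoWrite, PageBad, PageActive
    } PAGE_STATE;

    typedef struct PFN_ENTRY {
        PAGE_STATE state;            /* which list/state the physical page is on     */
        uint32_t   shareCount;       /* how many page table entries map this page    */
        uint16_t   ioInProgress;     /* I/Os currently in progress against the page  */
        uint64_t   containingFrame;  /* frame of the page table page / owning PTE    */
        uint64_t   originalPte;      /* PTE value restored when the page leaves its
                                        last working set or is repurposed            */
        uint8_t    numaNode;         /* home NUMA node of the physical page          */
    } PFN_ENTRY;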

PFN Lock : The Problem


> The huge majority of all virtual memory operations were synchronized via a single system-wide PFN lock. Thus even seemingly unrelated operations by threads, even those in different processes, would contend for and serialize at this lock, potentially causing significant performance degradation/spikes.
> Larger numbers of processors and memory sizes intensify the lock pressure. For example, prior to this change SQL Server had an 88% PFN lock contention rate on systems with 128 processors.
> Applications and device drivers seeking higher performance faced significant complexity at best: AWE, large pages, or even complete algorithmic redesigns.

PFN Lock : The Scope


> All page allocation, deallocation, access and state manipulation
> All prefetching, prioritization, access logging and page identification
> All page list manipulation (zero, free, standby, modified, modnowrite, bad)
> All pagefile space allocation/deletion, adding/expansion/contraction
> Page fault management, trimming/theft/replacement, mapped/modified writing, flushing, purging
> All control area, segment, subsection and prototype PTE usage
> Virtual address space deletion/decommit, protection changing, trimming, large pages, etc.
> Process/session creation, duplication, inswap/outswap, deletion
> Kernel stack creation/deletion, inswap/outswap, stealing
> System cache view mapping/unmapping/readahead, protection
> Image validation, ASLR dynamic relocations
> MDL probing/unlocking
> Driver loading, unloading, paging
> User event signaling (low memory, high memory, etc.)
> Dynamic addition/removal of memory plus mirroring/hibernate/resume
> Dynamic kernel virtual address space allocation/deletion/initialization

PFN Lock : The Answer


> In Windows 7, the system-wide PFN lock was replaced with fine-grained locking on an individual page basis.
> This completely eliminated the bottleneck, resulting in much higher scalability. For example, the Usenix memclone microbenchmark is now 15x faster than Windows Server 2008 on 32-processor configurations.
> Fully compatible (on a binary and source level), so all software benefits without any changes. Developers don't need to resort to complex workarounds to achieve the highest performance!

PFN Lock Replacement Hierarchy


> Pool locks
> System VA lock
> Working set expansion list lock
> Individual per-page locks
> Access logging lock
> Page list (free per color, zero per color, standby per priority, modified filesystem/pagefile destined, bad) locks
> Per-pagefile space lock
> Memory event signaling lock
> Per-control area lock
> Dynamic relocation VA (ASLR) assignment lock
> Segment list lock
> Section object pointers lock

Security : ASLR Background


[Diagram: the executable's load address is randomly chosen within +/- 16MB of the image header's preferred load address; DLLs are loaded relative to a randomly chosen image-load bias of up to 16MB; kernel-mode images also use a randomly chosen image-load bias.]

Security : ASLR Background


> Images are relocated dynamically when each image section is created.
> When combined with NX, makes life difficult for hackers!
> Compresses VA space to reduce page table page cost as well as provide a larger contiguous VA range for applications.
> Introduced in Vista; applications (for compatibility) must opt in via /DYNAMICBASE.
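
The /DYNAMICBASE opt-in is recorded in the image's PE header as the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE flag. A small sketch that checks whether the running executable was built with the opt-in (it assumes a well-formed PE image, which holds for any loaded module):

    /* Sketch: check whether this executable opted in to ASLR (/DYNAMICBASE). */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        BYTE *base = (BYTE *)GetModuleHandle(NULL);               /* EXE image base */
        IMAGE_DOS_HEADER *dos = (IMAGE_DOS_HEADER *)base;
        IMAGE_NT_HEADERS *nt  = (IMAGE_NT_HEADERS *)(base + dos->e_lfanew);

        if (nt->OptionalHeader.DllCharacteristics & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE)
            printf("ASLR opt-in present (/DYNAMICBASE).\n");
        else
            printf("No ASLR opt-in - image prefers its fixed base address.\n");

        return 0;
    }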

Security : ASLR Improvements


> Driver randomization increased to 64 possible load addresses for 32-bit drivers and 256 for 64-bit drivers, up from 16 for both.
> Kernel, HAL and session drivers relocated post-Vista RTM. Large session drivers (win32k.sys, for example) are also now relocated.
> Extra effort is also made to relocate user-space images even when system VA space is tight/fragmented, by temporarily using the user address space of the system process.
> The memory cost of ASLR has also been reduced by adding 2x compression for in-memory image relocation tables, which saves at least 11MB of pagable memory on every system.
> Allow execute revocation (for NX opt-in on the fly) post-Vista RTM.

NUMA
> NUMA is the approach preferred by hardware designers to achieve optimum performance.
> Typical far-node cost: clients 1.3-1.7x, servers 1.1x-3x+!
> Windows 7 adds support for 64 NUMA nodes (up from 16).
> Node graph construction so optimal allocations can always be performed automatically without drivers/apps doing heavy lifting.
> Apps can specify node preference on allocation/view/control area/thread/process boundaries.
> Automatic page migration performed by the system!
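
One documented way for an application to state a node preference at allocation time is VirtualAllocExNuma (available since Windows Vista SP1 / Server 2008). A minimal sketch, assuming node 0 exists and with error handling trimmed:

    /* Sketch: commit memory with a preferred NUMA node. The preference is a
     * hint; the system may still use another node under memory pressure. */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        ULONG highestNode = 0;
        GetNumaHighestNodeNumber(&highestNode);
        printf("Highest NUMA node: %lu\n", highestNode);

        void *p = VirtualAllocExNuma(GetCurrentProcess(), NULL, 16 * 1024 * 1024,
                                     MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE,
                                     0 /* preferred node */);
        if (p != NULL) {
            ((volatile char *)p)[0] = 1;   /* first touch actually backs the page */
            VirtualFree(p, 0, MEM_RELEASE);
        }
        return 0;
    }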

Integrated NVRAM Support


> NVRAM:
  - Built directly into motherboards
  - In solid state drives
  - In USB sticks
  - As a replacement for main memory
> Windows 7 delivers tight and efficient integration of NVRAM support directly into the core memory management system. Eliminates numerous filter driver drawbacks; some examples:
  - The same disk page can be in memory, in a ReadyBoost cache and pinned in a ReadyDrive disk all at the same time, with each component unaware of the others.
  - Pagefile-backed pages can be consuming space in both ReadyBoost and ReadyDrive caches even though the application (and memory management) had deleted them long ago.

Contiguous/Large Page Memory


> Significant redesign post-Vista RTM to obtain memory efficiently without trimming, issuing I/O or inserting fault delays. In-memory pages are swapped in place. Efficient scanning, including range skipping during the preliminary pass, ensures much higher yield results.
> Applications (e.g., databases) allocating large page regions.
> Hypervisors allocating memory for guest VMs.
> Device drivers making contiguous memory or MmAllocatePagesForMdl* calls.
> Result: reductions in allocation times can be several orders of magnitude! Callers no longer run the risk of disrupting the entire system!
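
On the application side, the large-page path referred to above is reached through VirtualAlloc with MEM_LARGE_PAGES. A hedged sketch follows; note that the caller must hold SeLockMemoryPrivilege, and enabling that privilege is omitted here for brevity:

    /* Sketch: request a large-page region from user mode. The process needs
     * SeLockMemoryPrivilege (enabling it is not shown here). */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        SIZE_T minimum = GetLargePageMinimum();   /* 0 if large pages unsupported */
        if (minimum == 0)
            return 1;

        /* The size must be a multiple of the large-page minimum (2 MB on x64). */
        SIZE_T size = 4 * minimum;
        void *p = VirtualAlloc(NULL, size,
                               MEM_RESERVE | MEM_COMMIT | MEM_LARGE_PAGES,
                               PAGE_READWRITE);
        if (p == NULL) {
            printf("Large-page allocation failed: %lu\n", GetLastError());
            return 1;
        }

        printf("Got %lu KB of large pages at %p\n", (unsigned long)(size / 1024), p);
        VirtualFree(p, 0, MEM_RELEASE);
        return 0;
    }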

High-End System Support


> Initial 64-bit nonpaged pool maximum bumped from 40% to 75%.
> Reclaim initial nonpaged pool (up to a 3% RAM boost!).
> Boot time reductions by not depleting executive worker queues for page zeroing (at the expense of boot time forward progress).
> TLB flush reductions (especially valuable for virtualization).
> Cache management improvements to avoid flushing/overflushing.
> Enterprise clustered filesystem support APIs added.
> Software mirroring for major OEMs (also in WS2008).
> Avoid issuing modified page writes until absolutely necessary, post-Vista RTM.

Footprint Analysis
> Vista SP1 memory management code is ~460KB (25% is pagable or INIT). Windows 7 total code growth is 8KB!
> Static data reduced from 41KB to 38KB.
> Multiplicative data structures are even more important! Significant effort went into saving at this level; relocation tables are one example.
> Locality of reference improvements for speed, false sharing elimination and footprint purposes.

Focus Areas
> Footprint / locality of reference
> Memory and I/O prioritization and efficiency
> Parallelism
> Scalability
> Security
> Power consumption
> New technologies - hardware and software

Questions?

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond
to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after
the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
