Вы находитесь на странице: 1из 15

THE TOP 5 .

NET MEMORY MANAGEMENT MISCONCEPTIONS


.NET Memory management is an impressively complex process, and most of the time it works pretty well. However, its not flawless, and neither are we developers, so memory management problems are still something that a skilled developer should be prepared for. And while its possible to have useful information about .NET memory management and write better code without fully understanding the black box inside the framework, there are a few common misconceptions which need to be dispelled before you can really get started: 1) 2) 3) 4) 5) A garbage collector collects garbage Doing lots of gen0 collections is bad Performance counters are great for understanding what is happening .NET doesnt leak memory All objects are treated the same

Misconception #1: A Garbage Collector collects garbage


The run-time system has a notion of objects which it thinks its going to touch during the rest of its execution, and these are called the live, or reachable objects. Conversely, any object which isnt live can be regarded as dead, and obviously wed like to be able to reuse the memory resources that these dead objects are holding in order to make our program run more efficiently. So it is perhaps unintuitive that the focus of the .NET Garbage Collector (the GC, for short) is actually on the nongarbage; those so-called Live Objects. One of the essential ideas behind the GC strategies that most people implement is that most objects die young. If you analyze a lot of programs, you find that, typically, a lot of them generate temporary objects while theyre doing some calculation, and then produce some other object to represent the results of that calculation (in some fashion). A lot of these young objects are therefore temporary, and are going to die quite quickly, so you want to design your GC to collect the dead items without having to process them all individually. Ideally, youd like to only walk across the live objects, do something with those to keep them safe, and then get rid of all the objects which you now know are dead without going through them all one by one. And this is exactly what the .NET GC algorithm does. It is designed to collect dead items without processing them individually, and to do so with minimal disruption to your system as a whole. This latter consideration is what gave rise to the generational model employed by the .NET GC, which Ill mention again shortly. For now Ill just say that there are three generations, labeled Gen0, Gen1 and Gen2, and that new objects are allocated to Gen0, which were going to focus on as we take a look at a simple example of how the GC works:

A Simple Mutator:
Were going to try and illustrate what Ive just explained using a simple C# program; a Mutator, as its called. This program makes an instance of a collection, which it then assigns into a local variable, and because this collection is assigned to the local variable, itll be live. And because this local variable is used throughout the execution of the While loop you can see below, itll be live for the rest of the program:

var collect = new List<B>(); while(true) { collect.Add(new A()); new A(); new A(); }

Listing 1 A simple mutator What were doing is allocating three instances of a small class weve called A. The first instance well put into the collection; it will remain live because its referenced by the collection, which is in turn referenced by the local variable. Then the two other instances will be allocated, but those wont be referenced by anything, so theyll be dead. If we consider how this allocation pattern looks when we study Gen0, well see it looks something like this:

Allocation in Generation 0

Figure 1 A hypothetically empty Generation 0 Here were assuming that Gen0 is empty at the point when our function starts running. Now of course that isnt ever true; when you start the CLR up, the base class libraries are loaded right away, and they allocate a lot of objects on the heap before the point at which your program runs. Well ignore those for the moment (for the sake of clarity), and well also ignore the collection object that we assigned to that local variable in our first step. So lets look at the While loop, which we know is allocating 3 instances every time it iterates weve coloured the instance that is referenced by the collection black to mark it as a live object, and then the two other instances, which are the dead objects, are coloured red.

Figure 2 The initial allocations by the simple mutator (not to scale) Well obviously have the same pattern in the second iteration; allocate one live object, followed by two dead objects - and then the third iteration will be the same again. Looking at Figure 3, you can clearly see that, as we go into the fourth iteration, well find we have no more space left in Gen0, so well need to perform a collection.

Figure 3 Generation 0, filled to capacity

No Space? Copy
When dealing with Generations 0, 1 and 2 (i.e. the Small Object Heap) the .NET CLR uses a copying strategy; in this instance, it tries to promote the live objects out of Gen0 and into Gen1. The idea is to find all of the live objects in Gen0 and copy them into some free space within Gen1 (which were also assuming is empty, for clarity). Bear in mind that this is just the first step in the collection process, which applies in the same way when the GC is working from Gen1 to Gen2, and is illustrated using the arrows below:

Figure 4 Copying live objects from Gen0 to Gen1 (or Genn to Genn+1) At this point, the GC needs to go through other objects on the heap which reference our instances (such as our collection object), and fix up the pointers from those objects to now point to the new place where the referenced objects have been copied to. This is a slightly tricky step, because at the point where the GC is copying the objects, it needs to also make sure that no threads are actually manipulating those contents. If they were, then it might miss updates that were made to the old versions of the objects after it had copied them, or it might actually modify pointers between objects in such a way that it forgets to copy some object forward. In order to manage this, the .NET runtime brings the threads to whats known as safe points. Essentially, it stops the threads and gives itself an opportunity to redirect these pointers from old objects in Gen0 to the newly created copies in Gen1.

Figure 5 Updating pointers to reference the newly copied object, one Generation up. Of course, the cool thing is that once the GC has done that, it can now recycle the whole of Gen0, and can do so without individually scanning the objects that it used to hold. After all, it knows that the live objects have been safely promoted and are correctly referenced, and everything else is dead, and thus

irrelevant. So, assuming most objects die young, weve only had to process a very small number of objects in order to recycle the whole of Gen0.

Figure 6 Generations 0 and 1, post collection.

Observations
The basic trick behind the .NET Generational GC is that objects are allowed to move (or rather, are copied). This is a great way to get them out of the way so that we can reuse their memory without having to process every object individually. It also means that the amount of time needed to perform a collection is proportional to the number of live objects which the GC has to move, rather than the number of dead objects in memory, which its going to ignore anyway. However, as a result of this system, we do have an overhead of needing to get all threads to a safe point. where we can fix up the pointers to reference the location where each object is copied to. This obviously has repercussions on the design of the run-time. For example, you need to have access to data that you get from the JIT and from the program itself, telling you the offset between the various objects at which you might find pointers, so that you can a) scan them to find the live objects, and b) fix those up at some later time. A term you occasionally hear associated with this promotion policy and its effects is bump allocation, which just means that we have the handy ability to allocate things very quickly. If Gen0 starts out completely blank then, when we want to allocate our first object, what we have to do is increment the pointer which is initially pointing to the beginning of the generation by a number of bytes corresponding to the size of that first object. That way, we then know that we can immediately place the next object at that newly offset location, and move the pointer along by the size of that object, etc. This gives us that clean, stacked layout which we saw in the earlier figures, where the objects all occur one after the other. Now it turns out that its not quite as easy to do all that as you might think, because there can be multiple threads, and if you want to avoid locking during the allocation of the object and incrimination of the pointer, you need to do some trickery to ensure that you dont need to do any thread locking on the fast path (the path you normally take). This is mentioned in more detail in the downloadable webinar that accompanies this discussion, but I wont go into it here.

One Other Trick


There is another trick that the .NET runtime uses for small objects (i.e. < 85k1), and thats the generational structure which weve already encountered. It divides the objects into Gen0 (where it puts the new objects), Gen1 & Gen2 (the latter being where the very old objects go, moving up the generations with each garbage collection they survive). In other words, it assumes that the longer objects survive garbage collections, the longer they are likely to keep surviving. Bearing in mind what
1

Objects which are > 85k in size are another matter altogether, and we wont worry about them for now.

I said earlier about how most objects die young, this structure means that the GC can focus its attention on doing GCs of just Gen0 (i.e. just a sub-set of the available memory), which is where we expect to get the greatest return in terms of recycling dead memory.

Misconception #2: Lots of Gen0 Allocation is Bad


This myth is almost an extension of the material that weve looked already at. Weve seen that the time to do a Gen0 collection is proportional to the amount of live data in that Generation, although there are some fixed overheads for bringing the threads to a safe point. Moving lots of objects around is an expensive thing, but just doing a Gen0 collection is not necessarily an inherently bad thing. Imagine a hypothetical situation whereby Gen0 became full, but all the objects taking up the space were dead. In that situation, no live objects would be moved, and so the actual cost of that collection would be minimal. The basic answer is that doing lots of Gen0 collections is very probably not a bad thing to do, unless youre in the situation where all the objects in Gen0 are live, in which case you end up allocating them first to Gen0, and then immediately copying them up to Gen1 (known as double allocation).

Misconception #3: Performance Counters are Accurate


Windows comes with a notion of a Performance Counter; this is just some statistic that gets periodically updated by the system, and you can use tools to look at these values to try and deduce whats happening inside the system. The .NET framework offers a number of these, which you can use various tools to take a look at in pseudo-realtime. From the point of view of memory management, using these performance counters can give you a feel for whether your application is behaving like a typical application, and objects are dying young. If youre curious about what typical means, the various bits of Microsoft documentation available online collectively have a reasonably good measure. For example, it is commonly stated that you should probably expect a ratio of Gen1-to-Gen2 collections of about 10:1. There are very useful performance counters, which well touch upon later, which are to do with allocation rates. Examples of these include the values of Bytes in all heaps (which tells us the total amount of all allocated objects), time spent in GC, and allocated bytes per sec, all of which we can graph through the freely available Perfmon tool. Essentially, theres a lot of data you can get to, which is all being maintained by the system to give you, as I mentioned, a sort of pseudo-realtime feel for how your application is behaving. On the face of it, thats fine, but weve listed this as a misconception because there are a couple of problems with the way your data is collected and displayed.

Periodic Measurements
First of all, its important to remember that these counters are updated periodically, and in particular the .NET memory ones are only updated when a collection happens. That means that if no collection is happening, then the counter is stuck at its current reading. This means that things like the average values you see in Perfmon are not really telling you exactly whats happening inside your application, although theyre admittedly better than nothing. To demonstrate some of this, Ive written a simple C# program that has the same basic structure that we saw before: we make a collection object, assign it to a local variable, and we allocate instances of a small class. This program class will take about 12 bytes on x86. However, we constrain the allocation rate, and only allocate one of these objects once every millisecond. Naturally, with this

accumulation, and given the capacity of Gen0 being 1 or 2 MB, its going to take quite a few seconds before we fill Gen0 up and provoke a collection:
class Program { static void Main(string[] args) { var accumulator = new List<Program>(); while (true) { DateTime start = DateTime.Now; while ((DateTime.Now - start).TotalSeconds < 15) { accumulator.Add(new Program()); Thread.Sleep(1); } Console.WriteLine(accumulator.Count); } } }

Listing 2 A simple C# allocating objects at a constrained rate

Figure 7 Apparently spiking allocations. If you look at such a program running under Perfmon, instead of seeing a constant allocated bytes per sec counter (which is what we know our program is actually doing), due to the periodic nature of the measurement driving the counter, it looks as if the allocation rate is just spiking whenever collections happen.

Figure 8 Visualizing the varying generation sizes. Its also important to remember that the runtime itself is measuring whats happening. Every time a collection happens, it works out what percentage of the objects survived in order to adapt, choosing optimal sizes for the various Generations, and trying to maximize throughput. So, if you graph some things, like the various heap sizes, youll find you get misleading figures. For example, you can see in Figure 8 that the system decided to enlarge Gen2 by a massive amount, and then chose later to shrink it down again. In short, even though Perfmon is giving us this pseudorealtime feel, what it shows us is not necessarily exactly how the application itself is behaving.

Getting Lower
In order to really see how your application is behaving, you need to dive into it and look at things at the object or type level, and there are several ways to do this.

Figure 9 Investigating memory at the object level with WinDbg.

Figure 10 Investigating memory at the object level with WinDbg.

The first one, which is illustrated in Figure 9 and Figure 10, is to use WinDbg, which is part of the debugging tools for windows, and which you can attach to a running executable. The .NET framework itself comes with a debugger extension, called SOS, which you can load into WinDbg and which then allows you to scan the heaps and find details about the objects they contain. Essentially, loading that DLL makes a whole set of extra commands (which know about .NET memory layout) available to the debugger. In Figure 10, for example, were dumping all objects of the type program, and it will tell us (for example) that there were 3953 instances of that type on the heap at the point when I took this snapshot. It will also show us that each instance is taking up 12 bytes of memory. Now, if we consider a particular instance, we can use commands like GCRoots to try and relate that object back to the root thats actually keeping it in memory, and that path will show us how its being kept alive, which can be pretty useful information.

Figure 11 ANTS Memory Profiler displaying performance counters. I just want to quickly mention in passing that there are other tools that allow you to do this Ill inevitably nod to our own ANTS Memory Profiler as an example. All of these tools try to make it easier to deal with the vast amount of information available in memory debugging. In the case of ANTS, it tries to first show us the information from the various performance counters at the top of the screen (See Figure 11 above) in order to guide us to a point in time at which we might want to take a snapshot (which is a dump of all the objects in the heap). The profiler then has tools to allow you to compare your snapshots to try and work out which objects have survived unexpectedly, and which objects have been allocated in vast numbers when you dont expect it. It also allows us to do the rootfinding trick we saw a moment ago, but in a much more graphical way (see Figure 12)

Figure 12 Using ANTS Memory Profiler to find an objects roots So, to wrap up the discussion of performance counters, you can use WinDbg, which gives you lots of information but is hard to navigate, or you can try and use more graphical tools, which offer you filtering and a means to graphically explore the contents of the heap at the point when you took the snapshot. Either way, always bear in mind that the performance counters that these are based on are not necessarily representative of whats going on within your application in real time.

Misconception #4: .NET Doesnt Leak Memory


In one sense, this statement is literally true, but the there are problems in .NET which have the same symptoms. Its ultimately a question of definition:

Old Definition
When you used malloc and free to manage memory yourself, a leak was what happened any time that you forgot to do the free part. Basically, youd allocate some memory, youd do some work with it, and then youd forget to release it back to the runtime system. Or maybe you were dealing with a very large data structure, and you couldnt work out what the actual root node into that structure was, which made it very hard for you to start freeing things. Or maybe you called into a library routine which gave you some objects back, and it wasnt quite clear if it was you or the library that would later free those objects.

In addition, prematurely releasing objects was often fatal. Say you allocated an objected, freed it, and then continued to try and use it. If the memory space you were trying to access had since been allocated to a different object, youd find yourself with two objects of different types competing over the same memory, and that would often cause things to go catastrophically wrong.

New Definition
The good news is that those days are gone2, and that the .NET runtime, which takes care of freeing objects for you, is also ultra-cautious. It works out whether it thinks a particular object is going to be needed while your program runs, and it will only release that object if it can completely guarantee that it is not going to be needed again. The difficulty with this, of course, is that its difficult to have an effective cost model in your head, describing when objects that you allocate are actually going to be freed again, so understanding your own code can pose its own challenges! Moreover, while this managed memory is a boon, it also has opportunities and loopholes which allow objects to live longer than they should.

What Makes Things Live Longer?


There are a few things that might cause your objects to live longer than necessary, causing your application to take up much more memory than you think it should:

The Runtime Itself


The type of build you use can have an effect on object persistence. For example, if you choose to do a debug build instead of a release build, you can find that your objects are kept much longer than they really need to be, because the compiler hasnt been as aggressive in its optimization. Having a debugger attached can also make things live longer, as your application will be JITed in such a way that the local variables live to the end of the function, making it easier to debug things. Weve already seen that the .NET runtime has a number of heuristics for deciding when to collect higher generations, and it turns out that Gen2 objects will not be collected very often. So, should you have an object that accidentally gets promoted into a higher Generation, it can be a long time before that object gets freed, and its memory gets recycled. Finalizers are another culprit; in order to implement one, the system waits until an object with a finalizer would have been collected, and then saves that object by promoting it. This means that this object will survive until the collection of the next generation, and perhaps longer if the finalizer thread doesnt get around to processing it in time.

User Interfaces, and How They Behave


Event handlers are a typical example of this if you subscribe a delegate to an event handler on a long-existing item, say a top-level Windows form, youre basically adding a reference to both an object and a method on that object (which is essentially what a delegate is). In short, youre essentially making your object live as long as the top-level form. If you later forget to unsubscribe the object, youll find that youve introduced a memory leak (in the form of an unnecessarily long-lived object). There are other examples of this kind of mishap, but theyre not terribly common, so I wont dwell on them here.

Except when you aggressively dispose

Libraries
Youll find some libraries will have caches within themselves to improve their performance, but they may not have a very good lifetime policy on those caches. So you might find that youre unintentionally keeping the last 50 results, or something like that. Even if you dont call into that library for a long time, the cache is still going to stay live, and all of those objects are still going to be around.

The Compiler
My favorite of all of these problems has to do with the way the compiler translates more modern constructs in C# to run on the CLR 2 infrastructure that the .NET framework provides. Closures and Lambda expressions are a very good example. Lambda expressions are not represented as objects in themselves at the level of IL in the CLR, but are represented as compiler-generated classes which are used to maintain references to what were the local variables.
class Program { private static Func<int> s_LongLived; static void Main(string[] args) { var x = 20; var y = new int[20200]; Func<int> getSum = () => x + y.Length; Func<int> getFirst = () => x; s_LongLived = getFirst; } }

Listing 3 Illustrating compiler translation with a lambda expression. In this simple example, we have two local variables which are referenced by a lambda expression, which itself lives for a very long time by being put in a static field. Now, in order to make the lifetime of these local variables match the lifetime of the lambda function, the C# compiler actually generates all this by wrapping the local variables into a class, of which it makes an instance, and the compiler then represents the lambda functions as delegates on that class. So in the case of this example, even though the local variable Y doesnt need to live for a long time (because the lambda expression only refers to the variable X), well find that, due to the way the C# compiler behaves, this large array will live for a very long time.

Figure 13 Using .NET Reflector to see how the C# compiler unintentionally keeps objects alive unnecessarily. If we look in .NET Reflector to see how that code is generated, we see the compiler has generated this extra display class (see Figure 13, above), and the local variables are actually represented as fields within that display class. However, there is no effort to clear out those fields, even when the system knows that their values cant be accessed in the future.

Misconception #5: All Objects are Created Equal


Finally, as we saw hints of in Misconception #2, its a bad idea to use the copy-&-promote strategy we saw earlier for very large objects, because it takes a very long time (and is very expensive) to copy those objects around. So the designers of the .NET memory management system needed a better way to handle these large objects, and resorted to a standard technique called mark-and-sweep. Rather than promoting live objects to another generation, the GC leaves them in place, and keeps a record of the free areas around them, and then uses that record at a later stage to allocate objects. Crucially, there is no compaction at the moment.

No Copying During Collection


Lets take that simple mutator I wrote earlier, and instead of allocating instances of small objects, we allocate instances of large ones (bear in mind that anything larger than 85k bytes will typically be allocated in the Large Object Heap.) Going through the same scenario as we saw before, we have a series of instances, live and dead, being allocated in cycles:

Figure 14 Allocating larger objects to the heap

At the point that garbage collection happens, rather than moving these live objects into a new generation, the GC just makes a note of where they are, and then scans through the dead objects, noting their address ranges as free blocks, which itll try to use later for allocation requests.

Figure 15 The postcollection heap, with the free blocks noted by the GC. Once again, crucially, there is no copying everything is left in place.

Some Observations
That has some advantages for starters, there is no movement, so we dont have to do any fix-ups, and we dont need to bring other threads to a safe point in order to adjust their pointers. Thats also given us a potentially parallelism advantage. The trouble with this model is that its introduced potential fragmentation, and well see in just a moment what fragmentation really means. The GC also now has to make a decision regarding at what point it actually does collections in this Large Object Heap, and it was decided to make this area synonymous with Gen2, at least for GC purposes. This means that whenever a collection of Gen2 occurs, so too does a collection of this Large Object Heap. The repercussion of that decision is that temporary large objects dont really fit into this model. For small objects, it was fine to generate things temporarily, as theyd be very quickly and cheaply recycled, but for large objects thats clearly not the case. So if we generate temporary large objects, it can be a very long time before a Gen2 collection is carried out, and those temporary objects will be holding memory throughout that period. We also saw that, in order to find the free blocks of memory, the GC has to walk all of the objects on this heap. This behavior is much more expensive from a paging perspective, as were actually touching a lot more of the live memory.

The Problem of Fragmentation


This can be illustrated if you have a program thats allocating objects in a live-dead-live-dead pattern, as seen below.

Figure 16 An allocation pattern likely to generate a certain degree of fragmentation. After collection, those dead objects are all marked as free space, ready to be recycled:

Figure 17 Postcollection memory, now fragmented.

The problem occurs when you try to allocate an object thats actually slightly bigger than any of these free blocks...

Figure 18 A hypothetical object, larger than any of the available free slots. Obviously, the GC finds that this new object wont fit into any of these free areas. So, even though there is enough free memory available to satisfy the request for an object of that size, the GC will find it doesnt actually have anywhere to put the new object, and will be forced to resize the LOH to make allocation possible:

Figure 19 The effect of memory fragmentation.

Conclusion
Weve looked at 5 different issues you might have with your .NET memory management, and tried to tell you a little bit of the story and history behind each of them. I think the conclusion is really that theres a lot going on inside the heap of your process, and ideally you need to be able to visualize whats going on to be able to understand why things are being kept alive longer than you think.

Вам также может понравиться