14 minute read

It always feels like issues come in clusters. One week we get tones of cache related cases, next week everyone is stuck in some lock and so on. Lately I have had a number of issues where there were memory leaks related to event handlers.

Problem description

We have a pretty easy to repro memory leak, even with very few users memory grows at a high rate and the memory is never released.

The usual suspects are eliminated, i.e. we know that the app doesn’t cache much or store much or anything in session scope, and from performance monitor we can see that as private bytes grow, most or all of this growth is in the .net heaps (if private bytes grow by 100 MB, # bytes in all heaps also grow by about 100 MB).

Gathering data

Since we can tell with !dumpheap –stat exactly what objects are on the managed heap, the natural path to take here is to take two dumps as the memory grows and compare and see what types of objects increase the most.

Preferably to make the problem more visible, a stress test is nice so that the leaking objects stand out more. For example if we leak one dataset per page view, and you see one dataset in the dump it doesn’t really stand out, but perhaps 10.000 would.

When you take the first dump adplus –hang –pn w3wp.exe, do this after loading all relevant pages at least once so that all assemblies and dlls are loaded. After this, stress the application, and get another dump.

Debugging

As usual I have simplified the scenario a bit, but the debugging techniques stay the same.

I have a simple page, hardly any code in it at all, stressed it for just 14 seconds using ACT (got ~7000 page hits) and the memory grew by an amazing 64 MB. Of course I would recommend that you stress more thank 14 seconds, but in this case it was enough to give me a pretty good leak.

We can see how much memory was used by the process in the two instances by just looking at the size of the dump files. The first dump was at 65 MB and the second at 129 MB.

!eeheap –gc gives us information about the size of the managed heap along with some segment information etc. The first dump showed that we used 5 MB of .net heap, and the second (output below) shows that this has grown to 73,004,928 bytes (~69 MB), so an increase of 64 MB which matches the problem statement, that “all” our memory increase is on the .net heap.

0:000> !eeheap -gc
Number of GC Heaps: 2
------------------------------
Heap 0 (0x000b71a0)
generation 0 starts at 0x029f0288
generation 1 starts at 0x0225fb84
generation 2 starts at 0x02170030
ephemeral segment allocation context: none
 segment    begin       allocated     size
0x2170000 0x2170030  0x45f5000 0x2484fd0(38,293,456)
Large object heap starts at 0x0a170030
 segment    begin       allocated     size
0x0a170000 0x0a170030  0x0a17d810 0x0000d7e0(55,264)
Heap Size  0x24927b0(38,348,720)
------------------------------
Heap 1 (0x000d7008)
generation 0 starts at 0x0698ea28
generation 1 starts at 0x0629dd58
generation 2 starts at 0x06170030
ephemeral segment allocation context: none
 segment    begin       allocated     size
0x6170000 0x6170030  0x827d000 0x210cfd0(34,656,208)
Large object heap starts at 0x0b170030
 segment    begin       allocated     size
0x0b170000 0x0b170030  0x0b170030 0x00000000(0)
Heap Size  0x210cfd0(34,656,208)
------------------------------
GC Heap Size  0x459f780(73,004,928)

Trivia: Some other interesting points about this output (not relevant to the issue at hand though) is that we can see that we are running the server GC version on a dual proc machine (since we have 2 heaps). The heaps are nicely balanced (using ~34 and 38 MB’s respectively, if I forget for a moment that 1 MB is 1024 kb instead of 1000), and the application has not allocated any large objects so far. Hmm… then why does it show 55,264 bytes in the large object segment for Heap 0? Answer: In addition to large objects the array/s of statics is also stored here so that it won’t have to be moved around so much.

Going back to the issue… The next step for a managed memory leak is to run !dumpheap –stat on both dumps

To save some space, I’m only printing out the bottom most lines of the !dumpheap output as those are the most interesting.

Dump 1:

0x020fa3dc         81         3,888 System.Text.RegularExpressions.CachedCodeEntry
0x79ba9840        246         3,936 System.RuntimeType
0x02093c24         42         4,536 System.Xml.XmlNSAttributeTokenInfo
0x020f2aec         82         4,592 System.Text.RegularExpressions.Regex
0x79b9603c        239         4,780 System.Text.StringBuilder
0x0209081c        207         4,968 System.Xml.NameTable/Entry
0x020faa2c         65         5,460 System.Text.RegularExpressions.RegexParser
0x0209ef4c        473         5,676 System.Xml.XmlChildNodes
0x0206585c        120         5,760 System.Web.Caching.ExpiresBucket
0x0c3281c4        413         6,608 System.Web.Configuration.DelayedRegex
0x0c172974         66         7,128 System.Text.RegularExpressions.RegexInterpreter
0x0209c7cc        493         7,888 System.Xml.XmlAttributeCollection
0x0209f090        473         9,460 System.Xml.XmlChildEnumerator
0x0c32722c        474        11,376 System.Web.Configuration.CapabilitiesSection
0x79ba2b34        435        12,180 System.Security.SecurityElement
0x79bab93c        249        12,948 System.Collections.Hashtable
0x01b231b8          2        14,728 System.Reflection.Cache.TypeNameStruct[]
0x0209c974        493        17,748 System.Configuration.ConfigXmlElement
0x79c044e8        713        19,964 System.ConfigNode
0x0209d044        790        22,120 System.Configuration.ConfigXmlAttribute
0x79ba36a4      1,533        24,528 System.Security.SecurityStringPair
0x020fb184        577        25,388 System.Text.RegularExpressions.RegexNode
0x0209d344      1,209        33,852 System.Configuration.ConfigXmlText
0x79bd4004      2,459        39,344 System.Collections.DictionaryEntry
0x79baaf78      3,328        39,936 System.Int32
0x000eb950         22        39,948      Free
0x0c3282c4      2,510        40,160 System.Web.Configuration.CapabilitiesPattern
0x0c32837c      2,170        43,400 System.Web.Configuration.CapabilitiesAssignment
0x0c172d5c      2,646        52,920 System.Text.RegularExpressions.GroupCollection
0x01b2236c         91        62,420 System.Char[]
0x01b22970        292        64,104 System.Collections.Hashtable/bucket[]
0x0c171f34      4,821       154,272 System.Text.RegularExpressions.Group
0x01b24550      8,252       184,424 System.Int32[][]
0x79ba2ee4      8,721       209,304 System.Collections.ArrayList
0x01b22c3c         99       227,404 System.Byte[]
0x01b226b0     24,364       537,272 System.Int32[]
0x0c17213c      8,250       561,000 System.Text.RegularExpressions.Match
0x01b2209c     15,054       786,464 System.Object[]
0x79b94638     23,874     2,833,720 System.String
Total 121,247 objects, Total size: 6,268,544

Dump 2:

0x79bd3134     33,632       403,584 System.Collections.CaseInsensitiveHashCodeProvider
0x0c324184     14,696       411,488 System.Web.UI.ControlCollection
0x79bc578c     14,742       412,776 System.EventHandler
0x02065a64     31,025       496,400 System.Web.Caching.CacheKey
0x020688cc      7,348       529,056 System.Web.HttpWriter
0x020fc378     22,045       529,080 System.Collections.Specialized.ListDictionary
0x0c323cbc      7,348       587,840 System.Web.UI.HtmlControls.HtmlForm
0x020fbd38     29,393       587,860 System.Collections.Specialized.HybridDictionary
0x0c3299b8      4,433       620,620 System.Web.UI.HtmlTextWriter/TagStackEntry[]
0x0c323ee4      7,348       676,016 System.Web.UI.ResourceBasedLiteralControl
0x0c324ae0      7,348       705,408 System.Web.UI.WebControls.Button
0x0c320f44     14,496       753,792 System.Web.HttpValueCollection
0x0c320a54     14,696       822,976 System.Web.HttpCookieCollection
0x02067d54      7,348       911,152 System.Web.HttpResponse
0x02066f18      7,348       940,544 System.Web.HttpContext
0x0c323534     14,696     1,116,896 System.Web.UI.LiteralControl
0x02069310      7,348     1,116,896 System.Web.HttpRequest
0x0c329638      4,433     1,117,116 System.Web.UI.HtmlTextWriter/RenderStyle[]
0x0c17ddf4      7,348     1,146,288 System.Web.HttpCachePolicy
0x0c320034      4,432     1,170,048 System.Web.UI.PageParser
0x02061fec      7,348     1,469,600 System.Web.Hosting.ISAPIWorkerRequestInProcForIIS6
0x0c329288      4,433     1,471,756 System.Web.UI.HtmlTextWriter/RenderAttribute[]
0x0213b78c      7,348     1,528,384 ASP.WebForm1_aspx
0x01b22c3c     54,679     1,637,500 System.Byte[]
0x01b2236c     29,578     1,696,148 System.Char[]
0x79ba2ee4     73,789     1,770,936 System.Collections.ArrayList
0x79bab93c     61,790     3,213,080 System.Collections.Hashtable
0x01b2209c    116,981     6,365,324 System.Object[]
0x01b22970     61,817     9,982,680 System.Collections.Hashtable/bucket[]
0x79b94638    212,747    17,647,648 System.String
Total 1,402,755 objects, Total size: 72,991,848

Ok, so the number of strings grew and the object[]s etc. but that is pretty normal, and simple types are pretty hard to track down since there are so many of them so the tactic is usually to look for growth of some more uncommon types…

In this case suddenly 7,348 ASP.WebForm1_aspx cropped up and along with it the same amount of System.Web.HttpRequests, System.Web.HttpContexts etc. and 14,696 System.Web.UI.LiteralControls so clearly there seems to be a link, i.e. one HttpRequest per WebForm1.aspx instance, and two LiteralControls for example. A pretty well founded assumption here would be that if only we could get these WebForm1_aspx objects to go away, the other objects would probably follow.

So why aren’t they garbage collected? Is the garbage collector not working???

There are two reasons an object is not collected.

  • The object is still in use or…
  • The object was in use the last time the garbage collector came to collect.

In most cases the problem is not that the garbage collector is not working, actually I would say I very very very rarely see such issues.

So why is the aspx page in use then? I can see in the browser that it finished executing, even closed the browser down…

Let’s dump some of the aspx pages (using the methodtable from the first column in !dumpheap –stat)

0:000> !dumpheap -mt 0x0213b78c
Using our cache to search the heap.
   Address         MT     Size  Gen
0x21c613c 0x213b78c      208    2 ASP.WebForm1_aspx
0x22268e0 0x213b78c      208    2 ASP.WebForm1_aspx
0x2227e44 0x213b78c      208    2 ASP.WebForm1_aspx
0x2229298 0x213b78c      208    2 ASP.WebForm1_aspx
0x222a6ec 0x213b78c      208    2 ASP.WebForm1_aspx
0x222bb40 0x213b78c      208    2 ASP.WebForm1_aspx
0x222cf94 0x213b78c      208    2 ASP.WebForm1_aspx
0x222e3e8 0x213b78c      208    2 ASP.WebForm1_aspx
0x222f83c 0x213b78c      208    2 ASP.WebForm1_aspx
0x2230c90 0x213b78c      208    2 ASP.WebForm1_aspx
0x22320e4 0x213b78c      208    2 ASP.WebForm1_aspx
0x2233538 0x213b78c      208    2 ASP.WebForm1_aspx
0x223509c 0x213b78c      208    2 ASP.WebForm1_aspx
0x22364f0 0x213b78c      208    2 ASP.WebForm1_aspx
0x2237944 0x213b78c      208    2 ASP.WebForm1_aspx
0x2238d98 0x213b78c      208    2 ASP.WebForm1_aspx
0x223a1ec 0x213b78c      208    2 ASP.WebForm1_aspx
0x223b640 0x213b78c      208    2 ASP.WebForm1_aspx
0x223ca94 0x213b78c      208    2 ASP.WebForm1_aspx
0x223dee8 0x213b78c      208    2 ASP.WebForm1_aspx
0x223f33c 0x213b78c      208    2 ASP.WebForm1_aspx
…

And pick an object at random (0x2227e44), and run the command !gcroot on it.

!gcroot will scan through all threads looking to see if our object exists on any stack pointers, then it will scan through the handle tables to see if anyone has a strong reference, ref count etc. on our object, or if it is pinned (all these mean that our object will stick around until the reference is removed). Finally it will scan through and see if perhaps our object is referenced by someone who has a strong reference, we call this a root chain.

If we do !gcroot on this object we get a hit in a handle table with a strong root to an object[] containing a MemoryIssues.MyClassThatHasEvents object that has a reference to a StuffHappenedEventHandler, that has a reference to another StuffHappenedEventHandler and so on, and at the end of the output we can see that one of the StuffHappenedEventHandlers has a reference to our WebForm1_aspx.

0:000> !gcroot 0x2227e44
Scan Thread 16 (0xb7c)
Scan Thread 20 (0xf90)
Scan Thread 22 (0xd48)
Scan HandleTable 0xd1008
Scan HandleTable 0xe86e8
Scan HandleTable 0x14a1f0
HANDLE(Strong):1f111d8:Root:0xa177000(System.Object[])->
0x21c6278(MemoryIssues.MyClassThatHasEvents)->
0x827ae84(MemoryIssues.StuffHappenedEventHandler)->
0x8277d98(MemoryIssues.StuffHappenedEventHandler)->
0x8274cac(MemoryIssues.StuffHappenedEventHandler)->
0x8271bc0(MemoryIssues.StuffHappenedEventHandler)->
0x826ead4(MemoryIssues.StuffHappenedEventHandler)->
0x826b9e8(MemoryIssues.StuffHappenedEventHandler)->
0x45f2424(MemoryIssues.StuffHappenedEventHandler)->
0x82688fc(MemoryIssues.StuffHappenedEventHandler)->
0x8265810(MemoryIssues.StuffHappenedEventHandler)->
… cut to save space …
->
0x2229d50(MemoryIssues.StuffHappenedEventHandler)->
0x6219930(MemoryIssues.StuffHappenedEventHandler)->
0x22288fc(MemoryIssues.StuffHappenedEventHandler)->
0x2227e44(ASP.WebForm1_aspx)

A few comments on this: The object[] we see is likely a list of statics, so it is very likely that our MemoryIssues.MyClassThatHasEvents is created as a static somewhere. The long linked list we see means that a number of subscribers have been hooked up to this delegate/event StuffHappenedEventHandler, I’ll explain more in the discussion on event handlers below.

So who is hooking up to this event? Well, probably some method on the WebForm1_aspx as it is showing up as a reference… but which one?

Let’s take a close look at one of these event handlers

0:000> !dumpobj 0x8265810
Name: MemoryIssues.StuffHappenedEventHandler
MethodTable 0x0213bb1c
EEClass 0x0215a630
Size 28(0x1c) bytes
GC Generation: 0
mdToken: 0x02000004  (c:\windows\microsoft.net\framework\v1.1.4322\temporary asp.net files\memoryissues\a66e6d9f\56815f47\assembly\dl2\d40d7b87\30299b80_bf1dc601\memoryissues.dll)
FieldDesc*: 0x00000000
        MT      Field     Offset                 Type       Attr      Value Name
0x79b97094 0x400004c      0xc         System.Int32   instance 34845811 _methodPtr
0x79b97094 0x400004d      0x4                CLASS   instance 0x08264500 _target
0x79b97094 0x400004e      0x8                CLASS   instance 0x00000000 _method
0x79b97094 0x400004f     0x10         System.Int32   instance 0 _methodPtrAux
0x79b9745c 0x4000050     0x14                CLASS   instance 0x045ef338 _prev

If we dump the target we will see what object subscribes to it and in this case, as expected we find WebForm1.aspx

0:000> !dumpobj 0x08264500
Name: ASP.WebForm1_aspx
MethodTable 0x0213b78c
EEClass 0x0215a214
Size 208(0xd0) bytes
GC Generation: 0
mdToken: 0x02000002  (c:\windows\microsoft.net\framework\v1.1.4322\temporary asp.net files\memoryissues\a66e6d9f\56815f47\v5hag6zs.dll)
FieldDesc*: 0x0213b688
        MT      Field     Offset                 Type       Attr      Value Name
0x020ffcc4 0x4000b40      0x4                CLASS   instance 0x00000000 _dataBindings
0x020ffcc4 0x4000b41      0x8                CLASS   instance 0x00000000 _id
0x020ffcc4 0x4000b42      0xc                CLASS   instance 0x00000000 _cachedUniqueID
0x020ffcc4 0x4000b43     0x10                CLASS   instance 0x00000000 _parent
0x020ffcc4 0x4000b44     0x14                CLASS   instance 0x00000000 _site
0x020ffcc4 0x4000b45     0x18                CLASS   instance 0x082657d4 _events
0x020ffcc4 0x4000b46     0x1c                CLASS   instance 0x08265284 _controls
0x020ffcc4 0x4000b47     0x38         System.Int32   instance 5 _controlState
0x020ffcc4 0x4000b48     0x20                CLASS   instance 0x00000000 _renderMethod
0x020ffcc4 0x4000b49     0x24                CLASS   instance 0x00000000 _viewState
0x020ffcc4 0x4000b4a     0x28                CLASS   instance 0x00000000 _controlsViewState
…

To get the method that is hooked up as a handler, convert the int value in _methodPtr to hex

0:000> ?0n34845811
Evaluate expression: 34845811 = 0213b473

And run !ip2md (instruction pointer to method descriptor) to find out the name of the function

0:000> !ip2md 0213b473
0x0213b473 not in jit code range

Sometimes you get lucky and get the method name straight away but in this case the method could not be found. This is often the case for these event handlers, especially when the method is defined in the code behind class rather in the jitted class. In that case we have to find the objects (WebForm1.aspx) parent and dump out the methods it defines to find the method.

Taking the eeclass from the target and running !dumpclass on it we can get the parent class (i.e. our code behind class)

0:000> !dumpobj 0x08264500
Name: ASP.WebForm1_aspx
MethodTable 0x0213b78c
EEClass 0x0215a214
…
0:000> !dumpclass 0x0215a214
Class Name : ASP.WebForm1_aspx
mdToken : 0x02000002 (c:\windows\microsoft.net\framework\v1.1.4322\temporary asp.net files\memoryissues\a66e6d9f\56815f47\v5hag6zs.dll)
Parent Class : 0x0215a1b0
ClassLoader : 0x01f698b8
Method Table : 0x0213b78c
Vtable Slots : 0x4f
Total Method Slots : 0x55

And then we can run !dumpclass on it to get the method table etc. that we can use to find it’s methods

0:000> !dumpclass 0x0215a1b0
Class Name : MemoryIssues.WebForm1
mdToken : 0x02000003 (c:\windows\microsoft.net\framework\v1.1.4322\temporary asp.net files\memoryissues\a66e6d9f\56815f47\assembly\dl2\d40d7b87\30299b80_bf1dc601\memoryissues.dll)
Parent Class : 0x02114794
ClassLoader : 0x0016dd90
Method Table : 0x0213b4cc
Vtable Slots : 0x4f
Total Method Slots : 0x55
Class Attributes : 0x100001 :

And we can see that the parent is of type MemoryIssues.WebForm1 (the class defined in the code behind), let’s dump out the methods it has defined. (the –md switch for !dumpmt gives us a list of all the methods that are implemented for a given class)

0:000> !dumpmt -md 0x0213b4cc
EEClass : 0x0215a1b0
Module : 0x01f54478
Name: MemoryIssues.WebForm1
mdToken: 0x02000003  (c:\windows\microsoft.net\framework\v1.1.4322\temporary asp.net files\memoryissues\a66e6d9f\56815f47\assembly\dl2\d40d7b87\30299b80_bf1dc601\memoryissues.dll)
BaseSize: 0xcc
MethodTable Flags : 0xc0000
Number of IFaces in IFaceMap : 0x6
Interface Map : 0x0213b650
Slots in VTable : 85
--------------------------------------
MethodDesc Table
  Entry  MethodDesc   JIT   Name
0x79b9300b 0x79b93010    None   [DEFAULT] [hasThis] String System.Object.ToString()
0x799abbe8 0x79b93020    PreJIT [DEFAULT] [hasThis] Boolean System.Object.Equals(Object)
0x7999f680 0x79b93050    PreJIT [DEFAULT] [hasThis] I4 System.Object.GetHashCode()
0x79997c00 0x79b93070    PreJIT [DEFAULT] [hasThis] Void System.Object.Finalize()
0x0c1ff358 0x020ff578    JIT    [DEFAULT] [hasThis] String System.Web.UI.Control.get_ClientID()

.. cut to save space …

0x0c170bc3 0x0c170bc8    None   [DEFAULT] [hasThis] Boolean System.Web.UI.Page.get_IsReusable()
0x0213b443 0x0213b448    None   [DEFAULT] [hasThis] Void MemoryIssues.WebForm1.Page_Load(Object,Class System.EventArgs)
0x0213b463 0x0213b468    None   [DEFAULT] [hasThis] Void MemoryIssues.WebForm1.InitializeComponent()
0x0213b473 0x0213b478    None   [DEFAULT] [hasThis] Void MemoryIssues.WebForm1.MyStaticObject_StuffHappened(Object,Class System.EventArgs)
0x0213b483 0x0213b488    None   [DEFAULT] [hasThis] Void MemoryIssues.WebForm1.Button1_Click(Object,Class System.EventArgs)
0x0213b493 0x0213b498    None   [DEFAULT] Void MemoryIssues.WebForm1..cctor()
0x0213b4a3 0x0213b4a8    None   [DEFAULT] [hasThis] Void MemoryIssues.WebForm1..ctor()

We are looking for a method with the address 0x0213b473. In the list of methods it defines we can see that the method address 0x0213b473 corresponds to the MemoryIssues.WebForm1.MyStaticObject_StuffHappened method, so the method MyStaticObject_StuffHappened is hooked up as an event handler for the StuffHappened event of the static MyClassThatHasEvents object, and it’s never removed.

This in turn means that since the event in the static class holds a reference to its handlers, and a static object never goes away, the handlers will stick around for ever and the event will reference WebForm1 forever, so it and its member variables will never be collected. Talk about a quick way to get a memory leak:)

And, if you’re curious about the classic question “if a tree falls in the forest and no one is there to see it does it still make a sound?” (or something like that:), the answer is yes… if the event is ever fired, all these event handlers will be called, so not only do you have yourself a nice little memory leak, depending on what the event handler does, you might also have yourself a nice little performance problem.

The question I always ask myself in these cases is “what was the event that we tried to handle in the first place”. In most cases if you step back a little it doesn’t make sense to handle events like this in web applications. But in the cases where it does, the resolution to the problem is to remove the handler when you are done and make sure that it is removed in a function that you know for 100% sure that it’s called.

If you are interested, here is the code that caused the problem

public class WebForm1 : System.Web.UI.Page
{
  public static MyClassThatHasEvents MyStaticObject = new MyClassThatHasEvents();

  private void InitializeComponent()
  {
    this.Load += new System.EventHandler(this.Page_Load);
    MyStaticObject.StuffHappened += new StuffHappenedEventHandler(this.MyStaticObject_StuffHappened);
  }
}

But the same kind of problems occur if you hook up event handlers in your page to events for session variables for example, except that in those cases the pages will be released when the session expires as that is when the session object is released.

A short discussion on event handlers

An event is really a linked list of event handlers as you can see from dumping out the event handlers above (each node contains the target, method and a prev pointer).

When you do += new EventHandler on the event it doesn’t really matter if this particular function has been added as a listener before, it will get added once per +=.

When the event is raised we go through the linked list, item by item and call all the methods (event handlers) added to this list, this is why the event handlers are still called even when the pages are no longer running as long as they are alive (rooted), and they will be alive as long as they are hooked up. So they will get called until the event handler is unhooked with a -= new EventHandler.

In VB.Net you hook up events with AddHandler or with the Handles keyword, both hook up the events the same way. The equivalent to the -= new EventHandler… in vb.net, is RemoveHandler.

The moral of the story here is: be very careful if you are hooking up event handlers to objects that have a longer lifespan than the page.