10 minute read

We have reached the end of the .NET Debugging Labs series. And we are going to end it with a walkthrough of the last memory leak lab.

I am interested to get feedback, either in comments or through email in the contact me section, on what you liked, what you didn’t like. If you felt that it was worth your while etc. so I know if there is an interest in possibly doing similar lab series in the future. It will probably be a while before I would post a new series but if there is interest at least I can start thinking about it. If that is the case, let me know what you thought was missing from this one so that I can put some good scenarios in the next one.

Previous labs and setup instructions

If you are new to the debugging labs, here you can find information on how to set up the labs as well as links to the previous labs in the series.

Review the performance counters to figure out what we are leaking

  1. Check Working Set, GC Heap Size and Number of Assemblies Loaded

    • Can you tell if the issue we are facing is a virtual bytes leak, a native leak or a .NET leak?

      The numbers seem to follow each other so it looks like a .net memory issue

Debug the memory dump

  1. Open the memory dump in WinDbg, load up the symbols and load sos.dll (see information and setup instructions for more info)

    • What is the size of the memory dump (on disk)

      ~770 MB

  2. Run !eeheap -gc and !dumpheap -stat

     0:000> !eeheap -gc
     Number of GC Heaps: 8
     ------------------------------
     Heap 0 (000001EF5C8E8EE0)
     generation 0 starts at 0x000001EF5CD66400
     generation 1 starts at 0x000001EF5CCA1018
     generation 2 starts at 0x000001EF5CCA1000
     ephemeral segment allocation context: none
             segment             begin         allocated              size
     000001EF5CCA0000  000001EF5CCA1000  000001EF5CD7A418  0xd9418(889880)
     Large object heap starts at 0x000001F35CCA1000
             segment             begin         allocated              size
     000001F35CCA0000  000001F35CCA1000  000001F35FB9AEC0  0x2ef9ec0(49258176)
     Heap Size:       Size: 0x2fd32d8 (50148056) bytes.
     ------------------------------
     Heap 1 (000001EF5C9100E0)
     generation 0 starts at 0x000001EFDCD02A08
     generation 1 starts at 0x000001EFDCCA1018
     generation 2 starts at 0x000001EFDCCA1000
     ephemeral segment allocation context: none
             segment             begin         allocated              size
     000001EFDCCA0000  000001EFDCCA1000  000001EFDCEF4A20  0x253a20(2439712)
     Large object heap starts at 0x000001F36CCA1000
             segment             begin         allocated              size
     000001F36CCA0000  000001F36CCA1000  000001F36FB2CBC8  0x2e8bbc8(48806856)
     Heap Size:       Size: 0x30df5e8 (51246568) bytes.
     ------------------------------
     ...
     ------------------------------
     Heap 7 (000001F3E86D84F0)
     generation 0 starts at 0x000001F2DCD80CA8
     generation 1 starts at 0x000001F2DCCA1018
     generation 2 starts at 0x000001F2DCCA1000
     ephemeral segment allocation context: none
             segment             begin         allocated              size
     000001F2DCCA0000  000001F2DCCA1000  000001F2DCD8ECC0  0xedcc0(974016)
     Large object heap starts at 0x000001F3CCCA1000
             segment             begin         allocated              size
     000001F3CCCA0000  000001F3CCCA1000  000001F3CFB8EAB8  0x2eedab8(49207992)
     Heap Size:       Size: 0x2fdb778 (50182008) bytes.
     ------------------------------
     GC Heap Size:    Size: 0x18c53f18 (415579928) bytes.
    
    • What is the size of the .NET heap according to !eeheap -gc?

      Around 415 MB - so quite a bit of the memory goes here

    • Is most of the memory stored on the regular heaps or on the large object heap?

      Looks like the large object heap is very heavily used

    We saw from the performance counters that we appeared to be leaking .net memory, so the next step is to determine what the memory is used for.

  3. Run !dumpheap -stat

     0:000> !dumpheap -stat
     Statistics:
                 MT    Count    TotalSize Class Name
     ...
     00007ffcfcadbe00     1002       104208 BuggyBits.Controllers.NewsController
     00007ffcfc3997a8     1472       105984 System.Collections.Generic.Dictionary`2[[System.String, System.Private.CoreLib],[System.Object, System.Private.CoreLib]]
     00007ffcfcc509f0     1698       108672 System.Collections.Generic.HashSet`1[[System.Object, System.Private.CoreLib]]
     00007ffcfcd62400     1849       118336 Microsoft.AspNetCore.Mvc.Routing.UrlActionContext
     00007ffcfcd48fd0     3007       120280 Microsoft.Extensions.Internal.CopyOnWriteDictionary`2[[System.Object, System.Private.CoreLib],[System.Object, System.Private.CoreLib]]
     00007ffcfcc687c0     2556       122688 System.Collections.Concurrent.ConcurrentDictionary`2+Node[[System.Object, System.Private.CoreLib],[Microsoft.Extensions.Caching.Memory.CacheEntry, Microsoft.Extensions.Caching.Memory]]
     00007ffcfce031a0     1421       125048 System.Diagnostics.Tracing.IncrementingCounterPayload
     00007ffcfccad320     3938       126016 Microsoft.AspNetCore.Mvc.ViewFeatures.Buffers.ViewBufferPage
     00007ffcfcba0298     1002       128256 Microsoft.AspNetCore.Http.DefaultHttpContext
     00007ffcfcd60768     1853       129800 Microsoft.AspNetCore.Mvc.ViewFeatures.Buffers.ViewBufferPage[]
     00007ffcfcd49dd8     3488       139520 Microsoft.AspNetCore.Razor.TagHelpers.TagHelperAttribute
     00007ffcfcb6fd10      232       142912 Microsoft.AspNetCore.Server.IIS.Core.IISHttpContextOfT`1[[Microsoft.AspNetCore.Hosting.HostingApplication+Context, Microsoft.AspNetCore.Hosting]]
     000001ef5c3a6b70     4825       145952      Free
     00007ffcfcd46dc0     2325       148800 System.Func`1[[System.Threading.Tasks.Task, System.Private.CoreLib]]
     00007ffcfcbbafc0     3468       155344 Microsoft.AspNetCore.Routing.RouteEndpoint[]
     00007ffcfc4c8f58     1852       162800 System.Collections.Generic.KeyValuePair`2[[System.String, System.Private.CoreLib],[System.String, System.Private.CoreLib]][]
     00007ffcfcd68c80     1851       162824 Microsoft.AspNetCore.Routing.Tree.OutboundMatchResult[]
     00007ffcfcd601d0     4168       166720 Microsoft.AspNetCore.Razor.TagHelpers.DefaultTagHelperContent
     00007ffcfc3f0db0     1747       181688 System.Reflection.RuntimeMethodInfo
     00007ffcfcc67a68     1007       193344 Microsoft.Extensions.Caching.Memory.CacheEntry
     00007ffcfcaed2e0     3936       220416 Microsoft.AspNetCore.Mvc.ViewFeatures.Buffers.ViewBuffer
     00007ffcfcadde08     1856       222720 Microsoft.AspNetCore.Mvc.TagHelpers.AnchorTagHelper
     00007ffcfc2d6618      560       224136 System.Object[]
     00007ffcfc392aa8     3873       233464 System.SByte[]
     00007ffcfcdfb628     2436       253344 System.Diagnostics.Tracing.CounterPayload
     00007ffcfcc88000     3934       314720 Microsoft.AspNetCore.Mvc.ViewFeatures.Buffers.ViewBufferTextWriter
     00007ffcfc3d3058      995       358016 System.Char[]
     00007ffcfc9e4fa8     9122       364880 Microsoft.AspNetCore.Routing.RouteValueDictionary
     00007ffcfcb53bf8     8922       691920 System.Collections.Generic.KeyValuePair`2[[System.String, System.Private.CoreLib],[System.Object, System.Private.CoreLib]][]
     00007ffcfc392360      457       731633 System.Byte[]
     00007ffcfc391e18    16450      1108462 System.String
     00007ffcfc3af090     2352    400900028 System.Int32[]
    
    • What types of objects seem to use up most of the memory?

      out of the approximately 417 MB, 400 MB are locked up in Int32 arrays, so if we could just figure out what is causing these (or the largest of these arrays) to stick around we would probably resolve the issue.

    • Looking at the 10-20 bottommost object types in !dumpheap -stat, can you see any patterns among the objects? I.e. can they be grouped in any way?

      Most of the 10-20 bottommost object types are UI or request related, i.e. objects that you would typically see hooked up to an asp.net page or an asp.net request.

    Normally you will be able to see patterns such as many data related objects, many ui related objects, many xml related objects etc. like in this memory investigation

    • Looking at the 10-20 bottom most object types in !dumpheap -stat, do the quantities of each of them seem normal or does anything seem out of the ordinary?

      The most obvious one is the Int32 arrays, but the fact that we have 1002 NewsController objects is also very curious. We ran the test with 1000 requests, and I had browsed to the News part once or twice before running the test to start up the iisexpress.exe process so it looks like all the requests are still sticking around. Normally you should expect to see only the requests that are currently executing stick around or perhaps a few more, but definitely not this many. If I run ~* e !clrstack to see all .net stacks I have no requests currently executing.

  4. Dump the large objects using !dumpheap -min 0n85000

     0:000> !dumpheap -min 0n85000
             Address               MT     Size
     ...
     000001f3cfa69ab0 00007ffcfc3af090   400024
     000001f3cfacb568 00007ffcfc3af090   400024
     000001f3cfb2d020 00007ffcfc3af090   400024
    
     Statistics:
                 MT    Count    TotalSize Class Name
     00007ffcfc392360        1       131096 System.Byte[]
     00007ffcfc3af090     1002    400824048 System.Int32[]
     Total 1003 objects
    
    • What type of objects are stored on the Large Object Heap (LOH)?

      Most (or all in this case) LOH objects are Int32 arrays and they all seem to have approximately the same size. What is even more interesting is that there are 1002 of them (same number as our aspx pages) so they could be connected somehow.

  5. Take any of the objects on the LOH and run !gcroot <object address> to find out where they are rooted

     0:000> !gcroot 000001f3cfb2d020
     Thread 5560:
         0000007B1C5FF050 00007FFD5B7A3542 System.Threading.Tasks.Task.SpinThenBlockingWait(Int32, System.Threading.CancellationToken) [/_/src/System.Private.CoreLib/shared/System/Threading/Tasks/Task.cs @ 2922]
             rbp+10: 0000007b1c5ff0c0
                 ...
                 ->  000001F15CD231D8 System.Collections.Concurrent.ConcurrentDictionary`2+Tables[[System.Object, System.Private.CoreLib],[Microsoft.Extensions.Caching.Memory.CacheEntry, Microsoft.Extensions.Caching.Memory]]
                 ->  000001F15CD103F8 System.Collections.Concurrent.ConcurrentDictionary`2+Node[[System.Object, System.Private.CoreLib],[Microsoft.Extensions.Caching.Memory.CacheEntry, Microsoft.Extensions.Caching.Memory]][]
                 ->  000001F0DCE337A0 System.Collections.Concurrent.ConcurrentDictionary`2+Node[[System.Object, System.Private.CoreLib],[Microsoft.Extensions.Caching.Memory.CacheEntry, Microsoft.Extensions.Caching.Memory]]
                 ->  000001F0DCE33450 Microsoft.Extensions.Caching.Memory.CacheEntry
                 ->  000001F0DCE33690 System.Collections.Generic.List`1[[Microsoft.Extensions.Caching.Memory.PostEvictionCallbackRegistration, Microsoft.Extensions.Caching.Abstractions]]
                 ->  000001F0DCE336B0 Microsoft.Extensions.Caching.Memory.PostEvictionCallbackRegistration[]
                 ->  000001F0DCE33670 Microsoft.Extensions.Caching.Memory.PostEvictionCallbackRegistration
                 ->  000001F0DCE33630 Microsoft.Extensions.Caching.Memory.PostEvictionDelegate
                 ->  000001F0DCE33140 BuggyBits.Controllers.NewsController
                 ->  000001F3CFB2D020 System.Int32[]
    
     Found 1 unique roots (run '!gcroot -all' to see all roots).
    
    • What types of roots would you normally see when running !gcroot and what do they mean?

      GCRoot will check all “roots” in the process and if the object you are looking for is in any of the “root chains” it will list this root chain.

    Look at the root chain of the object that you did !gcroot on.

    • Why is it sticking around? How could the root chain be broken?

      The Int32[] is a member variable of a news_aspx page, and this in turn is a member variable of a CacheItemRemovedCallback which is referenced by a CacheEntry so as long as the cache entry exists and the link is not severed the object will stay around. The question is really what is a CacheItemRemovedCallback and why is the page referenced there…

      Let’s look at the objects involved in a bit more detail…

        0:000> !do 000001F0DCE33450
        Name:        Microsoft.Extensions.Caching.Memory.CacheEntry
        MethodTable: 00007ffcfcc67a68
        EEClass:     00007ffcfcc735b0
        Size:        192(0xc0) bytes
        File:        C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App\3.1.11\Microsoft.Extensions.Caching.Memory.dll
        Fields:
                    MT    Field   Offset                 Type VT     Attr            Value Name
        ...
        00007ffcfc2d0af0  4000010       38        System.Object  0 instance 000001f0dce336e8 _value
        ...
        00007ffcfc2d0af0  4000014       48        System.Object  0 instance 000001f0dce333b0 <Key>k__BackingField
        ....
        00007ffcfc49b420  4000004       10 ...Private.CoreLib]]  0   static 000001f25cd15500 ExpirationCallback
        0:000> !DumpObj /d 000001f0dce336e8
        Name:        System.String
        MethodTable: 00007ffcfc391e18
        EEClass:     00007ffcfc382150
        Size:        78(0x4e) bytes
        File:        C:\Program Files\dotnet\shared\Microsoft.NETCore.App\3.1.11\System.Private.CoreLib.dll
        String:      New site launched 2008-02-02
        Fields:
                    MT    Field   Offset                 Type VT     Attr            Value Name
        00007ffcfc2db1f0  4000242        8         System.Int32  1 instance               28 _stringLength
        00007ffcfc2d8008  4000243        c          System.Char  1 instance               4e _firstChar
        00007ffcfc391e18  4000244      110        System.String  0   static 000001f2dcca1360 Empty
        0:000> !DumpObj /d 000001f0dce333b0
        Name:        System.String
        MethodTable: 00007ffcfc391e18
        EEClass:     00007ffcfc382150
        Size:        94(0x5e) bytes
        File:        C:\Program Files\dotnet\shared\Microsoft.NETCore.App\3.1.11\System.Private.CoreLib.dll
        String:      3f98f161-6b7a-4144-8ba6-6f2f11892501
        Fields:
                    MT    Field   Offset                 Type VT     Attr            Value Name
        00007ffcfc2db1f0  4000242        8         System.Int32  1 instance               36 _stringLength
        00007ffcfc2d8008  4000243        c          System.Char  1 instance               33 _firstChar
        00007ffcfc391e18  4000244      110        System.String  0   static 000001f2dcca1360 Empty
      

      The CacheEntry has the cache key 3f98f161-6b7a-4144-8ba6-6f2f11892501 and the value New site launched 2008-02-02.

      Looking at the NewsController we find

        public IActionResult Index()
        {
            string key = Guid.NewGuid().ToString();
            var cachedResult = cache.GetOrCreate(key, cacheEntry => {
                cacheEntry.SlidingExpiration = TimeSpan.FromMinutes(5);
                cacheEntry.RegisterPostEvictionCallback(CacheRemovedCallback);
                cacheEntry.Priority = CacheItemPriority.NeverRemove;
      
                return new string("New site launched 2008-02-02");
            });
      
            var news = new List<News>
            {
                new News() { Title = cachedResult }
            };
            return View(news);
        }
      

      The next item in the root chain was the PostEvictionDelegate and that is linked to the CacheEntry as the _target, looking at the code above we see that we do in fact register a PostEvictionCallback in the Cache.Add method, so that explains why it is linked like this.

        0:000> !do 000001F0DCE33630
        Name:        Microsoft.Extensions.Caching.Memory.PostEvictionDelegate
        MethodTable: 00007ffcfcdd9450
        EEClass:     00007ffcfcdce548
        Size:        64(0x40) bytes
        File:        C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App\3.1.11\Microsoft.Extensions.Caching.Abstractions.dll
        Fields:
                    MT    Field   Offset                 Type VT     Attr            Value Name
        00007ffcfc2d0af0  400012a        8        System.Object  0 instance 000001f0dce33140 _target
        00007ffcfc2d0af0  400012b       10        System.Object  0 instance 0000000000000000 _methodBase
        00007ffcfc2decc8  400012c       18        System.IntPtr  1 instance 00007FFCFCA0A1E8 _methodPtr
        00007ffcfc2decc8  400012d       20        System.IntPtr  1 instance 0000000000000000 _methodPtrAux
        00007ffcfc2d0af0  40001bc       28        System.Object  0 instance 0000000000000000 _invocationList
        00007ffcfc2decc8  40001bd       30        System.IntPtr  1 instance 0000000000000000 _invocationCount
      

      The target member variable points to a NewsController. The _target member variable for a delegate/event handler is the object instance that should be used to call the event handler.

      Going back to the code above, the method that handles the PostEvictionCallback (i.e. the function to be called when the cached object is removed from cache) is this.CacheRemovedCallback. In other words, our target is _this (which is the NewsController itself, and the _methodPtr should point to CacheRemovedCallback. So the Controller will be linked to the cacheEntry as long as it is still in cache in order to be able to call the CacheRemovedCallback when it goes out of scope.

      Finally - the Int32[] is just a member variable of the NewsController

Putting it all together and determining the cause of the memory leak

  1. Look at the code for the News controller and use the knowledge you gathered from the debugging session to figure out how the leak was generated.

    • How long will the objects stick around?

      Until the object is dropped from cache, in this case 5 minutes

    • What can you do to avoid this type of event handler leak?

      Any time you hook up an event handler to a static object or a long lived object it is better to hook up a static method as the event handler rather than an instance method. That way there is no object that needs to stick around for the event handler to still be valid.

      In this particular case, every time we hit the News page and add the item to cache we will “store” an instance of the NewsController in cache and everything it references. The easy solution here would be to make the CacheRemovedCallback static and call NewsController.CacheRemovedCallback rather than this.CacheRemovedCallback.

      I have described this particular issue and another similar one in the following articles:

      You should also be careful to not store web controls or any other objects in cache that may have references that you are not aware of. Storing a web control in cache for example will also store a page in cache since the web control usually has a _parent property that points to the aspx page.

Have fun, Tess