ASP.NET Case Study: Tracing your way to Out Of Memory Exceptions


One of my colleagues asked me to look at a memory dump for a customer, and I thought that what I found was fairly interesting, so here is the story.

Problem description

Memory in our ASP.NET application (w3wp.exe) keeps growing and growing and requests become slower and slower as time goes by.

Speaking of which, my machine is crawling to a halt right now. My problem is that I’m a collector, which had seriously negative effects when moving houses (all the stuff I hadn’t used in 7 years that we had to carry along in case we might need it some day :)), and also on my machine, because I constantly have 20+ dumps open. I know that the moment I close one down, I’ll get a follow-up question on something I found in it… and reopening and redoing everything is just a royal pain in the you-know-what.

As we shall soon see, that was the problem with this process too… the question is just: what does it like to collect?

Debugging the issue

My colleague had already done 99% of the job here, I love those issues, I don’t have to do the grunt work and I get to deliver the punch line :) yay!!!

I don’t want to cut to the finish line quite yet, but the issue was super easy to repro, so all the debug data is from my machine. I made ~10000 requests to an aspx page that does pretty much nothing (just displays a button and a label) and managed to rack up ~800 MB of private bytes before grabbing a dump with adplus -hang -pn w3wp.exe

!eeheap -gc shows that most of the memory is .NET memory, and for a change most of it is on the small object heap (i.e. we are not creating a ton of objects over 85000 bytes)

0:029> !eeheap -gc
Number of GC Heaps: 2
Heap 0 (001b1308)
generation 0 starts at 0x30eb5b40
generation 1 starts at 0x302d0038
generation 2 starts at 0x02b20038
ephemeral segment allocation context: none
segment begin allocated size
001cd4c8 7a72175c 7a742464 0x00020d08(134408)
001bd0c8 790d6408 790f5988 0x0001f580(128384)
02b20000 02b20038 069232c8 0x03e03290(65024656)
102d0000 102d0038 13f6787c 0x03c97844(63535172)
1c2d0000 1c2d0038 1fe776c8 0x03ba7690(62551696)
202d0000 202d0038 23b2bb38 0x0385bb00(59095808)
282d0000 282d0038 2c2941c8 0x03fc4190(66863504)
302d0000 302d0038 32db5b4c 0x02ae5b14(44981012)
Large object heap starts at 0x0ab20038
segment begin allocated size
0ab20000 0ab20038 0ab2e9e8 0x0000e9b0(59824)
Heap Size 0x15996540(362374464)
Heap 1 (001b22c0)
generation 0 starts at 0x34f681e4
generation 1 starts at 0x345e0038
generation 2 starts at 0x06b20038
ephemeral segment allocation context: none
segment begin allocated size
06b20000 06b20038 0aa00388 0x03ee0350(65930064)
142d0000 142d0038 17eded70 0x03c0ed38(62975288)
182d0000 182d0038 1bb90780 0x038c0748(59508552)
242d0000 242d0038 27df6470 0x03b26438(62022712)
2c2d0000 2c2d0038 2ed482bc 0x02a78284(44532356)
345e0000 345e0038 36b3cff4 0x0255cfbc(39178172)
Large object heap starts at 0x0cb20038
segment begin allocated size
0cb20000 0cb20038 0cb20048 0x00000010(16)
Heap Size 0x13eaae58(334147160)
GC Heap Size 0x29841398(696521624)
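As a quick sanity check on the !eeheap -gc numbers above (just arithmetic on the hex sizes the debugger printed), the two per-heap totals add up to the reported GC Heap Size, and that accounts for the bulk of the ~800 MB of private bytes:

```python
# Sanity check on the !eeheap -gc totals: the two per-heap sizes
# should add up to the reported GC Heap Size.
heap0 = 0x15996540            # Heap 0 size: 362,374,464 bytes
heap1 = 0x13EAAE58            # Heap 1 size: 334,147,160 bytes
total = heap0 + heap1
print(total)                  # 696521624, matching GC Heap Size
print(round(total / 2**20))   # ~664 MB of managed memory
```

So roughly 664 MB of the ~800 MB of private bytes is managed memory, which is why the rest of the investigation stays on the .NET heap.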

If we run !dumpheap -stat we find that most of the objects on the .NET heap are data related. Now this is super weird… If I hadn’t performed the tests myself, I would never have believed that the only thing executing in this process during the test was a very simple aspx page with just a label and a button, but here we are, with a huge amount of DataTables and related objects…

0:029> !dumpheap -stat
Heap 0
total 3649512 objects
Heap 1
total 3350728 objects
total 7000240 objects
MT Count TotalSize Class Name
79166c44 58164 1395936 System.Collections.BitArray
7a750494 48470 1551040 System.ComponentModel.CollectionChangeEventHandler
653bf7dc 116340 1861440 System.Data.DataRowBuilder
653cc1c0 48470 2132680 System.Data.Common.Int32Storage
653bf378 116340 2326800 System.Data.DataRowCollection
7919f730 116340 2792160 System.Collections.Generic.List`1[[System.Int32, mscorlib]]
653fae30 116340 2792160 System.Collections.Generic.List`1[[System.Data.DataView, System.Data]]
653fabd0 116340 2792160 System.Collections.Generic.List`1[[System.Data.DataViewListener, System.Data]]
653f6854 116340 2792160 System.Collections.Generic.List`1[[System.Data.Index, System.Data]]
653ce3e8 126034 3528952 System.Data.RBTree`1+TreePage[[System.Data.DataRow, System.Data]]
653bf1c8 116340 3722880 System.Data.ConstraintCollection
653bf12c 116340 3722880 System.Data.RecordManager
653bf440 116340 5118960 System.Data.DataRowCollection+DataRowTree
79103c08 116343 5119092 System.Threading.ReaderWriterLock
653bb48c 116340 6515040 System.Data.DataColumnCollection
7910374c 283643 6807432 System.Collections.ArrayList
790feb28 132876 7441056 System.Collections.Hashtable
653c5de0 203574 8957256 System.Data.Common.StringStorage
7912c4ac 9694 10042984 System.Double[]
79111248 640802 17942456 System.Collections.ArrayList+ArrayListEnumeratorSimple
653eee74 20191 20917876 System.Data.RBTree`1+Node[[System.Int32, mscorlib]][]
7912439c 133676 21046920 System.Collections.Hashtable+bucket[]
791241a0 430722 32552836 System.Int32[]
653bc78c 116340 34436640 System.Data.DataTable
790fa490 664580 48172328 System.String
653bdbc0 959706 61421184 System.Data.DataRow
653bbf48 416885 61698980 System.Data.DataColumn
653ce258 126034 130571224 System.Data.RBTree`1+Node[[System.Data.DataRow, System.Data]][]
791242d8 599163 166706116 System.Object[]
Total 7000240 objects
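A back-of-the-envelope check (my assumption here: the ~10000 requests from the repro) shows the DataTable count lines up with roughly a dozen tables being created per request, consistent with each traced request producing a whole set of tables rather than a single one:

```python
# From !dumpheap -stat: 116,340 DataTable instances on the heap.
# With ~10000 requests made during the repro, that works out to
# roughly a dozen DataTables per request - consistent with
# something creating several tables for every single request.
datatables = 116340
requests = 10000              # approximate request count from the repro
print(datatables / requests)  # ~11.6 tables per request
```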

Anyways, let’s do the usual stuff: get a couple of these DataTables and run !gcroot on them to see why they are hanging around.

0:000> !dumpheap -mt 653bc78c
Using our cache to search the heap.
Address MT Size Gen
02b74560 653bc78c 296 2 System.Data.DataTable
02b7592c 653bc78c 296 2 System.Data.DataTable
02b82448 653bc78c 296 2 System.Data.DataTable
02b82e70 653bc78c 296 2 System.Data.DataTable
02b8396c 653bc78c 296 2 System.Data.DataTable
02b84718 653bc78c 296 2 System.Data.DataTable
02b85488 653bc78c 296 2 System.Data.DataTable
02b85eb0 653bc78c 296 2 System.Data.DataTable
02b868d8 653bc78c 296 2 System.Data.DataTable
02b8726c 653bc78c 296 2 System.Data.DataTable
02b87c00 653bc78c 296 2 System.Data.DataTable
02b885e4 653bc78c 296 2 System.Data.DataTable
02b88f78 653bc78c 296 2 System.Data.DataTable
02b89c00 653bc78c 296 2 System.Data.DataTable
02b8a680 653bc78c 296 2 System.Data.DataTable
02b8af98 653bc78c 296 2 System.Data.DataTable
02b8b9c0 653bc78c 296 2 System.Data.DataTable
02b8c3e8 653bc78c 296 2 System.Data.DataTable
02b8cfd8 653bc78c 296 2 System.Data.DataTable
02b8e294 653bc78c 296 2 System.Data.DataTable
02b8ecbc 653bc78c 296 2 System.Data.DataTable
02b8f6e4 653bc78c 296 2 System.Data.DataTable
02b9010c 653bc78c 296 2 System.Data.DataTable
05721020 653bc78c 296 2 System.Data.DataTable
05721a48 653bc78c 296 2 System.Data.DataTable
057223dc 653bc78c 296 2 System.Data.DataTable
05722d70 653bc78c 296 2 System.Data.DataTable
05723754 653bc78c 296 2 System.Data.DataTable
057240e8 653bc78c 296 2 System.Data.DataTable
05724d70 653bc78c 296 2 System.Data.DataTable
0572dc64 653bc78c 296 2 System.Data.DataTable
0572ea10 653bc78c 296 2 System.Data.DataTable
0572fbf8 653bc78c 296 2 System.Data.DataTable
05730620 653bc78c 296 2 System.Data.DataTable
05731048 653bc78c 296 2 System.Data.DataTable
05731a70 653bc78c 296 2 System.Data.DataTable
05732498 653bc78c 296 2 System.Data.DataTable
05732e2c 653bc78c 296 2 System.Data.DataTable
057337c0 653bc78c 296 2 System.Data.DataTable
057341a4 653bc78c 296 2 System.Data.DataTable

!gcroot shows the following root chain

0:000> !gcroot 057337c0
DOMAIN(001DCC80):HANDLE(Strong):29211d0:Root: 02b35be0(System.Web.NativeFileChangeNotification)->

So the DataTables/DataSets are rooted in some type of Queue in a System.Web.Util.Profiler in a System.Web.HttpRuntime object; in other words, it is fair to guess that these data tables have something to do with profiling.

If we dump out this data table and check out the table name we find that it is Trace_Form_Collection, a clue Sherlock :)

0:020> !do 057337c0
Name: System.Data.DataTable
MethodTable: 653bc78c
EEClass: 653bc33c
Size: 296(0x128) bytes
GC Generation: 2
MT Field Offset Type VT Attr Value Name
7a745be4 400099c 4 ...ponentModel.ISite 0 instance 00000000 site
7a742e2c 400099d 8 ....EventHandlerList 0 instance 00000000 events
790f9cc8 400099b 1e4 System.Object 0 shared static EventDisposed
>> Domain:Value 001aac68:NotInit 001dcc80:NotInit <<
653baee4 4000722 c System.Data.DataSet 0 instance 0572db78 dataSet
653bed94 4000723 10 System.Data.DataView 0 instance 00000000 defaultView
790fedd4 4000724 d0 System.Int32 1 instance 1 nextRowID
653bf378 4000725 14 ...DataRowCollection 0 instance 05733acc rowCollection
653bb48c 4000726 18 ...aColumnCollection 0 instance 0573397c columnCollection
653bf1c8 4000727 1c ...straintCollection 0 instance 05733a94 constraintCollection
790fedd4 4000728 d4 System.Int32 1 instance 2 elementColumnCount
653cbda4 4000729 20 ...elationCollection 0 instance 00000000 parentRelationsCollection
653cbda4 400072a 24 ...elationCollection 0 instance 00000000 childRelationsCollection
653bf12c 400072b 28 ...ata.RecordManager 0 instance 05733944 recordManager
00000000 400072c 2c 0 instance 05733fe4 indexes
00000000 400072d 30 0 instance 00000000 shadowIndexes
790fedd4 400072e d8 System.Int32 1 instance 0 shadowCount
653cc2b0 400072f 34 ...ropertyCollection 0 instance 00000000 extendedProperties
790fa490 4000730 38 System.String 0 instance 02b7b12c tableName
790fa490 4000731 3c System.String 0 instance 00000000 tableNamespace
0:020> !do 02b7b12c
String: Trace_Form_Collection

And doing some digging in old cases and in the code, I found out that this is a dataset holding trace information when you have tracing enabled…

As it turns out, this application did have tracing enabled. Normally that is not a problem, because the default number of requests to trace is 10, but this application had requestLimit set to 65535, so as the day went on and more requests kept coming in, more data tables were added, causing the memory growth and performance degradation.

<trace enabled="true" requestLimit="65535"/>
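For reference, a sketch of a safer configuration (my suggestion, not the customer’s actual web.config): either turn tracing off in production, or keep the default request limit and let old requests roll off so trace data cannot accumulate:

```xml
<!-- web.config: safest option - no tracing in production -->
<trace enabled="false" />

<!-- or, if tracing is genuinely needed: keep the default limit of 10
     and discard the oldest requests instead of accumulating them -->
<trace enabled="true" requestLimit="10" mostRecent="true" localOnly="true" />
```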

Final thoughts

So what is the moral of the story? Nothing in this world is free: if you want lots of trace information, you have to store it somewhere… but of course I have to admit, it is very easy to have 20/20 vision once you know what the problem is.

I think this is a perfect example of a case where, without a debugger and a memory dump, this issue would have taken a long, long time to resolve. With a memory dump we have a very good chance of identifying the problem quickly, and luckily, in this case, the resolution is as easy as changing a configuration setting, which of course is not always the case.

Laters y’all