9 minute read

I have talked about blocked finalizers before and this issue is a special case of blocked finalizers where we are not really finalizing a .NET object but rather cleaning up all Com Callable Wrappers (CCWs) and getting stuck doing that.

Problem description

The situation we are facing is that the ASP.NET process starts to go hey wire both in response times and in memory usage. All of a sudden memory just starts growing and growing and we need to figure out why.

Troubleshooting

We took a memory dump when memory usage was high (~837 MB private bytes which we could see from the dump size) with adplus -hang -pn w3wp.exe, but really the dump could have just been taken anytime after the memory started rapidly increasing.

Around ~469 MB of the memory are pure .net objects (as we can see from GC Heap Size in !eeheap -gc) and the large object segments are minimal so there isn’t a whole lot of large objects…

0:015> !eeheap -gc
Number of GC Heaps: 2
------------------------------
Heap 0 (000db128)
generation 0 starts at 0x33bc0c98
generation 1 starts at 0x338bc070
generation 2 starts at 0x02410038
ephemeral segment allocation context: none
 segment    begin allocated     size            reserved
00176d48 7b451688  7b467f9c 0x00016914(92,436) 00003000
000e96e0 7a721784  7a74248c 0x00020d08(134,408) 00003000
000e1ee0 790d6358  790f5800 0x0001f4a8(128,168) 00003000
02410000 02410038  0640f130 0x03fff0f8(67,105,016) 01fdf000
17d80000 17d80038  1bd69da8 0x03fe9d70(67,018,096) 0000e000
2a070000 2a070038  2e063b38 0x03ff3b00(67,058,432) 00009000
321d0000 321d0038  33caad50 0x01adad18(28,159,256) 01cb6000
Large object heap starts at 0x0a410038
 segment    begin allocated     size            reserved
0a410000 0a410038  0a4234f0 0x000134b8(79,032) 01fdf000
Heap Size  0xdb215fc(229,774,844)
------------------------------
Heap 1 (000dc138)
generation 0 starts at 0x38146e28
generation 1 starts at 0x37e5033c
generation 2 starts at 0x06410038
ephemeral segment allocation context: none
 segment    begin allocated     size            reserved
06410000 06410038  0a40ee00 0x03ffedc8(67,104,200) 01fdf000
1bd80000 1bd80038  1fce8e6c 0x03f68e34(66,489,908) 00077000
26070000 26070038  2a05b85c 0x03feb824(67,024,932) 03ff7000
36510000 36510038  38a8cc68 0x0257cc30(39,308,336) 01a7c000
Large object heap starts at 0x0c410038
 segment    begin allocated     size            reserved
0c410000 0c410038  0c410048 0x00000010(16) 01fff000
Heap Size  0xe4d0060(239,927,392)
------------------------------
GC Heap Size  0x1bff165c(469,702,236)

If we dump out the managed heaps with !dumpheap -stat the most memory consuming objects are shown below.

...
0ec99d1c     38,575       617,200 Oracle.DataAccess.Client.OracleParameterCollection
68a35670     11,135       623,560 System.Web.UI.WebControls.HiddenField
0ec693f4        848       630,912 ASP.aspx_mypage_aspx
7a742f30     37,751       755,020 System.ComponentModel.EventHandlerList+ListEntry
68a652e8     73,493       881,916 System.Web.UI.WebControls.BorderStyle
7a753288     55,789       892,624 System.Collections.Specialized.NameObjectCollectionBase+NameObjectEntry
653c5b14     15,019       961,216 System.Data.DataRow
0ec96d64     34,834       975,352 Oracle.DataAccess.Client.MetaData
79124670      3,936       985,932 System.Char[]
68a2c690     14,989     1,019,252 System.Web.UI.HtmlControls.HtmlTableRow
653c3e94      7,703     1,140,044 System.Data.DataColumn
653d61c0      1,137     1,177,932 System.Data.RBTree`1+Node[[System.Data.DataRow, System.Data]][]
0ec9bd4c     36,872     1,179,904 Oracle.DataAccess.Client.OracleRowUpdatingEventHandler
10fa17dc     13,293     1,329,300 MyControls.MyCheckBox
791052bc    117,882     1,414,584 System.Int16
68a37884     17,827     1,426,160 System.Web.UI.WebControls.Label
79104f64    151,695     1,820,340 System.Boolean
68a3e7a0    155,042     1,860,504 System.Web.UI.WebControls.FontInfo
0ec634fc     25,512     2,040,960 MyControls.MyTextBox
68a7d910     26,604     2,128,320 System.Web.UI.WebControls.Button
68a2f704    105,846     2,540,304 System.Web.UI.CssStyleCollection
0ec96c84     98,136     2,747,808 Oracle.DataAccess.Client.ColMetaRef
68a2f694    202,158     3,234,528 System.Web.UI.AttributeCollection
79103b6c     75,803     3,335,332 System.Threading.ReaderWriterLock
68a2f7fc     94,486     3,401,496 System.Web.UI.ControlCollection
0ec9ae14     36,872     3,687,200 Oracle.DataAccess.Client.OracleDataAdapter
68a6423c    150,531     4,816,992 System.Web.UI.WebControls.FontUnit
68a2c918     85,276     5,457,664 System.Web.UI.HtmlControls.HtmlTableCell
0ec6322c     69,949     5,595,920 MyControls.MyLabel
0ec9b404     36,872     5,604,544 Oracle.DataAccess.Client.OracleCommandBuilder
68a31bdc    476,216     5,714,592 System.Web.UI.IndexedString
68a884a0     96,876     5,812,560 System.Web.UI.LiteralControl
68a7ca28    365,718     5,851,488 System.Web.UI.Pair
7ae75d7c    252,629     6,063,096 System.Drawing.Color
68a85ea0    392,440     6,279,040 System.Web.UI.StateBag
0ec9975c     40,043     6,406,880 Oracle.DataAccess.Client.OracleCommand
68a7745c    159,338     7,010,872 System.Web.UI.WebControls.Style
790fea70    139,382     7,805,392 System.Collections.Hashtable
68a92e2c    177,707     7,819,108 System.Web.UI.Control+OccasionalFields
68a86004    331,329     7,951,896 System.Web.UI.WebControls.Unit
790fed1c    698,227     8,378,724 System.Int32
791036b0    401,277     9,630,648 System.Collections.ArrayList
000daa58        296    10,030,596      Free
7a747bac    378,430    10,596,040 System.Collections.Specialized.ListDictionary
7a747ad4    608,586    12,171,720 System.Collections.Specialized.HybridDictionary
7a747c78    837,066    16,741,320 System.Collections.Specialized.ListDictionary+DictionaryNode
79124418      4,555    17,726,104 System.Byte[]
0ec9a664    165,910    17,918,280 Oracle.DataAccess.Client.OracleParameter
68a131b4  1,758,380    28,134,080 System.Web.UI.StateItem
79124228    705,459    31,266,188 System.Object[]
791242ec    142,053    43,835,496 System.Collections.Hashtable+bucket[]
790fa3e0  1,032,738   116,787,020 System.String
Total 11,377,657 objects, Total size: 469,685,264

So here, most of the memory is going to either data related items, UI related items or generic objects. In short, let’s say we got rid of all the ASP.aspx_mypage_aspx objects for example, then it is very likely that we would get rid of most of the UI related items. And if we got rid of all the DataSets in the dump we would probably get rid of all or most of the data related items.

If I focus on the aspx pages and dump them out, and then !gcroot a few of those aspx pages to see where they are rooted we get this.

0:000> !dumpheap -mt 0ec693f4
Using our cache to search the heap.
   Address         MT     Size  Gen
0251f0ec 0ec693f4      744    2 ASP.aspx_mypage_aspx
02555c90 0ec693f4      744    2 ASP.aspx_mypage_aspx
0259cfc0 0ec693f4      744    2 ASP.aspx_mypage_aspx
025d3ca0 0ec693f4      744    2 ASP.aspx_mypage_aspx
0260a85c 0ec693f4      744    2 ASP.aspx_mypage_aspx
02641294 0ec693f4      744    2 ASP.aspx_mypage_aspx
02677f44 0ec693f4      744    2 ASP.aspx_mypage_aspx
026aeaf4 0ec693f4      744    2 ASP.aspx_mypage_aspx
0270622c 0ec693f4      744    2 ASP.aspx_mypage_aspx
...
0:015> !gcroot 0251f0ec
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 15 OSTHread a28
Scan Thread 19 OSTHread 43c
Scan Thread 20 OSTHread b94
Scan Thread 21 OSTHread 15c
Scan Thread 13 OSTHread ce8
Scan Thread 22 OSTHread 750
Scan Thread 5 OSTHread d3c
Scan Thread 24 OSTHread 234
Scan Thread 7 OSTHread dc0
Scan Thread 25 OSTHread c28
Scan Thread 26 OSTHread 528
Scan Thread 8 OSTHread aec
Scan Thread 6 OSTHread 754
Scan Thread 27 OSTHread bc8
Scan Thread 28 OSTHread fb4
Scan Thread 29 OSTHread 20c
Scan Thread 31 OSTHread 88c
Finalizer queue:Root:  0251f0ec(ASP.aspx_mypage_aspx)
Finalizer queue:Root:  02523b0c(MyControls.MyTextBox)->
  02523aa8(MyControls.MyWebTextBox)->
  0251f0ec(ASP.aspx_mypage_aspx)
Finalizer queue:Root:  025243e0(MyControls.MyClientEvents)->
  02523aa8(MyControls.MyWebTextBox)->
  0251f0ec(ASP.aspx_mypage_aspx)
Finalizer queue:Root:  02524d88(MyControls.MyTextBox)->
  02524d24(MyControls.MyWebTextBox)->
  0251f0ec(ASP.aspx_mypage_aspx)
Finalizer queue:Root:  0252565c(MyControls.MyClientEvents)->
  02524d24(MyControls.MyWebTextBox)->
  0251f0ec(ASP.aspx_mypage_aspx)

The page is rooted in the finalizer queue, both by itself and as a member variable to a MyWebTextBox object (probably the _parent member variable).

This means two things

  1. The finalizer is probably blocked, otherwise items usually won’t be rooted in the Finalizer queue long enough for us to catch it in a dump
  2. The ASP.aspx_mypage_aspx, the MyTextBox and the MyClientEvents classes all have finalizers or derive from classes that have finalizers which seems very odd and is probably unnecessary. This article is a good resource for understanding how the finalizer/dispose pattern should be implemented, and this blog post explains why unnecessary finalizers are bad (under the topic “What can make us spend a lot of our time in GC?”)

If the finalizer is blocked no objects can be finalized which will lead to rapid memory growth, so a blocked finalizer will most definitely explain our problem. The reason these objects (UI and Data objects) don’t go away then is of course because they would have to be finalized to be garbage collected.

So, let’s have a look at the finalizer thread

0:015> !threads
ThreadCount: 27
UnstartedThread: 0
BackgroundThread: 17
PendingThread: 0
DeadThread: 10
Hosted Runtime: no
                                      PreEmptive   GC Alloc           Lock
       ID OSID ThreadOBJ    State     GC       Context       Domain   Count APT Exception
  15    1  a28 000d75f0   1808220 Enabled  3823b58c:3823bb08 000d3fe8     0 STA (Threadpool Worker)
  19    2  43c 000dd5f0      b220 Enabled  00000000:00000000 000d3fe8     0 MTA (Finalizer)
  20    3  b94 000f20b0    80a220 Enabled  00000000:00000000 000d3fe8     0 MTA (Threadpool Completion Port)
  21    4  15c 000f5318      1220 Enabled  00000000:00000000 000d3fe8     0 Ukn
  13    5  ce8 0014c3f0   880a220 Enabled  00000000:00000000 000d3fe8     0 MTA (Threadpool Completion Port)
  22    6  750 00122f70   880b220 Enabled  00000000:00000000 000d3fe8     0 MTA (Threadpool Completion Port)
   5    7  d3c 00160890       220 Enabled  00000000:00000000 000d3fe8     0 Ukn
  24    8  234 0f8a90d8   180b220 Enabled  3816cea0:3816ee28 000d3fe8     0 MTA (Threadpool Worker)
   7    a  dc0 0f8ae008       220 Enabled  00000000:00000000 000d3fe8     0 Ukn
XXXX    b    0 0015d0c8   1801820 Enabled  00000000:00000000 000d3fe8     0 Ukn (Threadpool Worker)
XXXX    c    0 115a0f68   1801820 Enabled  00000000:00000000 000d3fe8     0 Ukn (Threadpool Worker)
XXXX    d    0 11dadf88   1801820 Enabled  00000000:00000000 000d3fe8     0 Ukn (Threadpool Worker)
XXXX    e    0 13168018   1801820 Enabled  00000000:00000000 000d3fe8     0 Ukn (Threadpool Worker)
  25    f  c28 1441b018   180b220 Enabled  3823bde4:3823db08 000d3fe8     0 MTA (Threadpool Worker)
XXXX   10    0 1352b3b8   1801820 Enabled  00000000:00000000 000d3fe8     0 Ukn (Threadpool Worker)
  26   11  528 1433d008       220 Enabled  00000000:00000000 000d3fe8     0 Ukn
   8   12  aec 14416bc0       220 Enabled  00000000:00000000 000d3fe8     0 Ukn
   6   13  754 1444b828       220 Enabled  00000000:00000000 000d3fe8     0 Ukn
  27   14  bc8 14526248       220 Enabled  00000000:00000000 000d3fe8     0 Ukn
  28   15  fb4 145cb220       220 Enabled  00000000:00000000 000d3fe8     0 Ukn
XXXX   16    0 1461a008   1801820 Enabled  00000000:00000000 000d3fe8     0 Ukn (Threadpool Worker)
  29   17  20c 147abac8       220 Enabled  00000000:00000000 000d3fe8     0 Ukn
XXXX   18    0 164a02e8   1801820 Enabled  00000000:00000000 000d3fe8     0 Ukn (Threadpool Worker)
XXXX   19    0 230be290   1801820 Enabled  00000000:00000000 000d3fe8     0 Ukn (Threadpool Worker)
XXXX    9    0 0f8a99a8   1801820 Enabled  00000000:00000000 000d3fe8     0 Ukn (Threadpool Worker)
XXXX   1a    0 227c2f18   1801820 Enabled  00000000:00000000 000d3fe8     0 Ukn (Threadpool Worker)
  31   1b  88c 227d1c98       220 Enabled  00000000:00000000 000d3fe8     0 Ukn
0:015> ~19s
eax=01eff518 ebx=00099cd0 ecx=01eff518 edx=7743345e esi=00000740 edi=00000000
eip=7c97ed54 esp=01eff254 ebp=01eff2c4 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCallRet:
7c97ed54 c3              ret

0:019> kb 2000
ChildEBP RetAddr
01eff250 7c972124 ntdll!KiFastSystemCallRet
01eff254 7c82baa8 ntdll!NtWaitForSingleObject+0xc
01eff2c4 7c82ba12 kernel32!WaitForSingleObjectEx+0xac
01eff2d8 774854ef kernel32!WaitForSingleObject+0x12
01eff2f4 77549905 ole32!GetToSTA+0x6f
01eff314 77547ed7 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xcb
01eff3f4 77455349 ole32!CRpcChannelBuffer::SendReceive2+0xc1
01eff460 77484eee ole32!CAptRpcChnl::SendReceive+0xab
01eff4b4 77c8127e ole32!CCtxComChnl::SendReceive+0x91
01eff4d0 77c813ca rpcrt4!NdrProxySendReceive+0x43
01eff8b8 77c811bd rpcrt4!NdrClientCall2+0x206
01eff8d8 77bf3a12 rpcrt4!ObjectStublessClient+0x8b
01eff8e8 77484c23 rpcrt4!ObjectStubless+0xf
01eff970 774d28f3 ole32!CObjectContext::InternalContextCallback+0x126
01eff9c0 79f2340f ole32!CObjectContext::ContextCallback+0x85
01effa0c 79f23362 mscorwks!CtxEntry::EnterContextOle32BugAware+0x2b
01effb2c 79f231ef mscorwks!CtxEntry::EnterContext+0x2db
01effb60 7a0426dc mscorwks!RCWCleanupList::ReleaseRCWListInCorrectCtx+0xc4
01effbb0 79f23058 mscorwks!RCWCleanupList::CleanupAllWrappers+0x77
01effbf4 79f75f2d mscorwks!SyncBlockCache::CleanupSyncBlocks+0xec
01effdb8 79f34ff2 mscorwks!Thread::DoExtraWorkForFinalizer+0x40
01effdc8 79ecb4a4 mscorwks!SVR::GCHeap::FinalizerThreadWorker+0xc6
01effdd8 79ecb442 mscorwks!ManagedThreadBase_DispatchInner+0x4d
01effe6c 79ecb364 mscorwks!ManagedThreadBase_DispatchMiddle+0xb0
01effea8 79ed5e8b mscorwks!ManagedThreadBase_DispatchOuter+0x6d
01effed0 79ed5e56 mscorwks!ManagedThreadBase_NoADTransition+0x32
01effedc 79f6fd87 mscorwks!ManagedThreadBase::FinalizerBase+0xb
01efff14 79ecb00b mscorwks!SVR::GCHeap::FinalizerThreadStart+0xbb
01efffb8 7c826063 mscorwks!Thread::intermediateThreadProc+0x49
01efffec 00000000 kernel32!BaseThreadStart+0x34

The finalizer thread (which we found from the !threads output) is definitely doing something and based on what we know about objects being rooted in the Finalizer queue it has probably been stuck here for a while.

The finalizer is not necessarily finalizing a managed object, instead it is cleaning up all com wrappers, and to release them it has to get on an STA thread.

If we look at the !threads output again, specifically at the APT (Apartment) column we can see that our only STA thread is thread 15 so that is likely where it is trying to go.

0:019> ~15s
eax=2d220b44 ebx=000d75f0 ecx=2d220ca4 edx=000b4883 esi=0000022c edi=00000000
eip=7c97ed54 esp=01b1fc70 ebp=01b1fce0 iopl=0         nv up ei ng nz ac pe cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000297
ntdll!KiFastSystemCallRet:
7c97ed54 c3              ret
0:015> kbL 2000
ChildEBP RetAddr  Args to Child
01b1fc6c 7c972124 7c82baa8 0000022c 00000000 ntdll!KiFastSystemCallRet
01b1fc70 7c82baa8 0000022c 00000000 01b1fcb4 ntdll!NtWaitForSingleObject+0xc
01b1fce0 79e77fd1 0000022c 00009c40 00000000 kernel32!WaitForSingleObjectEx+0xac
01b1fd24 79e77f9a 0000022c 00009c40 00000000 mscorwks!CLREventWaitHelper+0x2f
01b1fd74 79e77f50 00009c40 00000000 00000000 mscorwks!CLREvent::WaitEx+0x117
01b1fd84 79f5b69c 00009c40 00000000 00000000 mscorwks!CLREvent::Wait+0x17
01b1fe04 7a112052 000bf560 00009c40 00000000 mscorwks!ThreadpoolMgr::SafeWait+0x73
01b1fe2c 7a112206 eeb88f9d 00000000 7a1120b5 mscorwks!ThreadpoolMgr::EnterRetirement+0x8e
01b1fe94 79f71123 00000000 00000000 00000000 mscorwks!ThreadpoolMgr::WorkerThreadStart+0x360
01b1ffb8 7c826063 000bcbf0 00000000 00000000 mscorwks!ThreadpoolMgr::intermediateThreadProc+0x49
01b1ffec 00000000 79f710dd 000bcbf0 00000000 kernel32!BaseThreadStart+0x34

Thread 15 however is just a regular .net worker thread waiting for work, and as such it is not pumping messages so thread 19 (finalizer) will be indefinitely blocked in GetToSTA.

The symptom is very very similar to that explained in this KB Article and in this kb the reason for the issue is that the main thread is configured with the attribute [STAThread] and the suggested solution is to change this to [MTAThread] or to manually pump messages using Thread.CurrentThread.Join(100), neither of which really apply in an ASP.NET application.

The problem in ASP.NET is that the first CLR threadpool worker thread is never initialized for COM+ so if someone instantiates a component there that calls CoInitialize, the thread can become an STA thread and cause this type of blocking.

Solution

A hotfix was recently released which changes this behavior, so if you run into this issue you can install that. If you also have a lot of unnecessary finalizers like in this particular case, I would recommend that you take care of them since it causes a lot of unnecessary memory usage and higher CPU in GC because of the amount of collections that have to be done to release them and the amount of data the GC has to work through.

Laters all,

Tess