4 minute read

I had an interesting case today where an asp.net app got stuck in a true deadlock. In other words two threads were both waiting for resources that the other thread owned.

The scenario in these cases usually goes something like this:

  • FunctionA (locks on ResourceA) and calls FunctionB where it needs ResourceB
  • FunctionB (locks on ResourceB) and calls FunctionA where it needs ResourceA

And is typically pretty easy to spot, understand and fix… but, in this particular case things were a bit different.

We had gotten memory dumps of the process and after loading them up in windbg, loading sos.dll, and running ~* e !clrstack to see what the threads were working on, we found that a lot of the threads were sitting in System.Threading.Monitor.Enter like below, indicating that they were waiting on a lock…

0:015> !clrstack
OS Thread Id: 0x19c0 (15)
ESP       EIP
0240f1a0 7d61d06f [GCFrame: 0240f1a0]
0240f2dc 7d61d06f [HelperMethodFrame: 0240f2dc] System.Threading.Monitor.Enter(System.Object)
0240f330 10250ad1 ClassA.ThisFunctionLocksOnStringA()
0240f35c 10250a8e _Default.Page_Load(System.Object, System.EventArgs)
0240f364 66f2a7ff System.Web.Util.CalliHelper.EventArgFunctionCaller(IntPtr, System.Object, System.Object, System.EventArgs)
0240f374 660b2fb4 System.Web.Util.CalliEventHandlerDelegateProxy.Callback(System.Object, System.EventArgs)
0240f388 660ac4b4 System.Web.UI.Control.OnLoad(System.EventArgs)
0240f39c 660ac4f3 System.Web.UI.Control.LoadRecursive()
0240f3b4 660a85a4 System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean)
0240f50c 660a81d4 System.Web.UI.Page.ProcessRequest(Boolean, Boolean)
0240f544 660a8101 System.Web.UI.Page.ProcessRequest()
0240f57c 660a8096 System.Web.UI.Page.ProcessRequestWithNoAssert(System.Web.HttpContext)
0240f588 660a8072 System.Web.UI.Page.ProcessRequest(System.Web.HttpContext)
0240f59c 102508a5 ASP.default_aspx.ProcessRequest(System.Web.HttpContext)
0240f5a0 660ae546 System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
0240f5d4 66081a7c System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
0240f614 6608cd13 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)
0240f664 66080ffc System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
0240f680 6608456c System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
0240f6b4 66084213 System.Web.HttpRuntime.ProcessRequestNoDemand(System.Web.HttpWorkerRequest)
0240f6c4 660833ac System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
0240f8d8 79f68c4e [ContextTransitionFrame: 0240f8d8]
0240f90c 79f68c4e [GCFrame: 0240f90c]
0240fa68 79f68c4e [ComMethodFrame: 0240fa68]

Note: I have given the methods in ClassA (and later on ClassB and ClassC) names that tell us what they are locking on so that it will be easier to follow along. If this were a real scenario you would obviously have to look at the code for ClassA.ThisFunctionLocksOnStringA() to see what it was locking on :)

Finding the deadlock

I loaded up SOSEX.dll and ran !dlk which looks for deadlocks.

0:026> !dlk
Deadlock detected:
CLR thread 26 holds sync block 001e10e4 OBJ:06e401d0[System.String]
             waits sync block 001e12c4 OBJ:06ec6618[System.String] STRVAL=LOCKA
CLR thread 55 holds sync block 001e12c4 OBJ:06ec6618[System.String] STRVAL=LOCKA
             waits sync block 001e10e4 OBJ:06e401d0[System.String]
CLR Thread 26 is waiting at ClassA.ThisFunctionAlsoLocksOnStringA()+0x6(IL)
CLR Thread 55 is waiting at ClassB.ThisFunctionLocksOnStringB()+0x6(IL)

1 deadlock detected.

This tells us that we have a deadlock between CLR Thread 26 and CLR Thread 55, where they both own locks on resources that the other thread is waiting for, so now we know that we have a deadlock and which threads are involved.

!dlk is really great for looking at deadlocks but it has one little snag and that is that it reports the CLR thread numbers which is different than the logical thread IDs that we are used to. However, we can “easily” map them to the logical thread IDs if we use the !threads output. The 2nd column in !threads shows the CLR thread ID in hex.

CLR Thread 26 (0x1a) = logical thread 44
CLR Thread 55 (0x37) = logical thread 73

0:026> !threads
ThreadCount: 65
UnstartedThread: 0
BackgroundThread: 65
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
                                      PreEmptive   GC Alloc           Lock
       ID OSID ThreadOBJ    State     GC       Context       Domain   Count APT Exception
...
  44   1a 1b58 0ee85eb0   380b220 Enabled  00000000:00000000 001e6608     2 MTA (Threadpool Worker)
...
  73   37 1610 0eea4bc0   380b220 Enabled  00000000:00000000 001e6608     2 MTA (Threadpool Worker)
...

We could also have avoided doing the math here by dumping out the active locks with !sos.syncblk

0:026> !sos.syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info  SyncBlock Owner
   22 001e10e4           41         1 0ee85eb0  1b58  44   06e401d0 System.String
   32 001e12c4           41         1 0eea4bc0  1610  73   06ec6618 System.String
-----------------------------
Total           35
CCW             2
RCW             0
ComClassFactory 0
Free            0

Syncblk here tells us that thread 44 owns the lock to syncblock 06e401d0 (so t44 is CLR Thread 26) and thread 73 owns the lock to syncblock 06ec6618 (so t73 is CLR Thread 55 in the !dlk output).

Ok, getting back off that tangent again :), now we know that 44 is waiting on 73 and 73 is waiting on 44, but why?

The stacks for thread 44 and thread 73 look like this

0:044> !clrstack
OS Thread Id: 0x1b58 (44)
ESP       EIP
116ee5f0 7d61d06f [GCFrame: 116ee5f0]
116ee72c 7d61d06f [HelperMethodFrame: 116ee72c] System.Threading.Monitor.Enter(System.Object)
116ee780 10250757 ClassA.ThisFunctionAlsoLocksOnStringA()
116ee7b0 102506bf ClassC.ThisFunctionLocksOnStringC()
116ee7dc 1025064e Default2.Page_Load(System.Object, System.EventArgs)
...
0:073> !clrstack
OS Thread Id: 0x1610 (73)
ESP       EIP
1287e770 7d61d06f [GCFrame: 1287e770]
1287e8ac 7d61d06f [HelperMethodFrame: 1287e8ac] System.Threading.Monitor.Enter(System.Object)
1287e900 10250b6f ClassB.ThisFunctionLocksOnStringB()
1287e930 10250ad7 ClassA.ThisFunctionLocksOnStringA()
1287e95c 10250a8e _Default.Page_Load(System.Object, System.EventArgs)
...

44 is waiting for the lock on StringA (where StringA=”LockA”) which 73 took in ClassA.ThisFunctionLocksOnStringA() before calling into ClassB.ThisFunctionLocksOnStringB(), so that is all well and good.

The question here is why does Thread 44 hold a lock on StringB?, there is no method on the stack that locked on StringB…

Why are we locking?

ClassB and ClassC look like this (with their own individual lock objects LockB and LockC):

public class ClassB
{
    public static string LockB = "";

    public static void ThisFunctionLocksOnStringB()
    {
        lock (LockB)
        {
        //Do some stuff
        }
    }
...
}

public class ClassC
{
    public static string LockC = "";

    public static void ThisFunctionLocksOnStringC()
    {
        lock (LockC)
        {
            ClassA.ThisFunctionAlsoLocksOnStringA();
        }
    }

...
}

At first sight, LockB and LockC look like different objects, but in reality they are just pointers to the same string constant “”, so in reality this would be the same as saying lock(“”) in both cases. In fact if you did have a 3rd function saying lock(“”) this would also have to wait until the lock for LockB or LockC was released.

Summary

When using locks, best practice is to use a private static object so that you can control who/where code locks on this object.

There’s some good guidelines for locks in general in Microsoft docs and Guidance Share Wikipedia

Laters, Tess