I'm trying to debug an issue that has been affecting our ASP.NET hosts for some time. I've just managed to catch the problem in the wild and capture a memory dump.
This problem is active and persistent, it is negatively impacting our uptime, and it seems to be caused by DogStatsD.
The symptom we see is runaway thread usage on our ASP.NET hosts.
Our apps run as "App Service" on Azure, on P2v3 hosts (4 vCPU, 16 GB RAM). We run on Windows, in Azure North Europe.
<TargetFrameworkVersion>v4.6.2</TargetFrameworkVersion>
<PackageReference Include="DogStatsD-CSharp-Client" Version="8.0.0" />
<PackageReference Include="Datadog.Trace" Version="3.3.1" />
We are hosted on https://us3.datadoghq.com/
The host I caught it on has a steady state of ~100 threads in use, as measured in Datadog by the azure.app_services.thread_count metric. In this case the growth started at 8:30am and we had 1.5k threads in use by 1:30pm.
At 9am we had 248 threads and at 1pm we had 1.41k threads, a growth of 1,162 threads over 4 hours. That is about 290 new threads per hour, 4.8 per minute, or one thread created roughly every 12 seconds, if my maths is correct.
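To double-check the rate, here is the same arithmetic as a small Python snippet. The inputs are the values read off the graph (1.41k is taken as 1,410, so the result is approximate), and the variable names are just for illustration.

```python
# Values read off the azure.app_services.thread_count graph in this report.
threads_at_0900 = 248
threads_at_1300 = 1410  # shown as "1.41k" on the graph, so approximate
elapsed_hours = 4

growth = threads_at_1300 - threads_at_0900   # new threads over the window
per_hour = growth / elapsed_hours            # new threads per hour
per_minute = per_hour / 60                   # new threads per minute
seconds_per_thread = 60 / per_minute         # seconds between new threads

print(f"growth: {growth} threads")
print(f"rate: {per_hour:.1f}/hour, {per_minute:.2f}/minute")
print(f"one new thread every {seconds_per_thread:.1f} s")
```

So the figures above hold up: the growth is steady at roughly one extra thread every 12 seconds.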
The callstack that we see on the threads is:
NtDelayExecution() Unknown
RtlDelayExecution() Unknown
SleepEx() Unknown
[Managed to Native Transition]
System.Threading.SpinWait.SpinOnce() Unknown
System.Threading.SpinLock.ContinueTryEnterWithThreadTracking(int millisecondsTimeout = 0xffffffff, uint startTime = 0x00000000, ref bool lockTaken = false) Unknown
StatsdClient.Transport.NamedPipeTransport.Send(byte[] buffer = {byte[0x0000014d]}, int length = 0x0000014d) Unknown
StatsdClient.Telemetry.SendMetricWithTags(string metricName, string[] tags, int value) Unknown
StatsdClient.Telemetry.Flush() Unknown
System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) Unknown
System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) Unknown
System.Threading.TimerQueueTimer.CallCallback() Unknown
System.Threading.TimerQueueTimer.Fire() Unknown
System.Threading.TimerQueue.FireNextTimers() Unknown
[Native to Managed Transition]
> BaseThreadInitThunk() Unknown
RtlUserThreadStart() Unknown
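My reading of this stack is that a timer callback (Telemetry.Flush) is waiting with no timeout (millisecondsTimeout = 0xffffffff) for a SpinLock inside NamedPipeTransport.Send. If that lock is never released, I would expect every timer tick to park one more thread. That is an assumption on my part and not something I have confirmed in the DogStatsD source. The Python sketch below only models that suspected pattern; `send_lock`, `flush` and `simulate` are hypothetical names, not the library's code.

```python
import threading
import time

# Stands in for the lock taken inside Send(); an assumption, not library code.
send_lock = threading.Lock()

def flush():
    # Hypothetical timer callback: it must take the lock before sending.
    with send_lock:
        pass

def simulate(ticks):
    """Fire `ticks` timer callbacks while the lock is held by someone else."""
    send_lock.acquire()              # model a holder that never lets go
    workers = []
    for _ in range(ticks):           # each tick runs the callback on a new thread
        t = threading.Thread(target=flush, daemon=True)
        t.start()
        workers.append(t)
    time.sleep(0.1)                  # give every callback time to reach the lock
    stuck = sum(t.is_alive() for t in workers)
    send_lock.release()              # release so the demo can exit cleanly
    for t in workers:
        t.join()
    return stuck

stuck_threads = simulate(5)
print(stuck_threads)
```

In this model the number of blocked threads equals the number of ticks that fired while the lock was held, which would match the steady, linear growth we see.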
Do you have any idea what could be causing the thread growth or how we can prevent it?