windbg - Help me analyze a dump file


Customers are reporting problems every day, at the same hours. The application is running on two nodes. It's the Metastorm BPM platform, and it's calling our code.

In the dumps I noticed long-running threads (~50 minutes), but not in all of them. The administrators tell me that before users report problems, memory usage goes up; then the platform slows down to the point where users can't work, and the admins have to restart it on both nodes.

My first thought was deadlocks (because of the long-running threads), but I didn't manage to confirm that: !syncblk isn't returning anything. Then I looked at memory usage. I noticed a lot of dynamic assemblies and thought maybe there was an assembly leak, but it looks like it's not that: I received a dump from a day when everything was working fine, and the number of dynamic assemblies was similar. Then I thought maybe it's a memory leak, but I cannot confirm that either: !dumpheap -stat shows that memory usage grows, but I haven't found anything interesting with !gcroot.

There is one thing I don't understand: the threadpool completion port threads. There are a lot of them. Maybe something is waiting on something? Below is as much data as I can fit in the post. What do you suggest to diagnose this situation?
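For reference, a minimal sketch of the SOS triage sequence behind the checks above (assuming a .NET 2.0/3.5 x86 process, which matches the mscorwks-era method table addresses in the data below; the !gcroot address is a placeholder):

    $$ load SOS matching the CLR version captured in the dump
    .loadby sos mscorwks
    $$ worker / completion port thread counts and CPU utilization
    !threadpool
    $$ owned sync blocks; empty output means no managed lock is held
    !syncblk
    $$ per-type object counts and total sizes on the GC heaps
    !dumpheap -stat
    $$ list instances of a suspicious type by its method table
    !dumpheap -mt 0x79330a00
    $$ walk the roots keeping one of those instances alive
    !gcroot <object_address>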

Users not reporting problems:

                        node1       node2
    size of dump:       638 MB      646 MB
    dynamic assemblies: 259         265
    GC heaps:           37 MB       35 MB
    loader heaps:       11 MB       11 MB

node1:
Number of Timers: 12
CPU utilization: 2%
Worker Thread: Total: 5 Running: 0 Idle: 5 MaxLimit: 2000 MinLimit: 200
Completion Port Thread: Total: 2 Free: 2 MaxFree: 16 CurrentLimit: 4 MaxLimit: 1000 MinLimit: 8

!dumpheap -stat (biggest):
0x793041d0   32,664    2,563,292 System.Object[]
0x79332b9c   23,072    3,485,624 System.Int32[]
0x79330a00   46,823    3,530,664 System.String
0x79333470   22,549    4,049,536 System.Byte[]

node2:
Number of Timers: 12
CPU utilization: 0%
Worker Thread: Total: 7 Running: 0 Idle: 7 MaxLimit: 2000 MinLimit: 200
Completion Port Thread: Total: 3 Free: 1 MaxFree: 16 CurrentLimit: 5 MaxLimit: 1000 MinLimit: 8

!dumpheap -stat:
0x793041d0   30,678    2,537,272 System.Object[]
0x79332b9c   21,589    3,298,488 System.Int32[]
0x79333470   21,825    3,680,000 System.Byte[]
0x79330a00   46,938    5,446,576 System.String

----------------------------------------------------------------

Users start to report problems:

                        node1       node2
    size of dump:       662 MB      655 MB
    dynamic assemblies: 236         235
    GC heaps:           159 MB      113 MB
    loader heaps:       10 MB       10 MB

node1:
Work Request in Queue: 0
Number of Timers: 14
CPU utilization: 20%
Worker Thread: Total: 7 Running: 0 Idle: 7 MaxLimit: 2000 MinLimit: 200
Completion Port Thread: Total: 48 Free: 1 MaxFree: 16 CurrentLimit: 49 MaxLimit: 1000 MinLimit: 8

!dumpheap -stat:
0x7932a208   88,974    3,914,856 System.Threading.ReaderWriterLock
0x79333054   71,397    3,998,232 System.Collections.Hashtable
0x24f70350  319,053    5,104,848 Our.Class
0x79332b9c   53,190    6,821,588 System.Int32[]
0x79333470   52,693    6,883,120 System.Byte[]
0x79333150   72,900   11,081,328 System.Collections.Hashtable+bucket[]
0x793041d0  247,011   26,229,980 System.Object[]
0x79330a00  644,807   34,144,396 System.String

node2:
Work Request in Queue: 1
Number of Timers: 17
CPU utilization: 17%
Worker Thread: Total: 6 Running: 0 Idle: 6 MaxLimit: 2000 MinLimit: 200
Completion Port Thread: Total: 48 Free: 2 MaxFree: 16 CurrentLimit: 49 MaxLimit: 1000 MinLimit: 8

!dumpheap -stat:
0x7932a208   76,425    3,362,700 System.Threading.ReaderWriterLock
0x79332b9c   42,417    5,695,492 System.Int32[]
0x79333150   41,172    6,451,368 System.Collections.Hashtable+bucket[]
0x79333470   44,052    6,792,004 System.Byte[]
0x793041d0  175,973   18,573,780 System.Object[]
0x79330a00  397,361   21,489,204 System.String
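Given the jump from 2-3 to 48 completion port threads, dumping every thread's stack is probably the quickest way to see what they are all blocked on. A sketch of the usual commands (all standard WinDbg/SOS):

    $$ map managed threads to debugger thread numbers
    !threads
    $$ managed call stack for every thread in the process
    ~*e !clrstack
    $$ native stacks too, useful when threads sit in winsock or COM waits
    ~*e kb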

Edit: I downloaded DebugDiag and let it analyze the dumps. Here is part of the output:

The following threads in process_name in name_of_dump.dmp are making a COM call to thread 193 within the same process, which in turn is waiting on data to be returned from the server via WinSock. The call to WinSock originated from 0x0107b03b and is destined for port xxxx at IP address xxx.xxx.xxx.xxx

( 18 76 172 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 210 211 212 213 214 215 216 217 218 224 225 226 227 228 229 231 232 233 236 239 )

14.79% of threads blocked

And the recommendation is:

Several threads making calls to the same STA thread can cause a performance bottleneck due to serialization. Server-side COM servers are recommended to be thread-aware and follow MTA guidelines when multiple threads are sharing the same object instance.

I checked with WinDbg what thread 193 is doing. It is calling our code, and our code is calling the Metastorm engine code, where it hangs on a remoting call. !runaway shows it has been hanging for 8 seconds, which is not that long. I also checked the waiting threads. All of them, except thread 18, are at:

System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
I understand this one, but why are there so many of them? Is this specific to the business process modeling engine we're using, or is it typical? I guess it's taking up threads that could be used by other clients, and that's why users report a slowdown. Are these the completion port threads I asked about before? Is there anything more I can do to diagnose this, or have I already found that our code is the cause?
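For reference, a sketch of the commands for that per-thread check (the thread number comes from the DebugDiag report; !dso is shorthand for !DumpStackObjects):

    $$ switch to thread 193 and inspect it
    ~193s
    $$ managed stack with method arguments
    !clrstack -p
    $$ managed objects referenced from this thread's stack
    !dso
    $$ native frames underneath the remoting call
    kb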

From the looks of the output, the memory is not on the .NET heaps (only ~35 MB out of ~650 MB), so if you are looking at the .NET heaps I think you are looking in the wrong place. The memory is either in assemblies or in native memory (if you are using some native component for file transfers or similar), and you would want to use DebugDiag to monitor that.
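A quick way to confirm that the memory sits outside the GC heaps is to compare the full process address space with the CLR's own heaps (a sketch; all three are standard WinDbg/SOS commands):

    $$ whole-process memory breakdown: images, native heaps, virtual allocations
    !address -summary
    $$ managed GC heap sizes (should match the ~35 MB figure above)
    !eeheap -gc
    $$ loader heaps, where normal and dynamic assemblies live
    !eeheap -loader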

It is hard to tell whether you are leaking dynamic assemblies without looking at the pattern of growth, so I would suggest you watch the Current Assemblies counter in perfmon and see if it keeps growing over time. If it does, you will have to investigate further by looking at the dynamic assemblies with !dda.
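That counter lives under the .NET CLR Loading performance category; for example, it can be logged from the command line with typeperf (a sketch, sampling every 60 seconds; the process instance name is a placeholder):

    typeperf "\.NET CLR Loading(process_name)\Current Assemblies" -si 60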

