https://www.tessferrandez.com/blog/2008/03/17/net-debugging-demos-lab-6.html
It’s been about a week since I posted the debugging challenge for this lab, things have been a bit busy lately so sorry about the tardiness on posting the lab.
I have a love-hate relationship with statistics, I like them cause you can use them to get a point across and they can help you analyse things, but at the same time I hate them because people have a tendency to rely on them blindly and use them out of context without understanding the meaning behind them. The classic example is of course the sentence “More than 98 percent of convicted felons are bread eaters”, that sentence is excellent to combat badly used statistics in a discussion :).
Since the start of the debugging demos, over 5000 people have downloaded the labs. (wow, there’s a lot of debuggers out there :)) If half of these have done the labs that means that the buggy bits site has gotten close to 30 million hits given how much we are stressing it with tinyget, that’s more than some of the biggest online newspapers in Sweden have gotten in that time… YAY!
Of course, looking at it like this, knowing the details, these statistics are completely useless, but if you didn’t have this context you could end up with something along the lines of the WTF story I’ve got the monkey now, a really funny read :)
Anyways, on to todays lab… we are going to look at the shipping details for bugspray 1000 times and watch it create the memory leak posted in the Lab 6: Debugging Challenge
Previous labs and setup instructionsPermalink
If you are new to the debugging labs, here you can find information on how to set up the labs as well as links to the previous labs in the series.
- Information and setup instructions
- Lab 1: Hang
- Lab 2: Crash
- Lab 3: Memory
- Lab 4: High CPU hang
- Lab 5: Crash
Problem descriptionPermalink
We have started getting out of memory exceptions on the buggy bits site and we have been able to determine a scenario in which we think we are leaking memory but we can’t seem to figure out where the memory is going.
The leak seems to be occurring on our ProductInfo page for example https://localhost:44350/Products/Details/Bugspray
and we can reproduce it by stress testing.
It seems like it is leaking just a small bit every time but since it is something that customers look at a lot and over time the process will crash with an out of memory exception.
Reproduce the issue and gather dataPermalink
- Start the application and browse to the product details page
-
Monitor the .net performance counters
dotnet-counters monitor -n iisexpress
-
Stress the application
.\tinyget.ps1 -url https://localhost:44350/Products/Details/Bugspray -numTimes 1000
-
After tinyget has finished, capture a memory dump
dotnet-dump collect -n iisexpress
or with procdump
procdump64.exe -ma iisexpress.exe
Review the performance counters to figure out what we are leakingPermalink
-
Compare the
Working Set
toGC Heap Size
- Do the numbers seem to match or do they diverge? Based on this, can you tell if the issue we are facing is a virtual bytes leak, a native leak or a .NET leak?
-
Look at the
Number of Assemblies Loaded
counter- Should this counter stay flat or is it ok for this counter to increase like this? What does it mean?
Debug the memory dumpPermalink
If there is a big discrepancy between Working Set
and GC Heap Size
, and they don’t seem to follow each other, we either have a native leak which means that we have a native component that is leaking (in which case debug diag would be the next step), or we have an assembly leak.
-
Open the memory dump, load up the symbols and load sos.dll (see information and setup instructions for more info)
- What is the size of the memory dump (on disk)?
-
Run
!eeheap -gc
and!dumpheap -stat
- What is the size of the .NET heap according to
!eeheap -gc
, why is it different fromGC Heap Size
?
We saw from performance monitor that we appeared to be leaking assemblies, so the next step is to determine where these assemblies are created and why we are leaking them.
- What is the size of the .NET heap according to
-
Run
!dumpdomain
to look at the assemblies loaded in the process- Which domain has most assemblies loaded? Note: this question makes more sense on a server where you have multiple sites running
- Are these dynamic assemblies or assemblies loaded from disk? (is there a path associated with them)
-
Dump the module contents using
!dumpmodule <moduleaddress>
where module address is the address given right after Module Name on one or a few of the dynamic assemblies. eg. in the example below you would run!dumpmodule 00007ffcfe532d58
Assembly: 0000025392f4b4f0 (Dynamic) [] ClassLoader: 000002539340E710 Module Name 00007ffcfe532d58 Dynamic Module
-
Run
dc <MetaDataStart> <MetaDataEnd>
to dump out the metadata for the module and find out what is implemented in this dynamic assembly. eg. in the example below you would run dc 114d09e4 114d09e4+0n4184Note: We use the start address + 0n4184 because the metadata is 4148 bytes and the 0n stands for decimal
0:000> !dumpmodule 11b7e900 Name: gyq9ceq2, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null Attributes: PEFile Assembly: 158770d0 LoaderHeap: 00000000 TypeDefToMethodTableMap: 16e2a2c4 TypeRefToMethodTableMap: 16e2a2dc MethodDefToDescMap: 16e2a33c FieldDefToDescMap: 16e2a3a8 MemberRefToDescMap: 16e2a3d4 FileReferencesMap: 16e2a4c4 AssemblyReferencesMap: 16e2a4c8 MetaData start address: 114d09e4 (4184 bytes)
Note: If your assembly does not show MetaData start address - you have to take a bit of a leap with me and find out from the raw memory where the metadata starts - the example below is for a module at the address
00007ffcfe532d58
0:000> dp 00007ffcfe532d58 00007ffc`fe532d58 00007ffd`5c570f88 00000253`93187ffa 00007ffc`fe532d68 00000253`92f4b470 00000000`00000000 00007ffc`fe532d78 00000002`00200831 00000000`00000000 00007ffc`fe532d88 00000253`92f4b4f0 00000000`26000000 00007ffc`fe532d98 ffffffff`ffffffff 00000000`ffffffff 00007ffc`fe532da8 00000000`00000000 00000000`00000000 00007ffc`fe532db8 00000000`020007d0 00000000`c0000000 00007ffc`fe532dc8 ffffffff`ffffffff 00000000`ffffffff
Take the 2nd address
0000025393187ffa
- this is where the metadata is stored, and print out the contents from there and 1000 bytes ondc 0000025393187ffa 0000025393187ffa+0n1000
- What type of assembly was this? What is it used for? How is it generated?
Putting it all together and determining the cause of the assembly leakPermalink
If we look at the documentation for XmlSerializer we get the following information about dynamically generated assemblies related to XmlSerialization
Dynamically Generated Assemblies To increase performance, the XML serialization infrastructure dynamically generates assemblies to serialize and deserialize specified types. The infrastructure finds and reuses those assemblies. This behavior occurs only when using the following constructors:
- XmlSerializer..::.XmlSerializer(Type)
- XmlSerializer..::.XmlSerializer(Type, String)
If you use any of the other constructors, multiple versions of the same assembly are generated and never unloaded, which results in a memory leak and poor performance. The easiest solution is to use one of the previously mentioned two constructors. Otherwise, you must cache the assemblies in a Hashtable…
From this, and the fact that our performance logs and dump shows that we are continuously generating new XML serialization assemblies we can conclude that it is very likely that we are not using one of the standard constructors. Search the project code for new XmlSerializer or use reflector like in this example to determine where we are generating these dynamic assemblies.
- What method / line of code is causing the problem?
Resolve the issue and rerun the test to verify the solutionPermalink
- Resolve the issue by caching the XmlSerializer using the sample in the documentation for XmlSerializer.
- Rerun the test to verify that the assembly “leak” no longer exists.
Have fun,
Tess
Categories: blog
Updated: March 17, 2008