Posted by redbeard on May 25, 2011
Garbage collection in XNA can be a big deal if you care about framerate hitching, which is visually jarring and can cause input polling problems. Given enough time, any per-frame allocation will eventually trigger a garbage collection, which pauses all threads while the collector walks memory to separate live objects from dead ones and make room for more allocations. The collector generally only gets an opportunity to run when you allocate memory, so you can control when collections happen by allocating only during expected downtime, such as loading screens. The same wisdom applies to real-time applications written in native code, but .NET provides more of a safety net if you do things quick and dirty at first: the GC will clean up your mess, whereas a native app might just crash when it runs out of memory. Reducing and optimizing your allocations can also improve your loading times and lower your minimum-spec hardware requirements.
The best tool I’ve used for diagnosing garbage allocation and collection is CLR Profiler. It has a few quirks: the allocation graph has some rendering bugs with the line traces, it has no option to “ignore this node”, and the histogram views don’t always offer enough resolution to show every object that has been allocated (i.e. those smaller than 1KB).
With that said, the “histogram by age” view is quite useful for finding per-frame allocations; start up your game and get it into the desired state, then just let it run for a couple of minutes with no adjustments. After running a while, open up the “histogram by age” view and see if any allocations are younger than 1 minute. An option in the right-click context menu will even show you a visual call-stack of how those allocations happened. Note that if you exit your application, it will probably generate a bunch of garbage on shutdown, so it’s probably safe to ignore very-young allocations so long as the middle ground is clear.
Another useful view is the “allocation graph”, which will show you the list of all major allocations by basic type if you scroll all the way to the right, and the visual call-stack of how they were conjured as you look to the left. This view can be a little misleading if you have large chunks of one-time allocations for scratch memory pools, and there is no option to ignore or exclude specific nodes, but anything that bubbles up to the top should warrant investigation.
Major Sources of Garbage
… as discovered in my current codebase, with suggested corrective actions.
- String manipulation:
  - Never use String.Format, for multiple reasons: hidden StringBuilder allocation, value types get boxed, hidden ToString calls allocate strings, and the params arguments create an array.
  - Never concatenate strings with the + operator.
  - Never append non-string values onto a StringBuilder; those overloads call ToString under the hood.
  - Never call StringBuilder.ToString; it allocates a string to return.
  - Do all string manipulation with a pre-allocated, capacity-capped StringBuilder, and use custom functions for numeric conversion, like itoa and ftoa.
  - SpriteBatch can render StringBuilder objects without calling ToString.
  - I use a custom StringBuilderNG class (NG = no garbage) which wraps a standard StringBuilder, forwards only the “safe” methods which generate no garbage, and implements new methods for custom conversion of int and float values. This approach is more prone to bugs, but itoa and ftoa are relatively easy to implement.
- DateTime.Now: replace with Stopwatch.Elapsed when used for profiling
- params arguments: pre-allocate an array of appropriate size, and populate it immediately before the call. Don’t forget to null out the entries afterwards to avoid dangling references.
- Value type boxing for IComparer<T> & IEnumerable<T>: implement CompareTo on the underlying type, and use an explicit value-type Enumerator like in List<T>. This also helps reduce virtual function calls.
- Worker threads typically need scratch memory to operate on. This can be pre-allocated in a pool, and each thread can grab a chunk of scratch memory when it starts up and release it when it’s done.
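To make the string-manipulation advice concrete, here is a minimal sketch of an itoa-style append that writes an int into a StringBuilder one character at a time. The class and method names are hypothetical and this is not the StringBuilderNG class itself; it also assumes the StringBuilder was created with enough capacity that Append(char) never has to grow the buffer.

```csharp
using System.Text;

// Hypothetical helper for illustration; not the StringBuilderNG class from the post.
public static class StringBuilderNoGarbage
{
    // Appends the decimal form of value using only Append(char),
    // so no intermediate string is created along the way.
    public static StringBuilder AppendInt(StringBuilder sb, int value)
    {
        // Work in long so that int.MinValue can be negated safely.
        long remaining = value;
        if (remaining < 0)
        {
            sb.Append('-');
            remaining = -remaining;
        }

        // Find the largest power of ten that does not exceed the value.
        long divisor = 1;
        while (remaining / divisor >= 10)
        {
            divisor *= 10;
        }

        // Emit digits from most significant to least significant.
        while (divisor > 0)
        {
            sb.Append((char)('0' + (int)(remaining / divisor)));
            remaining %= divisor;
            divisor /= 10;
        }
        return sb;
    }
}
```

In a game loop you would typically keep one StringBuilder per HUD element, set its Length to 0 each frame, append the label and the number, and hand the StringBuilder itself to SpriteBatch for drawing. An ftoa equivalent can be built the same way by appending the integer part, a decimal point, and a fixed number of fractional digits.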
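For the DateTime.Now item, a profiling timer built on Stopwatch might look roughly like this. SectionTimer is a made-up name; the only framework calls used are Stopwatch.Reset, Start, Stop and Elapsed, and the Stopwatch instance is created once and reused.

```csharp
using System;
using System.Diagnostics;

// Hypothetical profiling helper; the name and shape are illustrative only.
public sealed class SectionTimer
{
    // Allocated once, then reused for every measurement.
    private readonly Stopwatch stopwatch = new Stopwatch();

    public void Begin()
    {
        stopwatch.Reset();
        stopwatch.Start();
    }

    public TimeSpan End()
    {
        stopwatch.Stop();
        // TimeSpan is a value type, so returning it creates no heap object.
        return stopwatch.Elapsed;
    }
}
```

Call Begin before the code you want to measure and End after it, then accumulate the returned TimeSpan however your profiling overlay needs.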
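The params workaround can be sketched as follows. EventSink.Raise stands in for any API that takes a params object[] argument, and Caller is an invented class; both are assumptions for illustration. Passing an existing object[] binds directly to the params parameter, so the compiler does not create a new array for the call.

```csharp
using System;

// Hypothetical API with a params parameter, standing in for any such method.
public static class EventSink
{
    public static int LastArgCount;

    public static void Raise(string name, params object[] args)
    {
        LastArgCount = args.Length;
    }
}

// Hypothetical caller that reuses one argument array instead of
// letting the compiler allocate a fresh one on every call.
public sealed class Caller
{
    // Allocated once, sized for the two-argument call below.
    private readonly object[] twoArgs = new object[2];

    public void Notify(object sender, object target)
    {
        twoArgs[0] = sender;
        twoArgs[1] = target;

        // The existing array is passed straight through as the params array.
        EventSink.Raise("hit", twoArgs);

        // Null out the slots so the array does not keep the objects alive.
        Array.Clear(twoArgs, 0, twoArgs.Length);
    }
}
```

Note that a shared argument array like this is neither thread-safe nor re-entrant, and that value-type arguments would still be boxed when stored in an object[], so this mainly helps when the arguments are already reference types.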
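Here is one way the boxing advice for comparison and enumeration could look in practice. DepthKey and FixedList<T> are invented for this sketch: the struct implements IComparable<T> so comparisons stay strongly typed, and the collection exposes a public struct Enumerator in the style of List<T>, so foreach over the concrete type never goes through a boxed IEnumerator<T>.

```csharp
using System;

// Hypothetical value type used as a sort key.
public struct DepthKey : IComparable<DepthKey>
{
    public int Depth;

    public DepthKey(int depth) { Depth = depth; }

    // Strongly typed comparison: no boxing of either operand.
    public int CompareTo(DepthKey other) { return Depth.CompareTo(other.Depth); }
}

// Hypothetical fixed-capacity list with a value-type enumerator.
public sealed class FixedList<T>
{
    private readonly T[] items;
    private int count;

    public FixedList(int capacity) { items = new T[capacity]; }

    public int Count { get { return count; } }

    // Throws IndexOutOfRangeException if the fixed capacity is exceeded.
    public void Add(T item) { items[count++] = item; }

    // foreach binds to this method by pattern, so the struct is used directly.
    public Enumerator GetEnumerator() { return new Enumerator(this); }

    public struct Enumerator
    {
        private readonly FixedList<T> list;
        private int index;

        internal Enumerator(FixedList<T> list)
        {
            this.list = list;
            index = -1;
        }

        public T Current { get { return list.items[index]; } }

        public bool MoveNext() { return ++index < list.count; }
    }
}
```

The benefit is lost if the collection is passed around as IEnumerable<T>, since the enumerator would then have to be boxed to satisfy the interface; iterate over the concrete type wherever the per-frame path matters.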
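The scratch-memory idea for worker threads might be sketched like this. ScratchPool is a hypothetical class, and the choice of byte[] buffers guarded by a simple lock is an assumption; a real codebase may need typed buffers or a different synchronization scheme. All buffers are created up front, for example during a loading screen.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pool of fixed-size scratch buffers, filled once at load time.
public sealed class ScratchPool
{
    private readonly Stack<byte[]> free;
    private readonly object gate = new object();

    public ScratchPool(int bufferCount, int bufferSize)
    {
        // Capacity is set up front so later Push calls never grow the stack.
        free = new Stack<byte[]>(bufferCount);
        for (int i = 0; i < bufferCount; i++)
        {
            free.Push(new byte[bufferSize]);
        }
    }

    // Called by a worker thread when it starts a job.
    public byte[] Acquire()
    {
        lock (gate)
        {
            if (free.Count == 0)
            {
                // Error path only: the pool was sized too small for the workload.
                throw new InvalidOperationException("Scratch pool exhausted.");
            }
            return free.Pop();
        }
    }

    // Called by the same thread when the job is finished.
    public void Release(byte[] buffer)
    {
        lock (gate)
        {
            free.Push(buffer);
        }
    }
}
```

Sizing the pool to the maximum number of concurrent workers keeps Acquire on its allocation-free path; wrapping the job body in try/finally is a reasonable way to make sure a buffer is always released.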