Eureka! Garbage-free string manipulation in Unity!

If anybody else out there is as much of a performance snob as I am (or if you’re seeing certain frames take 250ms or more on mobile platforms), you might have noticed (or researched) that no matter what you try, string.Format and StringBuilder.ToString() always create a little bit (sometimes a lot) of garbage. I came across this issue a little over a year ago in CosmoKnots when trying to serialize replay and analytics data for web storage, and it pretty much killed that feature for mobile devices (a real shame, because I designed the performance analytics system specifically for mobile builds). I had resigned myself to this fate… until now.

EDIT: There was some confusion about whether Unity 5 still requires this hack. It all comes from this slide, which was apparently talking just about the benefits of IL2CPP? Either way, I could swear that I tested this during the U5 beta, and calling StringBuilder.ToString() didn’t generate garbage, but a recent test (5.0.3f) did show garbage. If you find that StringBuilder.ToString() or string.Format() is creating garbage, then I guess you still need it.

I came back to this issue when doing tests of Unity games on Google Glass. I wanted to see what was possible, and the answer is “not much.” Though Glass has a reasonably powerful dual-core CPU, it’s underclocked to 800MHz for power considerations, and the GPU is probably in a similar situation. Regardless, even with its tiny screen, it under-performs the Galaxy Nexus, which is based on the same chip. Because of this, I went crazy with optimizations. After having eliminated all other garbage collection (OK, maybe there were a FEW things left that create garbage every so often, certainly not every frame), I came back to my FPS counter, which was formatting a string every 0.5s and creating 1.4K of garbage in doing so (even though the string is only 10 characters long!!).

Back at Google I/O ’09 I saw a talk by Chris Pruett. His largest, overarching point was “treat Java like C++.” He noted that on devices back then (and even now) a single GC pass can take up to 300ms (~10 frames!), which is murder on your performance. The answer? Never free anything!

After some (more) research, I finally came upon the solution for strings in C#! Oddly enough, it took finding two different articles, both from the same blog:
http://www.gavpugh.com/2010/04/05/xnac-a-garbage-free-stringbuilder-format-method/
http://www.gavpugh.com/2010/03/23/xnac-stringbuilder-to-string-with-no-garbage/

The first is a set of alternate string.Format functions (which frustratingly use a unique format specification, so you’ll have to specify different format strings if you need to fall back on certain platforms… more on that later), and the second is a method for extracting the internal string from the stringbuilder to avoid the garbage created on the ToString() call. There are a few other Unity-specific and convenience additions that I have made:

Firstly, I created a wrapper for the code in the second article which looks something like this:

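//Returns the StringBuilder's internal string instead of the copy made by ToString().
//Reads the private "_str" field via Reflection, so it depends on the
//StringBuilder implementation of the runtime you are on.
public static string GarbageFreeString(StringBuilder sb) {
	string str = (string)sb.GetType().GetField(
		"_str",
		System.Reflection.BindingFlags.NonPublic |
		System.Reflection.BindingFlags.Instance).GetValue(sb);

	//Optional: clear out the rest of the string (from the current length up to capacity)
	//for (int i = sb.Length; i < sb.Capacity; i++) {
	//	sb.Append(" ");
	//}
	return str;
}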

As the article explains, you have to instantiate the stringbuilder with a size/capacity that is greater than or equal to the maximum size of the string you would like to put inside. This is a little tricky and obviously won’t cover all use cases. If you need to format an enormous string, odds are the garbage generated won’t be your performance limitation. I should also point out that there is an overhead cost to using Reflection. For one thing, you definitely can’t strip the bytecode from your assemblies, which some people like to do to reduce build size and memory footprint on mobile platforms. For another, depending on your situation, if you make a lot of calls to GarbageFreeString per frame, the time saved in GC might be offset by Reflection time. I’m not sure if there’s any gain to caching the result of GetField (GetValue is still Reflection, so you can’t eliminate it entirely), but it’s worth a try if you need to squeeze out that last ounce of performance.
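On the question of caching GetField, here is a minimal sketch, assuming the same private "_str" field as the wrapper above. The class name StringBuilderHacks and the IsSupported property are hypothetical names used only for illustration. The field lookup runs once in a static initializer, so each call pays only for GetValue. On a runtime whose StringBuilder has no "_str" field, the sketch falls back to the normal (allocating) ToString(). Whether this is measurably faster is something to profile on the target device.

```csharp
using System.Reflection;
using System.Text;

// Hypothetical helper class; the name is illustrative only.
public static class StringBuilderHacks
{
    // Looked up once, when the class is first used.
    // This is null on runtimes whose StringBuilder has no private "_str" field.
    private static readonly FieldInfo StrField = typeof(StringBuilder).GetField(
        "_str",
        BindingFlags.NonPublic | BindingFlags.Instance);

    // True when the internal-string trick is available on this runtime.
    public static bool IsSupported
    {
        get { return StrField != null; }
    }

    public static string GarbageFreeString(StringBuilder sb)
    {
        if (StrField == null)
        {
            // Assumption: falling back to ToString() (which allocates) is acceptable.
            return sb.ToString();
        }
        // GetValue is still a Reflection call, but the field lookup is not repeated.
        return (string)StrField.GetValue(sb);
    }
}
```

Checking StringBuilderHacks.IsSupported once at startup tells you whether the hack applies at all on the current platform.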

The second limitation is that setting sb.Length = 0 won’t actually clear the string. It resets the cursor to the beginning of the string, but if you then append a new string that is shorter than the previous one, the tail of the old string is still sitting in the buffer and you have to overwrite it. Now that I think about it, I should probably clear the string with a null character rather than a space. Either way, the trick that I use is to set my font to something other than dynamic (I was doing that anyway because re-generating the dynamic font creates garbage and is expensive), and rely on the fact that Unity will ignore characters that aren’t rendered in the font texture. In my case, I leave the clearing loop in the function above commented out, and just do the following (note the for loop after appending the format string):

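//Reset the cursor; the old characters stay in the buffer
sb.Length = 0;
sb.ConcatFormat("FPS: {0:0.00}", fps);
//Fill with unsupported character to overwrite old string
for (int i = sb.Length; i < sb.Capacity; i++) {
	sb.Append("?");
}
//Set the text to a different value first so the component sees a change
myText.text = "";
myText.text = format;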

The string format, by the way, is declared in the class scope (so it sticks around for the life-cycle of the object) and set to the result of GarbageFreeString. It might look something like this:

private string format;
private StringBuilder sb = new StringBuilder(16, 16);	//Initialize with capacity = maxCapacity = 16... the max length of the string

void Start(){
	format = GarbageFreeString(sb);
}
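The padding loop in the update snippet can also be written with the StringBuilder.Append(char, int) overload, which appends a repeated character in one call. This is only a sketch: the StringBuilderPadding class and the PadToCapacity name are hypothetical, and it assumes the builder was created with capacity equal to maxCapacity, as above, so the buffer never grows.

```csharp
using System.Text;

// Hypothetical helper class; the name is illustrative only.
public static class StringBuilderPadding
{
    // Fills the unused tail of the buffer with 'filler' so that leftovers
    // from a previous, longer string are overwritten.
    public static StringBuilder PadToCapacity(StringBuilder sb, char filler)
    {
        int remaining = sb.Capacity - sb.Length;
        if (remaining > 0)
        {
            // Appends 'filler' repeated 'remaining' times.
            sb.Append(filler, remaining);
        }
        return sb;
    }
}
```

With this in place, the update step would be sb.Length = 0, then the ConcatFormat call, then StringBuilderPadding.PadToCapacity(sb, '?').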

Depending on what you’re doing with the string, you may need one last hack. I’m using mine in a GUI, so whenever the value changes I have to set the text to another value (I use “”) and then set it back to the string extracted with GarbageFreeString. This is just a hack to get the TextMesh or GUIText or whatever you’re using to update itself, since sometimes just setting the text value to the same string reference doesn’t invalidate the GUI.

So there you have it. Garbage-free strings for Unity at long last! I wonder how long before they update their version of Mono to 3.0 or 4.0 compatibility (whichever one fixed the garbage generation in these functions) and this all becomes obsolete. My bet is next month 😉

Edit: The last version of this edit was misinformed. Unity is still on .NET 2.0 and thus this workaround is still needed.
Edit 2: Current versions of uGUI don’t update correctly using this technique. I haven’t looked too far into a fix, but basically the issue is that setting .text = thestring doesn’t trigger an invalidate. I’d love to get this fixed again, but alas I’ve reverted to string.Format. 🙁

~ by Schoen on May 28, 2014.

14 Responses to “Eureka! Garbage-free string manipulation in Unity!”

  1. Hi, this stopped working on the new .net a while ago.
    The .net class no longer works in this way, it just won’t work.

    Still works on the xbox360 (hence those articles you cite being still valid) and indeed I use it on there.

    Unity 5 has now been released, any chance of an update on this article to the “now”?

    🙂 many thanks

    • Hey, thanks for the comment! I will certainly be making some updates, and I did see this coming, but haven’t had the time to really test it out in Unity 5. Our game that takes advantage of this is still being built out of Unity 4. As far as I know, the new versions of .NET (>4.0) actually do all string operations without generating garbage, so I think the update is just “don’t bother!”

  2. As far as I can tell, the note you included:

    EDIT: Now that Unity 5 uses a newer version of .NET, StringBuilder.ToString() has finally been fixed! A StringBuilder will still create garbage when its size increases, but if you instantiate it with enough space, or don’t get rid of your stringbuilders, so they don’t grow very often, you can do all the appending, printing, and resetting you want with no garbage! Unity 4 and below will still require the hack. Enjoy!

    Is not true in Unity 5. Append of non string types, AppendFormat and ToString all cause allocations.

    • I think you’re right. I’ve since been informed that the slide I was referencing from the Unite Keynote was actually talking about improvements with IL2CPP, which don’t apply to all platforms, and was in reference to future changes, not current fixes. Turns out this workaround is still needed.

  3. I tried your solution here, and while the StringBuilder extensions work flawlessly by themselves, the UI update does not.

    If you use this code of yours:

    private string format;
    private StringBuilder sb = new StringBuilder(16, 16);

    myText.text = format;

    Unity UI won’t update because the reference to the string does not change. Any attempt to change the content of the string yields nothing. You either need to rebuild the whole object (like calling TextGenerator.Update()) or flip between two different strings.

  4. I must say that these extensions are very very nice. They have helped me reduce GCs in my project. And having garbage free format methods is just great. Good job!

  5. Hi, if you don’t want to pay the price of reflection every time you call GarbageFreeString, then I would suggest you look into compiling your method call into an expression tree *once* at startup time. You will pay the reflection price *once*, and after that each call costs only a simple method call. See http://stackoverflow.com/questions/3332268/creating-an-expression-tree-that-calls-a-method.

  6. If anyone stumbles onto this, know that it works with that little extra trick for uGUI:

    Duplicate the Text object, toggle which is used each frame, and set the unused one text to an empty string.

    Thanks a lot for the garbage free string and sb.concat stuff !

  7. Unity5.4.1f1(64bit)

    myString = GarbageFreeString(sb);
    Causes this error when trying to Debug.Log(myString);
    ArgumentException: invalid utf-16 sequence at 2281312 (missing surrogate tail)
    Parameter name: string

    Sometimes I can get past this problem and I’ll get strange additions appended after my string. I have not pinpointed exactly what’s happening, but the strings must be coming back wrong sometimes, because code that works with ToString fails to work properly in these instances.

  8. Never mind, it’s not important, since manipulating strings in general causes GC; I found the only way to manipulate strings is on another thread.

  9. Yo, great post. I know I’m way late, but I thought I’d mention that I was also getting garbage as a result of editing StringBuilder characters in-place (which I think goes against the whole principle of using a StringBuilder in the first place.)

    To comment on your second edit, I was able to get text to update by calling Text.cachedTextGenerator.Invalidate () after using your trick of assigning the text an empty string value. Hopefully this works for you, too!

    Cheers.

  10. I have a garbage free stringbuilder class here.
    https://github.com/willmotil/MonoGameUtilityClasses

    I wrote a wrapper that basically has an overload for every numerical append type and added in a bunch of operator overloads, so it pretty much works wherever a stringbuilder would work.

    The only thing I never got around to adding is the formatting, but I suppose you might be able to add it yourself.

  11. Every numerical type that gets ToStringed, implicitly or explicitly, generates garbage, unfortunately.
    I already posted this issue on user voice long ago, but it goes pretty deep down and I’m not sure it will ever get fixed. If you pull out the vector overloads in my class, however, you can use it as straight C# code.

  12. Regarding:

    Edit 2: Current versions of uGUI don’t update correctly using this technique. I haven’t looked too far into a fix, but basically the issue is that setting .text = thestring, doesn’t trigger an invalidate. I’d love to get this fixed again, but alas I’ve reverted to string.Format. 🙁

    The workaround for this is to set .text = “”; then .text = theString; The issue is that the .text property compares the string references, realizes they are the same and doesn’t dirty the UI.

    Also, aside from maybe some adjustments, I still got this to work on 5.4, iirc. I haven’t tried it on any newer versions, against >2017.x or .NET 4.x, or compared it with the StringBuilder in IL2CPP, though.
    I guess whenever you read this, profile the target device first with all the readily available solutions to see if one of them works GC-free before you reimplement stuff. Then profile that solution on the target platforms again afterwards to verify that it solves the issue.
