Strings and Garbage Collection

Problem :

I have heard conflicting stories on this topic and am looking for a little bit of clarity.

How would one dispose of a string object immediately, or at the very least clear traces of it?

Solution :

That depends. Literal strings are interned by default, so even if your application no longer references one, it will not be collected, because it is still referenced by the internal interning structure. Other strings are just like any other managed object: as soon as they are no longer referenced by your application, they are eligible for garbage collection.
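As a minimal sketch of that difference, the snippet below compares a literal with a string built at run time. The class and method names (InternDemo, BuildAtRunTime) are made up for illustration, and the two interning checks near the end describe typical runtime behaviour rather than a guarantee.

```csharp
using System;

public static class InternDemo
{
    // Hypothetical helper: builds "hello" at run time, so the result is a
    // new object rather than the interned literal.
    public static string BuildAtRunTime() => new string(new[] { 'h', 'e', 'l', 'l', 'o' });

    public static void Main()
    {
        string literal = "hello";        // compile-time literal
        string built = BuildAtRunTime(); // created while the program runs

        Console.WriteLine(literal == built);                // True: same characters
        Console.WriteLine(ReferenceEquals(literal, built)); // False: two distinct objects

        // string.IsInterned returns the pooled instance if one exists, otherwise null.
        // With default settings this is normally the same object as the literal.
        string pooled = string.IsInterned(built);
        Console.WriteLine(pooled != null && ReferenceEquals(pooled, literal));

        // A freshly generated value has almost certainly never been interned.
        Console.WriteLine(string.IsInterned(Guid.NewGuid().ToString()) == null);
    }
}
```

Only the run-time string becomes eligible for collection once nothing references it; the literal stays reachable through the intern pool.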

More about interning in this question: Where do Java and .NET string literals reside?

If you need to protect a string and be able to dispose of it when you want, use the System.Security.SecureString class.

Protect sensitive data with .NET 2.0’s SecureString class
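A rough sketch of how SecureString might be used, assuming the secret arrives one character at a time (for example from key presses). The class name SecureStringDemo, the helper FromChars and the sample characters are placeholders, not part of any library.

```csharp
using System;
using System.Security;

public static class SecureStringDemo
{
    // Hypothetical helper: copies characters into a SecureString one at a time,
    // so the secret never has to exist as a System.String.
    public static SecureString FromChars(char[] source)
    {
        var secure = new SecureString();
        foreach (char c in source)
            secure.AppendChar(c);
        secure.MakeReadOnly(); // no further changes allowed
        return secure;
    }

    public static void Main()
    {
        char[] typed = { 's', '3', 'c', 'r', 'e', 't' }; // stand-in for captured key presses

        using (SecureString secret = FromChars(typed))
        {
            Array.Clear(typed, 0, typed.Length); // wipe the temporary buffer
            Console.WriteLine(secret.Length);    // prints 6
        } // Dispose() runs here and releases the protected buffer
    }
}
```

The using block is what gives you the "dispose it when you want" behaviour: the buffer is released at a point you choose instead of whenever the garbage collector gets to it.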

I wrote a little extension method for the string class for situations like this. It’s probably the only sure way of making the string itself unreadable until it is collected. Obviously, it should only be used on dynamically generated strings, not literals: overwriting an interned literal changes it everywhere it is used.

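// Must be declared in a static class; compile with unsafe code enabled.
public static unsafe void Clear(this string s)
{
  fixed (char* ptr = s)                 // pin the string so the GC cannot move it
    for (int i = 0; i < s.Length; i++)
      ptr[i] = '\0';                    // overwrite each character in place
}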

This is down to the garbage collector to handle for you. You can force it to run a clean-up by calling GC.Collect(). From the docs:

Use this method to try to reclaim all memory that is inaccessible.

All objects, regardless of how long they have been in memory, are considered for collection; however, objects that are referenced in managed code are not collected. Use this method to force the system to try to reclaim the maximum amount of available memory.

That’s the closest you’ll get, methinks!
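To see what a forced collection does in practice, here is a small sketch that watches a run-time string through a WeakReference. The names GcDemo and MakeTrackedString are invented for the example, and the result of the first check can vary with build configuration and runtime, so treat it as an experiment rather than a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class GcDemo
{
    // Hypothetical helper: creates a run-time string in a separate, non-inlined
    // method so that no local variable in the caller keeps it alive, and hands
    // back only a weak reference to it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference MakeTrackedString()
    {
        string s = new string('x', 1000);
        return new WeakReference(s);
    }

    public static void Main()
    {
        WeakReference dynamicString = MakeTrackedString();
        WeakReference literalString = new WeakReference("dispose me!");

        GC.Collect();
        GC.WaitForPendingFinalizers();

        // Usually False after a forced collection, but nothing guarantees it.
        Console.WriteLine(dynamicString.IsAlive);

        // A literal is still referenced by the intern pool, so it is expected to survive.
        Console.WriteLine(literalString.IsAlive);
    }
}
```

Even when the string object is reclaimed, its old characters can remain in memory until that memory is reused, so a collection is not the same thing as wiping the data.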

I will answer this question from a security perspective.

If you want to destroy a string for security reasons, then it is probably because you don’t want anyone snooping on your secret information, and you expect they might scan the memory, or find it in a page file or something if the computer is stolen or otherwise compromised.

The problem is that once a System.String is created in a managed application, there is not really a lot you can do about it. There may be some sneaky way of doing some unsafe reflection and overwriting the bytes, but I can’t imagine that such things would be reliable.

The trick is to never put the info in a string at all.

I had this issue one time with a system that I developed for some company laptops. The hard drives were not encrypted, and I knew that if someone took a laptop, then they could easily scan it for sensitive info. I wanted to protect a password from such attacks.

The way I dealt with it is this: I put the password in a byte array by capturing key press events on the textbox control. The textbox never contained anything but asterisks and single characters. The password never existed as a string at any time. I then hashed the byte array and zeroed the original. The hash was then XORed with a random hard-coded key, and this was used to encrypt all the sensitive data.

After everything was encrypted, the key was zeroed out.

Naturally, some of the data might exist in the page file as plaintext, and it’s also possible that the final key could be inspected as well. But nobody was going to steal the password dang it!
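The flow described above could look roughly like the sketch below. This is not the original code: SHA-256 is an assumption (the hash used is not named here), and KeySketch, DeriveKey and the mask bytes are hypothetical placeholders. The encryption step itself is left out.

```csharp
using System;
using System.Security.Cryptography;

public static class KeySketch
{
    // Hashes the password bytes, wipes the caller's buffer, then XORs the hash
    // with a fixed mask (the "hard-coded key" of the description above).
    public static byte[] DeriveKey(byte[] passwordBytes, byte[] mask)
    {
        byte[] key;
        using (SHA256 sha = SHA256.Create()) // assumption: any hash would fit here
            key = sha.ComputeHash(passwordBytes);

        // Zero the original password as soon as it has been hashed.
        Array.Clear(passwordBytes, 0, passwordBytes.Length);

        for (int i = 0; i < key.Length; i++)
            key[i] ^= mask[i % mask.Length];

        return key;
    }

    public static void Main()
    {
        byte[] password = { 0x70, 0x61, 0x73, 0x73 };     // stand-in for bytes captured from key presses
        byte[] mask = { 0x5A, 0xC3, 0x19, 0xE7 };         // made-up example value

        byte[] key = DeriveKey(password, mask);
        try
        {
            Console.WriteLine(key.Length);                // prints 32 (SHA-256 output size)
            // ... encrypt the sensitive data with key here ...
        }
        finally
        {
            Array.Clear(key, 0, key.Length);              // zero the key once everything is encrypted
        }
    }
}
```

Because arrays are mutable, both the password and the key can be overwritten in place at a moment you control, which is exactly what a System.String does not allow.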

There’s no deterministic way to clear all traces of a string (System.String) from memory. Your only options are to use a character array or a SecureString object.

One of the best ways to limit the lifetime of string objects in memory is to declare them as local variables in the innermost scope possible and not as private member variables on a class.

It’s a common mistake for junior developers to declare their strings ‘private string ...’ on the class itself.

I’ve also seen well-meaning experienced developers trying to cache some complex string concatenation (a+b+c+d…) in a private member variable so they don’t have to keep calculating it. Big mistake. It takes hardly any time to recalculate it, the temporary strings are garbage collected almost immediately when the next generation 0 collection runs, and the memory swallowed by caching all those strings is taken away from more important items like cached database records or cached page output.

Set the string variable to null once you don’t need it.

string s = "dispose me!";
s = null;

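and then call GC.Collect() to invoke the garbage collector. The GC cannot guarantee that the string will be collected immediately, though, and a literal like the one above is interned, so it stays reachable and will not be collected at all.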
