What does char 160 mean in my source code?


Problem :

I am formatting numbers as strings using the format string “# #.##”. At some point I need to turn these number strings, such as 1 234 567, back into something like 1234567. I am trying to strip out the empty chars but found that

value = value.Replace(" ", "");  

has no effect for some reason, and the string remains 1 234 567. After looking at the string I found that

value[1] is 160.

I was wondering what the value 160 means?

Solution :

The answer is to look in the Unicode Code Charts, where you’ll find the Latin-1 Supplement chart; this shows that U+00A0 (decimal 160, as per your title) is a non-breaking space.

Char code 160 is the non-breaking space, written as &nbsp; in HTML.
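To make the difference concrete, here is a minimal sketch of why replacing " " leaves the string untouched. The input string and the helper name StripSpaces are made up for illustration; they are not from the question.

```csharp
using System;

public static class NbspDemo
{
    // Hypothetical helper: removes the ordinary space (U+0020)
    // and the non-breaking space (U+00A0).
    public static string StripSpaces(string value)
    {
        return value.Replace(" ", "").Replace("\u00A0", "");
    }

    public static void Main()
    {
        // Assumed input: digits grouped with non-breaking spaces.
        string value = "1\u00A0234\u00A0567";

        Console.WriteLine((int)value[1]);                  // 160
        Console.WriteLine(value[1] == ' ');                // False
        Console.WriteLine(char.IsWhiteSpace(value[1]));    // True
        Console.WriteLine(value.Replace(" ", "").Length);  // 9: nothing was removed
        Console.WriteLine(StripSpaces(value));             // 1234567
    }
}
```

The two characters look identical on screen, but they are different code points, so a replace that targets only U+0020 never matches.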

Maybe you could use a regex to replace those empty chars:

Regex.Replace(input, @"\p{Z}", "");

This will remove “any kind of whitespace or invisible separator”.

value = value.Replace(Convert.ToChar(160).ToString(), "");

This is a fast (and fairly readable) way of removing any characters classified as white space using Char.IsWhiteSpace:

// Requires: using System.Text;
StringBuilder sb = new StringBuilder(value.Length);
foreach (char c in value)
{
    // Keep everything except white space (this includes char 160).
    if (!char.IsWhiteSpace(c))
        sb.Append(c);
}
value = sb.ToString();

As dbemerlin points out, if you know you will only need numbers from your data, you would be better off using Char.IsNumber or the even more restrictive Char.IsDigit:

StringBuilder sb = new StringBuilder(value.Length);
foreach (char c in value)
{
    // Keep only characters that Unicode classifies as numbers.
    if (char.IsNumber(c))
        sb.Append(c);
}
value = sb.ToString();

If you need numbers and decimal separators, something like this should suffice:

string separator = System.Globalization.NumberFormatInfo.CurrentInfo.NumberDecimalSeparator;
StringBuilder sb = new StringBuilder(value.Length);
foreach (char c in value)
{
    // NumberDecimalSeparator is a string, so compare it with c.ToString().
    if (char.IsNumber(c) || c.ToString() == separator)
        sb.Append(c);
}
value = sb.ToString();
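Since the goal in the question was to get back to a number, the filtering loop can be combined with a parse step. This is only a sketch: the class and method names are hypothetical, and it assumes the grouping characters are all white space and that the caller knows which culture produced the decimal separator.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class NumberCleaner
{
    // Hypothetical helper: drops every white-space character (including
    // char 160) and parses what is left using the given culture.
    public static decimal ParseGrouped(string text, CultureInfo culture)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (!char.IsWhiteSpace(c))
                sb.Append(c);
        }

        return decimal.Parse(
            sb.ToString(),
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint,
            culture);
    }

    public static void Main()
    {
        // Assumed input: NBSP as group separator, '.' as decimal separator.
        decimal parsed = ParseGrouped("1\u00A0234\u00A0567.89", CultureInfo.InvariantCulture);
        Console.WriteLine(parsed.ToString(CultureInfo.InvariantCulture)); // 1234567.89
    }
}
```

Passing the culture explicitly avoids depending on the machine's current settings, which is where the decimal separator would otherwise come from.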

I would suggest using the char overload version:

value = value.Replace(Convert.ToChar(160), ' ');

Solution with extension methods:

public static class ExtendedMethods
{
    public static string NbspToSpaces(this string text)
    {
        return text.Replace(Convert.ToChar(160), ' ');
    }
}

And it can be used with this code:

value = value.NbspToSpaces();

Wouldn’t the preferred way to replace all empty characters (which is what the questioner wanted to do) be the regex method that Rubens already posted?

Regex.Replace(input, @"\p{Z}", "");

or what Expresso suggests:

Regex.Replace(input, @"\p{Zs}", "");

The difference is that \p{Z} matches any kind of whitespace or invisible separator, whereas \p{Zs} matches only space separators: whitespace characters that are invisible but do take up space.
You can read about it here (section Unicode Categories):

http://www.regular-expressions.info/unicode.html
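A small comparison may help here. The sketch below uses a made-up input containing an ordinary space, a non-breaking space, a line separator (U+2028) and a tab; the helper name Strip is hypothetical. The \s pattern is not mentioned in the answers above and is included only as a point of comparison, because tabs and newlines are control characters and do not belong to the Z categories.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SeparatorClasses
{
    // Hypothetical helper: removes every match of the pattern from the input.
    public static string Strip(string input, string pattern)
    {
        return Regex.Replace(input, pattern, "");
    }

    public static void Main()
    {
        // Assumed input: '1', space, '2', NBSP, '3', U+2028, '4', tab, '5'.
        string input = "1 2\u00A03\u20284\t5";

        // \p{Zs}: space separators only (space and NBSP are removed).
        Console.WriteLine(Strip(input, @"\p{Zs}").Length); // 7

        // \p{Z}: all separators (U+2028 is removed as well).
        Console.WriteLine(Strip(input, @"\p{Z}").Length);  // 6

        // \s: separators plus control white space such as tab.
        Console.WriteLine(Strip(input, @"\s"));            // 12345
    }
}
```

For number strings like the one in the question, \p{Zs} and \p{Z} give the same result; the distinction only matters when the input can contain line or paragraph separators.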

Using a regex has the advantage that a single call also removes ordinary spaces, not only the non-breaking space as in some of the answers above.

If performance is the priority then other methods should of course be considered, but that is out of scope here.
