What Does NSString’s Length Method Really Return?

You would think that it returns the number of characters in the NSString, but it doesn’t. Instead, it returns the number of unichars, which are 16-bit UTF-16 code units. So as far as -[NSString length] is concerned, every Unicode character outside the Basic Multilingual Plane (the musical double flat, U+1D12B, for instance) counts as 2, because it is stored as a surrogate pair of two unichars, even though it’ll only be displayed as one character on screen.
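To make the counting rule concrete, here is a minimal sketch in plain C (which also compiles as Objective-C) of how many UTF-16 code units a run of Unicode code points occupies. The function names `utf16_units_for_code_point` and `utf16_length` are made up for illustration and are not part of Foundation; the sketch only mirrors the arithmetic, not how NSString is implemented internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of UTF-16 code units (unichars) needed for one Unicode code point.
   Code points above U+FFFF are stored as a surrogate pair, so they take 2.
   Hypothetical helper, not a Foundation API. */
size_t utf16_units_for_code_point(uint32_t cp) {
    return cp > 0xFFFF ? 2 : 1;
}

/* Total UTF-16 code units for an array of code points. This is the kind of
   count a UTF-16 based length reports for the same text. */
size_t utf16_length(const uint32_t *cps, size_t count) {
    size_t units = 0;
    for (size_t i = 0; i < count; i++) {
        units += utf16_units_for_code_point(cps[i]);
    }
    return units;
}
```

Passing the two code points `'B'` and `0x1D12B` (B double flat) gives 3 code units for what a reader sees as 2 characters, which is the mismatch described above.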

This isn’t a bug per se in -[NSString length], but it’s not the behavior I was expecting. To count the number of characters (Unicode code points) in an NSString, use the following function adapted from AGRegex:
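[code lang="cpp"]
#import <Foundation/Foundation.h>
#include <string.h>

// Counts Unicode code points by counting the bytes that start a UTF-8 sequence.
int utf8charcount(NSString *string) {
    const char *str = [string UTF8String];
    if (str == NULL) return 0; // defensive: nothing to count
    size_t len = strlen(str);  // str is a C string, so use strlen, not -length
    int chars = 0;

    for (size_t pos = 0; pos < len; pos++) {
        unsigned char c = (unsigned char)str[pos];
        // ASCII bytes and multi-byte lead bytes begin a new character;
        // continuation bytes (0x80-0xbf) are skipped.
        if (c <= 0x7f || (0xc0 <= c && c <= 0xfd)) chars++;
    }
    return chars;
}
[/code]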