What Does NSString’s Length Method Really Return?
Categories: Cocoa
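You would think that it returns the number of characters in the NSString, but it doesn’t. Instead, it returns the number of unichars, which are 16-bit (2-byte) UTF-16 code units. So as far as -[NSString length] is concerned, every Unicode character outside the Basic Multilingual Plane (the musical double flat, U+1D12B, for instance) counts as 2 characters, because it is stored as a surrogate pair of two unichars, even though it’ll only be displayed as one character on screen. These are the same characters that take 4 bytes in UTF-8.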
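To make the arithmetic concrete, here is a minimal sketch in plain C (which also compiles as Objective-C) of how a code point maps to UTF-16 code units. The helper names `utf16_units_for_code_point` and `utf16_surrogates` are my own for illustration; they are not part of Foundation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of UTF-16 code units (unichars) that one
   Unicode code point occupies. Anything above U+FFFF needs two. */
static size_t utf16_units_for_code_point(uint32_t cp) {
    return cp > 0xFFFF ? 2 : 1;
}

/* Hypothetical helper: splits a code point above U+FFFF into its UTF-16
   surrogate pair. The caller must pass cp in the range 0x10000..0x10FFFF. */
static void utf16_surrogates(uint32_t cp, uint16_t *high, uint16_t *low) {
    uint32_t v = cp - 0x10000;                 /* 20 bits remain */
    *high = (uint16_t)(0xD800 + (v >> 10));    /* top 10 bits */
    *low  = (uint16_t)(0xDC00 + (v & 0x3FF));  /* bottom 10 bits */
}
```

For the double flat (U+1D12B) this gives the pair 0xD834, 0xDD2B, which is why a string holding just that one symbol reports a length of 2.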
This isn’t a bug per se in -[NSString length], but it’s not the behavior I was expecting. To count the number of Unicode characters (code points) in an NSString, use the following function adapted from AGRegex:
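[code lang="cpp"]
#include <string.h>

int utf8charcount(NSString *string) {
	// UTF8String returns a const, NUL-terminated C string (NULL for a nil string).
	const char *str = [string UTF8String];
	if (str == NULL) return 0;
	// str is a C string, not an object, so measure it with strlen.
	size_t pos, len = strlen(str);
	int chars = 0;
	unsigned char c;
	for (pos = 0; pos < len; pos++) {
		c = (unsigned char)str[pos];
		// Count ASCII bytes and multi-byte lead bytes; skip continuation bytes (0x80-0xbf).
		if (c <= 0x7f || (0xc0 <= c && c <= 0xfd)) chars++;
	}
	return chars;
}
[/code]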
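The lead-byte test can also be written the other way around: count every byte that is not a continuation byte (one of the form 10xxxxxx). Below is a rough sketch of that variant in plain C, so it can be tried without Foundation. The name `utf8_code_points` is hypothetical, and the two tests only agree on valid UTF-8 input.

```c
#include <stddef.h>

/* Hypothetical helper: counts code points in a NUL-terminated UTF-8 string
   by skipping continuation bytes (10xxxxxx). Assumes the input is valid UTF-8. */
static size_t utf8_code_points(const char *s) {
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) count++;
    }
    return count;
}
```

As a worked case, the bytes `"a\xC3\xA9\xF0\x9D\x84\xAB"` encode "a", "é", and the double flat: 7 bytes and 3 code points, whereas -[NSString length] would report 4 for the same string (1 + 1 + 2 unichars).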