What Does NSString’s Length Method Really Return?
Categories: Cocoa
You would think that it returns the number of characters in the NSString, but it doesn’t. Instead, it returns the number of unichars (16-bit UTF-16 code units). So as far as -[NSString length] is concerned, every Unicode character outside the Basic Multilingual Plane (the musical double flat, U+1D12B, for instance) is stored as a surrogate pair and counts as 2 characters, even though it’ll only be displayed as one character on screen.
This isn’t a bug per se in -[NSString length], but it’s not the behavior I was expecting. To count the number of Unicode characters (code points) in an NSString, use the following function, adapted from AGRegex:
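[code lang="cpp"]
// Counts Unicode code points by walking the UTF-8 encoding and counting
// every byte that starts a character (i.e. is not a continuation byte).
int utf8charcount(NSString *string) {
    const char *str = [string UTF8String];
    // Byte length of the UTF-8 form; str is a C string, so it has no -length.
    NSUInteger pos, len = [string lengthOfBytesUsingEncoding:NSUTF8StringEncoding];
    int chars = 0;
    unsigned char c;
    for (pos = 0; pos < len; pos++) {
        c = (unsigned char)str[pos];
        // ASCII byte, or the lead byte of a multi-byte sequence
        if (c <= 0x7f || (0xc0 <= c && c <= 0xfd))
            chars++;
    }
    return chars;
}
[/code]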
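To see the two counts side by side outside of Cocoa, here is a minimal plain C sketch. The function names `utf8_code_points` and `utf16_code_units` are made up for illustration (they are not part of AGRegex or Foundation), and the input is assumed to be valid, NUL-terminated UTF-8. The second function only estimates what -[NSString length] would report for the same text, based on the fact that a 4-byte UTF-8 sequence becomes a surrogate pair in UTF-16.

```c
#include <stddef.h>

/* Hypothetical helper: number of Unicode code points in a NUL-terminated
   UTF-8 string. Every byte that is not a continuation byte (10xxxxxx)
   starts a new code point. Assumes the input is valid UTF-8. */
size_t utf8_code_points(const char *s) {
    size_t n = 0;
    for (; *s; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Hypothetical helper: number of UTF-16 code units the same text would
   occupy. A 4-byte UTF-8 sequence (lead byte 11110xxx) encodes a code
   point above U+FFFF, which needs two code units (a surrogate pair). */
size_t utf16_code_units(const char *s) {
    size_t n = 0;
    for (; *s; s++) {
        unsigned char c = (unsigned char)*s;
        if ((c & 0xC0) != 0x80)
            n++;                /* one unit per code point */
        if ((c & 0xF8) == 0xF0)
            n++;                /* one extra unit for the surrogate pair */
    }
    return n;
}
```

For the string "B" followed by the double flat (bytes `F0 9D 84 AB` in UTF-8), the first function counts 2 code points while the second counts 3 code units. That gap is the mismatch described above.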