1 Mar 2007 09:13
Re: CVS commit: src/sys/dev/usb
Dieter Baron <dillo <at> danbala.tuwien.ac.at>
2007-03-01 08:13:56 GMT
hi,

> > Please note that fs/unicode.h does not handle UTF-16 surrogates
> > correctly.  What's worse, the API does not allow this to be fixed.
> >
> > (Unicode defines more characters than fit in a 16 bit int.  In
> > UTF-16, a character with a code above 0xffff is represented as two
> > surrogate values.  In UTF-8, it is encoded as a 4 byte sequence.
> > Encoding/decoding one 16 bit value at a time does not allow for this
> > conversion to be done correctly.)
>
> Please feel free to suggest ways that this should be fixed. Patches are
> best!
>
> We all would like better unicode handling, and AFAIK no one is wedded to
> the existing interface.

  Okay, here is what I currently use in the HFS+ implementation
(netbsd-soc.cvs.sf.net:/cvsroot/netbsd-soc hfs/hfsp/unicode.[ch]):

#define UNICODE_DECOMPOSE	0x01	/* convert to decomposed NF */
#define UNICODE_PRECOMPOSE	0x02	/* convert to precomposed NF */

size_t utf8_to_utf16(uint16_t *out, size_t outlen,
		     const char *in, size_t inlen,
		     int flags, int *errcountp);

  Converts the UTF-8 string IN to UTF-16 and stores at most OUTLEN
words in OUT.  FLAGS may be one of the above to convert the string to
normal form during conversion.  If ERRCOUNTP is non-NULL, the number [...]
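
  To illustrate why a whole-string interface is needed for surrogates,
here is a minimal sketch of such a conversion.  It is not the code from
hfs/hfsp/unicode.c.  The name sketch_utf8_to_utf16 is hypothetical, and
several behaviours are assumptions made for illustration only: the
return value is the number of 16 bit words stored, each malformed
sequence is replaced by U+FFFD and counted in *ERRCOUNTP, and the
normalization FLAGS are left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: convert UTF-8 to UTF-16, storing at most
 * OUTLEN words in OUT.  Assumed (not from the original code): returns
 * the number of words stored; malformed input becomes U+FFFD and is
 * counted in *ERRCOUNTP; no normalization is performed.
 */
size_t
sketch_utf8_to_utf16(uint16_t *out, size_t outlen,
    const char *in, size_t inlen, int *errcountp)
{
	const unsigned char *s = (const unsigned char *)in;
	size_t i = 0, n = 0;
	int errs = 0;

	while (i < inlen) {
		uint32_t c = s[i++];
		uint32_t min = 0;	/* smallest value allowed for this length */
		int extra = 0;		/* continuation bytes still expected */
		int ok = 1;

		if (c < 0x80) {
			/* ASCII, nothing more to read */
		} else if ((c & 0xe0) == 0xc0) {
			c &= 0x1f; extra = 1; min = 0x80;
		} else if ((c & 0xf0) == 0xe0) {
			c &= 0x0f; extra = 2; min = 0x800;
		} else if ((c & 0xf8) == 0xf0) {
			c &= 0x07; extra = 3; min = 0x10000;
		} else {
			/* stray continuation byte or invalid lead byte */
			c = 0xfffd;
			errs++;
		}

		while (extra > 0) {
			if (i >= inlen || (s[i] & 0xc0) != 0x80) {
				ok = 0;		/* truncated sequence */
				break;
			}
			c = (c << 6) | (s[i++] & 0x3f);
			extra--;
		}

		/* reject truncated, overlong, out-of-range, surrogate values */
		if (!ok || c < min || c > 0x10ffff ||
		    (c >= 0xd800 && c <= 0xdfff)) {
			c = 0xfffd;
			errs++;
		}

		if (c >= 0x10000) {
			/* needs two words; never store half a pair */
			if (n + 2 > outlen)
				break;
			c -= 0x10000;
			out[n++] = (uint16_t)(0xd800 | (c >> 10));
			out[n++] = (uint16_t)(0xdc00 | (c & 0x3ff));
		} else {
			if (n + 1 > outlen)
				break;
			out[n++] = (uint16_t)c;
		}
	}

	if (errcountp != NULL)
		*errcountp = errs;
	return n;
}
```

  The point of the sketch is the last branch: one 4 byte UTF-8 sequence
produces two output words, so the converter has to see the whole
sequence and have room for both words at once.  An interface that
hands over one 16 bit value per call cannot express that.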