Re: wide characters and i18n
On Fri, 16 Jul 2010 16:50:12 +0100
Sad Clouds <cryintothebluesky%googlemail.com@localhost> wrote:
> 2. The interfaces for the C library multi-byte to wide, and wide to
> multi-byte conversion functions are so badly designed, it's not even
> funny. The biggest problem with those functions is that they expect
> NUL-terminated strings. If you have a partial (not NUL-terminated)
> string in the buffer, you can't call a string conversion function on it,
> because it won't stop until it finds a NUL and you end up with a buffer
> overrun. You cannot "artificially" NUL-terminate the string, because
> after reading the NUL char, the function will reset the mbstate_t object
> to the initial state. This will mess up the next sequence of multi-byte
> characters if the encoding has state.
> I spent two days jumping through hoops and trying to figure out
> how to convert partial strings. I think I nailed it in the end with a 30%
> performance penalty, but still 3.5 times faster than iconv().
> If anyone is interested, I can post the code for the wrapper.
In case it is useful, I also wrote an implementation of UTF-8 <->
UTF-32 and put it under BSD-like license:
However, I have no benchmark comparing it against another implementation.
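
For the partial-buffer problem described in the quoted text, one possible
approach is to avoid the whole-string functions and call mbrtowc() one
character at a time with an explicit byte count, keeping the same mbstate_t
across calls. The following is only a minimal sketch, not the wrapper
mentioned above and not my UTF-8 code: the name mb_chunk_to_wide() and its
parameters are hypothetical, and it assumes that a NUL in the input occupies
exactly one byte (true for UTF-8 and other stateless encodings, not
guaranteed for stateful ones).

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert at most `len` bytes from `src`, which does
 * not need to be NUL-terminated, into at most `dstlen` wide characters.
 * `ps` carries the conversion state from one chunk to the next.
 * Stores the number of input bytes processed in *consumed.
 * Returns the number of wide characters written, or (size_t)-1 if an
 * invalid multi-byte sequence is found.
 */
size_t
mb_chunk_to_wide(wchar_t *dst, size_t dstlen, const char *src, size_t len,
    mbstate_t *ps, size_t *consumed)
{
	size_t in = 0, out = 0;

	while (in < len && out < dstlen) {
		size_t n = mbrtowc(&dst[out], src + in, len - in, ps);

		if (n == (size_t)-1) {
			/* Invalid sequence; mbrtowc() has set errno to EILSEQ. */
			*consumed = in;
			return (size_t)-1;
		}
		if (n == (size_t)-2) {
			/*
			 * The chunk ends inside a character. All remaining
			 * bytes were absorbed into *ps and nothing was stored;
			 * the next chunk continues from that state.
			 */
			in = len;
			break;
		}
		if (n == 0) {
			/*
			 * A NUL was converted and stored as L'\0'.
			 * Assumption: it took one input byte.
			 */
			n = 1;
		}
		in += n;
		out++;
	}
	*consumed = in;
	return out;
}
```

The caller zeroes the mbstate_t once with memset() before the first chunk
and then passes the same object for every following chunk. If fewer bytes
than `len` were consumed, the destination was full and the remaining bytes
must be passed again. Since mbrtowc() never reads past the count it is
given, no terminator has to be appended to the buffer.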