tech-userlevel archive


Re: Unicode programming

On Wed, 05 Oct 2011 15:51:52 -0400
Ken Hornstein <> wrote:

> - Assuming the above is correct ... what do programmers do in terms of
>   parsing things like UTF-8 into Unicode codepoints, since you don't
>   necessarily know that mbrtowc() will give you a Unicode codepoint on
>   some (looks like many) systems.  I guess iconv() looks like something
>   that handles a lot of encodings, and it seems to be lots of places;
>   I'm also aware of icu.  I'm also wondering what people do about things
>   like finding out how many columns a particular series of Unicode codepoints
>   occupies; I know about things like wcswidth(), but again you're not
>   guaranteed that wide characters are Unicode codepoints.

When doing it in C, I used a custom library, but I have not used it in
some time. More recently I have used a higher-level language that
supports Unicode and already includes the conversion facilities (and
more advanced Unicode features than just encoding/decoding).  I did use
iconv from the shell when I needed it, however, and I remember using it
from PHP (I'm not sure whether that one was PHP's own or used libc's,
though)...
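
For illustration only, here is a minimal sketch of the kind of routine a
small custom C library might contain for turning UTF-8 into Unicode
codepoints without relying on what wchar_t means on a given system.  It
is not the library mentioned above; the name utf8_decode and its
interface are hypothetical, chosen just for this example.  It rejects
overlong forms, surrogates and values above U+10FFFF.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one UTF-8 sequence from s (at most n bytes) into *cp.
 * Returns the number of bytes consumed, or 0 if the input is
 * invalid or truncated.  utf8_decode is a hypothetical helper,
 * not part of libc, iconv or ICU.
 */
size_t
utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
	/* Smallest codepoint that legitimately needs 1..4 bytes. */
	static const uint32_t min_for_len[5] = { 0, 0, 0x80, 0x800, 0x10000 };
	size_t len, i;
	uint32_t c;

	if (n == 0)
		return 0;

	/* The lead byte determines the sequence length. */
	if (s[0] < 0x80) {
		len = 1; c = s[0];
	} else if ((s[0] & 0xE0) == 0xC0) {
		len = 2; c = s[0] & 0x1F;
	} else if ((s[0] & 0xF0) == 0xE0) {
		len = 3; c = s[0] & 0x0F;
	} else if ((s[0] & 0xF8) == 0xF0) {
		len = 4; c = s[0] & 0x07;
	} else {
		return 0;	/* stray continuation or invalid lead byte */
	}

	if (len > n)
		return 0;	/* truncated sequence */

	/* Each continuation byte must be 10xxxxxx and adds 6 bits. */
	for (i = 1; i < len; i++) {
		if ((s[i] & 0xC0) != 0x80)
			return 0;
		c = (c << 6) | (s[i] & 0x3F);
	}

	if (c < min_for_len[len])
		return 0;	/* overlong encoding */
	if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
		return 0;	/* out of range or surrogate */

	*cp = c;
	return len;
}
```

A caller would loop over a buffer, advancing by the return value and
treating 0 as a decoding error.  This only covers UTF-8; for other
encodings something like iconv or ICU is still the practical route.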
