wchar_t encoding?

To: <tech-misc%netbsd.org@localhost>
Subject: wchar_t encoding?
From: "Paul Koning" <Paul_Koning%Dell.com@localhost>
Date: Wed, 19 May 2010 11:29:38 -0400

Gents,

I'm working on a patch to gdb 7.1 to make it work on NetBSD.  The issue
is that GDB 7 uses iconv to handle character strings, and uses wide
chars internally so it can handle various non-ASCII scripts.

The trouble on NetBSD is that GDB asks iconv to translate to a character
set named "wchar_t".  That means "whatever the encoding is for the
wchar_t data type".  GNU libiconv supports that name, so on platforms
that use that library things are fine.

NetBSD supports iconv, but it doesn't know the "wchar_t" encoding name.
So I proposed a patch that substitutes what appears to be used instead,
namely UCS-4 in platform native byte order (so "ucs-4le" on x86, for
example).  This seems to work.
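
To make the substitution concrete, here is a minimal sketch of the idea,
not the actual patch.  The helper names wchar_fallback_name() and
open_wchar_converter() are made up for illustration, the "ucs-4be"
spelling is assumed by analogy with "ucs-4le", and the whole thing rests
on the unconfirmed assumption that wchar_t holds UCS-4 in native byte
order:

```c
#include <iconv.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: name of the fallback encoding, ASSUMING wchar_t
   holds UCS-4 in native byte order (the open question in this mail). */
const char *
wchar_fallback_name(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte;

    /* The lowest-addressed byte is 1 only on a little-endian host. */
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1 ? "ucs-4le" : "ucs-4be";
}

/* Hypothetical helper: open a converter from `fromcode` to wide chars.
   Try the "wchar_t" name first; fall back where iconv rejects it. */
iconv_t
open_wchar_converter(const char *fromcode)
{
    iconv_t cd = iconv_open("wchar_t", fromcode);

    if (cd == (iconv_t)-1)
        cd = iconv_open(wchar_fallback_name(), fromcode);
    return cd;   /* still (iconv_t)-1 if both attempts failed */
}
```

Trying "wchar_t" first keeps the behaviour unchanged on platforms whose
iconv already knows that name; the fallback is only reached elsewhere.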

The trouble is that I'm getting pushback on the patch, because of
concerns that the encoding used for wchar_t is not actually UCS-4.  In
particular, there is this article:
http://www.gnu.org/software/libunistring/manual/libunistring.html#The-wchar_005ft-mess
which says that on Solaris and FreeBSD the encoding of wchar_t is
"undocumented and locale dependent".  (Ye gods!)

Now, NetBSD is not FreeBSD... so... what is the answer for NetBSD?  Is
it like FreeBSD?  (If so, it would be good to fix that.)  Or is it a
fixed encoding, and if so, is it indeed UCS-4?

Thanks,
        paul

Follow-Ups:
- Re: wchar_t encoding?
  - From: Neil Booth
- Re: wchar_t encoding?
  - From: Valeriy E. Ushakov
- Re: wchar_t encoding?
  - From: Martin Husemann
