Re: A draft for a multibyte and multi-codepoint C string interface

To: tech-userlevel%NetBSD.org@localhost
Subject: Re: A draft for a multibyte and multi-codepoint C string interface
From: Steffen "Daode" Nurpmeso <sdaoden%gmail.com@localhost>
Date: Tue, 02 Apr 2013 17:09:12 +0200

Mouse <mouse%Rodents-Montreal.ORG@localhost> wrote:
 |Interpreting octets - or chars - as characters is a human-interface
 |thing, and I think it should stay at the human-interface layer.

Maybe the draft wasn't too clear; i guess the interface should be
split into a tc* and a tg* family, i.e., codepoint-wise and
grapheme-wise.  The former simply replaces what the current *w*
family does, but without the byte->wide->byte round-trip, plus
some additions to de/normalize UTF-8 input and to verify input
correctness on explicit request; it knows nothing about combining
etc. boundaries.  The latter will automatically understand those,
too.

The internal implementation can handle both cases largely
unchanged; it's just a matter of which kind of PEEKBOUND is used
(and where this is not true, it should be changed), provided that
no automatic conversion is applied, and i do not propose any.
Of course the tg* family will not work with the initial
implementation, which doesn't know anything about the data
that it is working on.  It's just a draft in the end.
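
To illustrate the "only the PEEKBOUND differs" point, here is a
minimal sketch in C.  All names in it (peekbound_fn, peek_codepoint,
peek_grapheme, count_units) are made up for this example and are not
taken from the draft.  The grapheme rule is deliberately crude: it
only treats U+0300..U+036F as combining, which is an assumption for
brevity and nowhere near real Unicode segmentation.  Input is assumed
to be well-formed UTF-8.

```c
#include <stddef.h>

/* Hypothetical boundary callback: returns the byte length of the
 * next unit at s, never more than len, and 0 only if len is 0. */
typedef size_t (*peekbound_fn)(const char *s, size_t len);

/* Codepoint-wise: length of one UTF-8 sequence, derived from the
 * lead byte.  Invalid lead bytes are stepped over one at a time. */
static size_t peek_codepoint(const char *s, size_t len)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n;

    if (len == 0)
        return 0;
    if (p[0] < 0x80)
        n = 1;
    else if ((p[0] & 0xE0) == 0xC0)
        n = 2;
    else if ((p[0] & 0xF0) == 0xE0)
        n = 3;
    else if ((p[0] & 0xF8) == 0xF0)
        n = 4;
    else
        n = 1;
    return n <= len ? n : len;
}

/* Simplified test: U+0300..U+036F are encoded in UTF-8 as
 * 0xCC 0x80 .. 0xCD 0xAF. */
static int is_combining(const char *s, size_t len)
{
    const unsigned char *p = (const unsigned char *)s;

    if (len < 2)
        return 0;
    return (p[0] == 0xCC && p[1] >= 0x80 && p[1] <= 0xBF) ||
           (p[0] == 0xCD && p[1] >= 0x80 && p[1] <= 0xAF);
}

/* Grapheme-wise: one codepoint plus any combining marks after it. */
static size_t peek_grapheme(const char *s, size_t len)
{
    size_t n = peek_codepoint(s, len);

    while (n < len && is_combining(s + n, len - n))
        n += peek_codepoint(s + n, len - n);
    return n;
}

/* The same loop serves both families; only the callback differs. */
static size_t count_units(const char *s, size_t len, peekbound_fn peek)
{
    size_t count = 0;

    while (len > 0) {
        size_t n = peek(s, len);

        s += n;
        len -= n;
        ++count;
    }
    return count;
}
```

For the four bytes "e", 0xCC, 0x81, "a" (an "e" followed by U+0301
and an "a"), count_units() gives 3 with peek_codepoint and 2 with
peek_grapheme, without any conversion of the data in between.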

 |/~\ The ASCII                           Mouse
 |\ / Ribbon Campaign
 | X  Against HTML              mouse%rodents-montreal.org@localhost
 |/ \ Email!         7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

Thanks,

--steffen
