tech-userlevel archive

Re: [PATCH] Support for mbsnrtowcs and wcsnrtomb



On Fri, Apr 26, 2013 at 10:15:17AM +0200, Antoine LECA wrote:
> +int
> +_citrus_ctype_mbsnrtowcs_fallback(_citrus_ctype_rec_t * __restrict cc,
> +    wchar_t * __restrict pwcs, const char ** __restrict s, size_t in,
> +    size_t n, void * __restrict psenc, size_t * __restrict nresult)
> [...]
> +     err = 0;
> +     cnt = 0;
> +     se = *s + in;
> +     s0 = *s; /* to keep *s unchanged for now, use copy instead. */
> +     while (s0 < se && n > 0) {
> +             err = _citrus_ctype_mbrtowc(cc, pwcs, s0, (size_t)(se - s0),
> +                 psenc, &siz);
> +             if (siz == (size_t)-2)
> +                     err = EILSEQ;
> 
> How can this be correct?
> 
> if mbrtowc returns -2, it means there are not enough characters
> (remaining in the buffer) to complete a conversion; however this is not
> an error condition (yet); and clearly there is no "illegal sequence". In
> such a case, mbs[n]rtowcs should stop at this point, leaving the rest of
> the string for a further ("restartable") call; this is achieved by
> returning *s=s0, i.e. updating the passed-in pointer to point to the
> still-to-convert string; in other words, just
>                       goto bye;
> does the job here.
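
For illustration, the behaviour described in the quoted text (stop at an incomplete trailing sequence without reporting an error) might look roughly like the sketch below. It is built on the standard mbrtowc() rather than the _citrus internals from the patch. The names sketch_mbsnrtowcs and try_utf8_locale are hypothetical, all pointer arguments are assumed to be non-NULL, and restoring the saved mbstate_t on a -2 return is an assumption made here so that the bytes left at *src can be fed again on the next call without being consumed twice.

```c
#include <assert.h>
#include <errno.h>
#include <locale.h>
#include <stddef.h>
#include <wchar.h>

/* Only needed to try the sketch out: pick some UTF-8 locale if one exists. */
static int
try_utf8_locale(void)
{
	static const char *const names[] = { "C.UTF-8", "C.utf8", "en_US.UTF-8" };
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		if (setlocale(LC_CTYPE, names[i]) != NULL)
			return 1;
	return 0;
}

/*
 * Hypothetical helper, not the function from the patch: convert at most
 * `nms` bytes from *src into at most `len` wide characters.
 * Returns 0 or EILSEQ; the number of wide characters goes to *nresult.
 */
static int
sketch_mbsnrtowcs(wchar_t *dst, const char **src, size_t nms, size_t len,
    mbstate_t *ps, size_t *nresult)
{
	const char *s0, *se;
	size_t cnt = 0;
	int err = 0;

	assert(dst != NULL && src != NULL && ps != NULL && nresult != NULL);
	s0 = *src;		/* work on a copy, update *src at the end */
	se = s0 + nms;

	while (s0 < se && len > 0) {
		mbstate_t saved = *ps;	/* snapshot so a partial read can be undone */
		size_t siz = mbrtowc(dst, s0, (size_t)(se - s0), ps);

		if (siz == (size_t)-1) {	/* invalid sequence: a real error */
			err = EILSEQ;
			break;
		}
		if (siz == (size_t)-2) {	/* incomplete: stop, but no error */
			*ps = saved;
			break;
		}
		if (siz == 0) {			/* terminating null was converted */
			s0 = NULL;
			break;
		}
		s0 += siz;
		dst++;
		len--;
		cnt++;
	}
	*src = s0;		/* points at the still-to-convert bytes */
	*nresult = cnt;
	return err;
}
```

With this variant, a buffer ending in the first byte of a two-byte character converts everything before that byte, returns 0, and leaves *src pointing at the partial character. The variant from the patch would return EILSEQ at the same point.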

I'm not sure about that reading of the standard. The Open Group
specification lists this as one condition under which conversion stops
early:

"A sequence of bytes is encountered that does not form a valid
character."

If mbrtowc returned -2, the remaining part of the string is clearly not
a valid character (yet). I'm not saying you are wrong, just that I am
not sure of the exact meaning here. This applies to mbsrtowcs as well.
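
To make the "(yet)" concrete, here is a minimal sketch of what mbrtowc itself reports when a character is split across two calls. It assumes a UTF-8 locale is available under one of the names tried below; try_utf8_locale and feed_split_char are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Try a few common UTF-8 locale names; returns 1 on success, 0 otherwise. */
static int
try_utf8_locale(void)
{
	static const char *const names[] = { "C.UTF-8", "C.utf8", "en_US.UTF-8" };
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		if (setlocale(LC_CTYPE, names[i]) != NULL)
			return 1;
	return 0;
}

/*
 * Feed the two bytes of U+00E9 (0xC3 0xA9 in UTF-8) one at a time,
 * sharing one conversion state between the calls.
 * r[0] and r[1] receive the two mbrtowc() return values.
 */
static void
feed_split_char(size_t r[2], wchar_t *wc)
{
	mbstate_t st;

	assert(r != NULL && wc != NULL);
	memset(&st, 0, sizeof(st));		/* initial conversion state */
	r[0] = mbrtowc(wc, "\xc3", 1, &st);	/* incomplete so far */
	r[1] = mbrtowc(wc, "\xa9", 1, &st);	/* completes the character */
}
```

In a UTF-8 locale the first call should return (size_t)-2 and the second should return 1, so the lone first byte is neither a complete character nor something mbrtowc reports as EILSEQ. Whether mbsrtowcs and mbsnrtowcs should treat that state as "does not form a valid character" is exactly the open question.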

Joerg

