tech-userlevel archive


Re: [PATCH] Support for mbsnrtowcs and wcsnrtomb



SODA Noriyuki wrote:
> It seems the description in the OpenGroup specification has a problem
> about this point.

You really should file an Austin Group bug report then.


> I guess this is because mbsnrtowcs() was glibc
> extension originally, and OpenGroup just copied the glibc specification.
> Note that glibc doesn't support stateful encodings, but ours does.

Well, I am not so sure that "glibc doesn't support stateful encodings"
when I read http://austingroupbugs.net/view.php?id=616,
which is a sequel to http://austingroupbugs.net/view.php?id=601

Basically, this added interpretation (already accepted, and already
implemented) requires that if the input buffer ends with an incomplete
character, the implementation must consume the available part and
record within the mbstate_t object all the information needed to
restart directly at the end of the buffer. This very much looks like a
stateful encoding to me (although not as complex as ISO-2022-*.)
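
To make the point concrete, here is a minimal sketch in C of feeding
input to mbsnrtowcs() in fixed-size pieces while keeping one mbstate_t
across the calls. The names use_utf8_locale() and convert_in_chunks()
are hypothetical, and the locale names tried are an assumption about
what is installed. The loop also copes with an implementation that
leaves a trailing partial character unconsumed instead of recording it
in the state, so it does not rely on either behaviour.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: try a few common UTF-8 locale names (an
 * assumption about the host). Returns 1 on success, 0 otherwise. */
int use_utf8_locale(void)
{
    static const char *names[] = { "C.UTF-8", "en_US.UTF-8", "en_US.utf8" };
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++)
        if (setlocale(LC_CTYPE, names[i]) != NULL)
            return 1;
    return 0;
}

/* Convert in[0..len) to wide characters, handing mbsnrtowcs() at most
 * `chunk` bytes per call and carrying the conversion state in `st`.
 * Returns the number of wide characters stored, or (size_t)-1 on an
 * invalid sequence. */
size_t convert_in_chunks(const char *in, size_t len, size_t chunk,
                         wchar_t *out, size_t outcap)
{
    mbstate_t st;
    const char *p = in, *end = in + len;
    size_t total = 0, window = chunk;

    assert(chunk > 0);
    memset(&st, 0, sizeof st);          /* initial conversion state */
    while (p < end && total < outcap) {
        size_t avail = (size_t)(end - p);
        size_t n = avail < window ? avail : window;
        const char *src = p;
        size_t r = mbsnrtowcs(out + total, &src, n, outcap - total, &st);

        if (r == (size_t)-1)
            return (size_t)-1;          /* EILSEQ: give up */
        total += r;
        if (src == NULL)                /* a terminating NUL was converted */
            break;
        if (src == p) {
            /* No byte consumed: the partial character was left in the
             * buffer rather than stored in st. Widen the window. */
            if (n == avail)
                break;                  /* input is truncated at its end */
            window++;
            continue;
        }
        p = src;                        /* resume where the call stopped */
        window = chunk;
    }
    return total;
}
```

With a UTF-8 locale in effect and chunk set to 1, every multibyte
character is split across calls, yet the result should match a
single-call conversion.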


I notice that this Austin Group interpretation botched the C99/C11
description of the value to be returned in *src when an EILSEQ occurs.
In such a case, under the C99/C11 Standard you can reset the mbstate_t
(since its state is now undefined) and then restart processing from the
updated *src, which holds the pointer "past the last converted
multibyte character", i.e. a pointer to the start of the still
unconverted part.
Under the Austin Group reading, *src is to be updated to the "last byte
processed", which can be anything since the process detected an error;
the potential for restarting is now close to zero. Worse, you cannot
know exactly where you can insert a \0 to transform the original input
string into a valid one.
I understand that implementations followed the same ideas as the Austin
Group commentators and probably did not fully observe the requirements
of the C99/C11 Standard (thus botching the value returned in *src.)
I also understand that the overwhelming majority of programs using
these functions just abort when EILSEQ is detected.
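
For a program that does want to recover, one defensive option is not to
trust *src after the error at all and to locate the offending sequence
with mbrtowc(), one character at a time. The sketch below is only an
illustration under that assumption; use_utf8_locale(),
src_offset_after_error() and first_invalid_offset() are hypothetical
names, and the 64-element output buffer is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: try a few common UTF-8 locale names (an
 * assumption about the host). Returns 1 on success, 0 otherwise. */
int use_utf8_locale(void)
{
    static const char *names[] = { "C.UTF-8", "en_US.UTF-8", "en_US.utf8" };
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++)
        if (setlocale(LC_CTYPE, names[i]) != NULL)
            return 1;
    return 0;
}

/* Report where mbsnrtowcs() leaves *src when it fails, as an offset
 * from s. Returns -1 if the call did not fail. What the offset means
 * is exactly the point under discussion, so callers should treat it as
 * implementation-dependent. */
long src_offset_after_error(const char *s, size_t len)
{
    wchar_t out[64];                    /* arbitrary size for the sketch */
    mbstate_t st;
    const char *src = s;

    memset(&st, 0, sizeof st);
    if (mbsnrtowcs(out, &src, len, 64, &st) != (size_t)-1 || src == NULL)
        return -1;
    return (long)(src - s);
}

/* Walk s[0..len) with mbrtowc() and return the offset of the first
 * character that is invalid or cut short by the end of the buffer, or
 * len if everything converts. Each call starts on a character
 * boundary, so the returned offset is the start of the bad sequence. */
size_t first_invalid_offset(const char *s, size_t len)
{
    mbstate_t st;
    size_t off = 0;

    memset(&st, 0, sizeof st);
    while (off < len) {
        wchar_t wc;
        size_t r = mbrtowc(&wc, s + off, len - off, &st);

        if (r == (size_t)-1 || r == (size_t)-2)
            return off;                 /* invalid or incomplete here */
        off += (r == 0) ? 1 : r;        /* a NUL character is one byte */
    }
    return len;
}
```

A caller could truncate the input at first_invalid_offset() to obtain a
valid prefix, whatever value the failing mbsnrtowcs() call left in
*src.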


Antoine

