Subject: bin/19832: /bin/sh has internationalization issues
To: None <gnats-bugs@gnats.netbsd.org>
From: Martin Husemann <martin@duskware.de>
List: netbsd-bugs
Date: 01/13/2003 10:23:40
>Number:         19832
>Category:       bin
>Synopsis:       /bin/sh has internationalization issues
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jan 13 01:24:00 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:     Martin Husemann
>Release:        NetBSD 1.6L
>Organization:
>Environment:
System: NetBSD night-porter.duskware.de 1.6L NetBSD 1.6L (PORTER) #0: Sat Jan 4 12:45:09 CET 2003 martin@insomnia.duskware.de:/usr/src/sys/arch/i386/compile/PORTER i386
Architecture: i386
Machine: i386
>Description:

The /bin/sh code uses two magic constants generated by mksyntax: PEOF and UPEOF.
They need to represent the same value, but PEOF seems to be an integer, while
UPEOF needs to be that value as a char. UPEOF is never used directly, but PEOF
is, and some macros (also generated by mksyntax) test against UPEOF.

PEOF is used as an out-of-band character in a zero-terminated char* buffer, for
example to mark the end of a here-document. This means PEOF must be != '\0'
and must not be a valid character inside a here-document.

No such character exists, IMHO.

The arbitrary value currently chosen for PEOF is a valid printable character
in some locales on machines where char == unsigned char. It's a non-printable,
non-alpha character if char == signed char in all locales I know of, but there
is no guarantee for this property - and I'm not sure whether this would even
forbid the character from occurring in here-documents.
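
To illustrate the collision, here is a minimal sketch. It is not taken from
the sh sources: SENTINEL is a hypothetical stand-in (0xFF, which is a printable
letter in ISO 8859-1), not the value mksyntax actually generates, and
sentinel_len() and sentinel_truncates() are made-up helpers.

```c
#include <string.h>

/* Hypothetical in-band end marker; NOT the real PEOF value. */
#define SENTINEL 0xFF

/* Count the bytes before the first sentinel byte or the terminating NUL. */
static size_t sentinel_len(const char *buf)
{
    size_t n = 0;

    /* Compare as unsigned char so the result does not depend on
     * whether plain char is signed on this machine. */
    while (buf[n] != '\0' && (unsigned char)buf[n] != SENTINEL)
        n++;
    return n;
}

/* True if a sentinel scan would cut the text short. */
static int sentinel_truncates(const char *buf)
{
    return sentinel_len(buf) < strlen(buf);
}
```

For the bytes "na\xFF" "ve text", strlen() reports 10 while sentinel_len()
reports 2: any text that legitimately contains the marker byte is cut off at
that point.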

>How-To-Repeat:
code inspection

>Fix:
Rototill the code to pass lengths around instead of relying on sentinels?
Maybe do it completely and make it multi-byte character safe?
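
A rough sketch of what passing lengths around could look like, under the
assumption that input is handed over as a pointer plus a byte count. struct
span, SPAN_EOF and span_getc() are hypothetical names, not existing sh code.
End of input is reported as an int outside the unsigned char range, the same
way getc(3) reports EOF, so every byte value (including '\0') stays valid data.

```c
#include <stddef.h>

/* Hypothetical input descriptor: the length travels with the pointer. */
struct span {
    const char *p;      /* next byte to read */
    size_t      left;   /* bytes remaining */
};

/* Out-of-band end marker: cannot equal any unsigned char value. */
#define SPAN_EOF (-1)

/* Return the next byte as 0..UCHAR_MAX, or SPAN_EOF when exhausted. */
static int span_getc(struct span *s)
{
    if (s->left == 0)
        return SPAN_EOF;
    s->left--;
    return (unsigned char)*s->p++;
}
```

This only covers the single-byte case; multi-byte safety would need decoding
on top of it (for example with mbrtowc(3)), which is not shown here.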
>Release-Note:
>Audit-Trail:
>Unformatted: