tech-userlevel archive


Re: vi vs. nvi



>>>>> On Mon, 11 Aug 2008 01:29:40 +0200,
      Lubomir Sedlacik <salo%Xtrmntr.org@localhost> said:

>> According to the source code (*3), it seems OpenSolaris doesn't use
>> strcoll(3)/wcscoll(3), and always compares character code values,
>> although I may be missing something.

> Here goes:
> 
> OS                        LANG           CODESET(*1) result of regexec(3)
> ------------------------- -------------- ----------  --------------------
> Solaris 7                 en_US          ISO8859-1   not match
> Solaris 7                 en_US.UTF-8    UTF-8       match
> Solaris 8                 en_US          ISO8859-1   not match
> Solaris 8                 en_US.UTF-8    UTF-8       match
> Solaris 9                 en_US          ISO8859-1   not match
> Solaris 9                 en_US.UTF-8    UTF-8       match
> Solaris 10 FCS            en_US          ISO8859-1   not match
> Solaris 10 FCS            en_US.UTF-8    UTF-8       not match
> Solaris 10 Update 6 (*)   en_US          ISO8859-1   match
> Solaris 10 Update 6 (*)   en_US.UTF-8    UTF-8       match
> Solaris Nevada b91        en_US          ISO8859-1   match
> Solaris Nevada b91        en_US.UTF-8    UTF-8       match
> OpenSolaris 2008.05 + b94 en_US          ISO8859-1   match
> OpenSolaris 2008.05 + b94 en_US.UTF-8    UTF-8       match
> 
> (*) Not sure when exactly between FCS and U6 this changed.  I could
>     track it down to a patch number later if you want to know.

Hmm, thanks.
So I must be missing something, and newer Solaris releases (including
OpenSolaris) always use collation order for range expressions, even
with Latin-1. ;-/
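
For anyone who wants to repeat this kind of check on another system,
here is a minimal sketch in C.  The table above does not show the
pattern or the subject string that was tested, so the function name
range_matches() and whatever pattern is passed to it are my own
illustration, not the original test program.

```c
#include <assert.h>
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: switch to the named locale, compile `pattern`
 * as an extended RE, and run regexec(3) against `subject`.
 * Returns 1 on match, 0 on no match, and -1 if the locale is not
 * available or the pattern does not compile.
 */
int
range_matches(const char *locale_name, const char *pattern,
    const char *subject)
{
	regex_t re;
	int result;

	assert(locale_name != NULL && pattern != NULL && subject != NULL);

	/* The locale must be set before regcomp(3) sees the pattern. */
	if (setlocale(LC_ALL, locale_name) == NULL)
		return -1;
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;

	result = (regexec(&re, subject, 0, NULL, 0) == 0) ? 1 : 0;
	regfree(&re);
	return result;
}
```

For example, range_matches("en_US.UTF-8", "^[a-z]$", "B") reports
whether an uppercase letter falls inside [a-z] in that locale.  The
answer depends on the libc, and a return value of -1 usually just
means the locale is not installed.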

BTW, it seems the following "unspecified behavior" was introduced
in SUSv3:

>>>>> On Fri, Aug 08, 2008 at 09:10:30PM +0900, SODA Noriyuki said:
> http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_03_05
> 9.3.5 RE Bracket Expression
> In the POSIX locale, a range expression represents the set of
> collating elements that fall between two elements in the collation
> sequence, inclusive. In other locales, a range expression has
> unspecified behavior: strictly conforming applications shall not rely
> on whether the range expression is valid, or on the set of collating
> elements matched.

SUSv2 required collation order, without exception:

    http://www.opengroup.org/onlinepubs/007908799/xbd/re.html

    A range expression represents the set of collating elements that
    fall between two elements in the current collation sequence,
    inclusively. It is expressed as the starting point and the ending
    point separated by a hyphen (-).

    Range expressions must not be used in portable applications
    because their behaviour is dependent on the collating
    sequence. Ranges will be treated according to the current
    collating sequence, and include such characters that fall within
    the range based on that collating sequence, regardless of
    character values. This, however, means that the interpretation
    will differ depending on collating sequence.

Maybe this change between SUSv2 and SUSv3 was made for compatibility
with Linux, since the following technical reports describe conflicts
between SUS and the Linux Standard Base:

    http://www.opengroup.org/personal/ajosey/tr28-07-2003.txt
    http://www.opengroup.org/personal/ajosey/tr11-11-2005.txt

    Range expression (such as [a-z]) can be based on code point order
    instead of collating element order.

And the LSB itself contains this specification:

    http://refspecs.freestandards.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic.html

    19.2. Regular Expressions
    Range expression (such as [a-z]) can be based on code point order
    instead of collating element order.
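
To make the difference between the two orderings concrete, here is a
rough sketch in C of both membership tests for a single-byte
character.  This is only an illustration under simplifying
assumptions: the function names are hypothetical, and a real regex
implementation also has to handle multi-character collating elements
and equivalence classes, which this ignores.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: select the collation locale; 1 on success. */
int
set_collation_locale(const char *name)
{
	return setlocale(LC_COLLATE, name) != NULL;
}

/* Code point order: compare the raw byte values. */
int
in_range_codepoint(unsigned char lo, unsigned char hi, unsigned char c)
{
	return lo <= c && c <= hi;
}

/* Collation order: compare with strcoll(3) under the current LC_COLLATE. */
int
in_range_collation(char lo, char hi, char c)
{
	char l[2] = { lo, '\0' };
	char h[2] = { hi, '\0' };
	char s[2] = { c, '\0' };

	/* An empty string would not be a meaningful range endpoint. */
	assert(lo != '\0' && hi != '\0' && c != '\0');

	return strcoll(l, s) <= 0 && strcoll(s, h) <= 0;
}
```

In the "C" locale the two functions agree.  In a locale whose
collation interleaves upper and lower case, in_range_collation('a',
'z', 'B') may return 1 while in_range_codepoint('a', 'z', 'B')
returns 0.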

>>>>> On Mon, 11 Aug 2008 07:55:13 +1000,
      Daniel Carosone <dan%geek.com.au@localhost> said:

> If so, then please, please, please let's not do that.

I interpret your request as "Please make NetBSD behave like Linux
instead of Solaris". :-)
And I think that's certainly better, at least at first.
-- 
soda

