tech-userlevel archive


Re: printf(1), sh(1), POSIX.2 and octal escape sequences



On 28.06.2023 at 12:57, tlaronde%polynum.com@localhost wrote:
> But isn't it incorrect? POSIX 2018 says:
>
> '"\ddd", where ddd is a one, two, or three-digit octal number, shall be
> written as a byte with the numeric value specified by the octal number.'

The main intended takeaway from this sentence is that \0000 is not a
single escape sequence but rather the escape sequence '\000' followed by
the digit '0'. That differs from the hex escape sequence introduced
by C90, which allows an arbitrary number of digits, so '\x0000000012'
forms a single escape sequence.
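
To make the difference concrete, here is a minimal sketch in Python
that uses two regular expressions as stand-ins for the lexer rules. The
names OCTAL_ESCAPE, HEX_ESCAPE and lexeme are made up for this
illustration and are not taken from any printf(1) implementation.

```python
import re

# Hypothetical lexer rules, for illustration only.
# Octal: a backslash followed by one to three octal digits.
OCTAL_ESCAPE = re.compile(r"\\[0-7]{1,3}")
# Hex (C-style): a backslash, 'x', then any number of hex digits.
HEX_ESCAPE = re.compile(r"\\x[0-9A-Fa-f]+")


def lexeme(pattern, text):
    """Return the longest prefix of text matched by pattern, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None


# The octal rule stops after three digits; the fourth '0' is left over.
print(lexeme(OCTAL_ESCAPE, "\\0000"))
# The hex rule consumes every hex digit that follows.
print(lexeme(HEX_ESCAPE, "\\x0000000012"))
```

Under these assumed rules the first call yields '\000' and leaves one
'0' unconsumed, while the second call yields the whole input.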

That sentence defines that '\778' is parsed as '\77' followed by the
digit '8', as '8' is not an octal digit.

That sentence also says that '\777' is parsed as a single escape
sequence (due to the common lexer rule that, at each step, the longest
possible token is matched), as '777' is a syntactically valid octal
number. Range constraints are usually not expressed in the grammar;
they are left to another layer of the parser or interpreter instead.

So '\778' should be parsed as '\77' followed by '8', and '\777' should
be parsed as '\777' and then rejected as out of range, just as a port
number of 70000 is rejected.
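
A rough sketch of that two-layer approach, in Python: the lexing step
takes the longest run of at most three octal digits, and a separate
step checks the range. The function name parse_octal_escape and its
return convention are hypothetical, and raising an error for '\777' is
the behaviour argued for here, not necessarily what any existing
printf(1) does.

```python
OCTAL_DIGITS = "01234567"


def parse_octal_escape(s, i=0):
    """Parse an octal escape that starts with the backslash at s[i].

    Returns (byte_value, index_after_escape).
    Raises ValueError if no octal digit follows the backslash or if
    the value does not fit in a byte.
    """
    if s[i] != "\\":
        raise ValueError("expected a backslash")
    j = i + 1
    # Lexing layer: longest match, but never more than three digits.
    while j < len(s) and (j - i - 1) < 3 and s[j] in OCTAL_DIGITS:
        j += 1
    digits = s[i + 1:j]
    if not digits:
        raise ValueError("no octal digits after backslash")
    value = int(digits, 8)
    # Range layer: a byte holds 0..255, i.e. octal 000..377.
    if value > 0o377:
        raise ValueError("octal escape \\%s is out of range" % digits)
    return value, j


# '\778': '\77' is the escape, '8' remains at index 3.
print(parse_octal_escape("\\778"))
# '\0000': '\000' is the escape, the last '0' remains at index 4.
print(parse_octal_escape("\\0000"))
```

With this sketch, '\777' is consumed as one token by the loop and only
then rejected by the range check, which mirrors the port number case.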

Roland


