On Wed, Jun 28, 2023 at 12:45:55PM -0400, Mouse wrote:
> >>> "\ddd", where ddd is a one, two, or three-digit octal number, shall
> >>> be written as a byte with the numeric value specified by the octal
> >>> number."
> >> [...]
> > I beg to differ: since due to this very unfortunate "variable length"
> > feature, your scanner has to read char by char, it can reject the
> > third digit since it would yield an out of range byte value.
> Is the size of a `byte' specified anywhere?

Not in C, which allows just about anything
(incl. your DSP byte=char=short=int=32 bits example),
but POSIX defines bytes as 8-bit; quoth Issue 8 Draft 3
(this is long-standing, I just have that PDF open), XRAT A.3:
123777 Byte
123778 The restriction that a byte is now exactly eight bits was a conscious decision by the standard
123779 developers. It came about due to a combination of factors, primarily the use of the type int8_t
123780 within the networking functions and the alignment with the ISO/IEC 9899: 1999 standard,
123781 where the intN_t types were first defined.
123782 According to the ISO/IEC 9899: 1999 standard:
123783 The [u]intN_t types must be two’s complement with no padding bits and no illegal values.
123784 All types (apart from bit fields, which are not relevant here) must occupy an integral
123785 number of bytes.
123786 If a type with width W occupies B bytes with C bits per byte (C is the value of
123787 {CHAR_BIT}), then it has P padding bits where P+W=B∗C.
123788 Therefore, for int8_t P=0, W=8. Since B≥1, C≥8, the only solution is B=1, C=8.
123789 The standard developers also felt that this was not an undue restriction for the current state-of-
123790 the-art for this version of the standard, but recognize that if industry trends continue, a wider
123791 character type may be required in the future.

And similarly XBD, <limits.h> says
10172 {CHAR_BIT}
10173 Number of bits in a type char.
10174 CX Value: 8
(where "CX" shading indicates "Extension to the ISO C standard").
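
For illustration only, here is a rough sketch of the scanner behaviour
argued for upthread, assuming 8-bit bytes as POSIX requires. The name
parse_octal_escape is made up for this example, and whether stopping
before an out-of-range third digit is what the standard intends is
exactly the point under discussion, so take it as one reading rather
than the answer:

```c
#include <limits.h>
#include <stddef.h>

/* POSIX requires CHAR_BIT == 8; ISO C alone only guarantees >= 8. */
_Static_assert(CHAR_BIT == 8, "this sketch assumes POSIX 8-bit bytes");

/*
 * parse_octal_escape (hypothetical helper, not from any standard):
 * s points just past the backslash.  Consumes one to three octal
 * digits, but refuses a digit that would push the value past
 * UCHAR_MAX.  Stores the byte in *out and returns the number of
 * digits consumed (0 if s does not start with an octal digit, in
 * which case *out is left untouched).
 */
static size_t
parse_octal_escape(const char *s, unsigned char *out)
{
	unsigned value = 0;
	size_t n = 0;

	while (n < 3 && s[n] >= '0' && s[n] <= '7') {
		unsigned next = value * 8 + (unsigned)(s[n] - '0');

		if (next > UCHAR_MAX)
			break;	/* leave this digit for the caller */
		value = next;
		n++;
	}
	if (n > 0)
		*out = (unsigned char)value;
	return n;
}
```

Under this reading "\400" would consume only "40" (a space) and leave
the trailing "0" as an ordinary character, while "\377" consumes all
three digits. On a platform where the _Static_assert fails, the
UCHAR_MAX comparison would still hold, but the three-digit limit would
never reach it.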

Funnily, one place in the teletype definitions still uses "bits per byte"
instead of "bits per character" as a historical artifact.
uudecode is defined as undefined if the encoder and decoder have
different byte widths.

Best,
наб