NetBSD-Bugs archive


Re: bin/59766: awk does not handle RS="\0"



The following reply was made to PR bin/59766; it has been noted by GNATS.

From: RVP <rvp%SDF.ORG@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: bin/59766: awk does not handle RS="\0"
Date: Tue, 18 Nov 2025 05:55:32 +0000 (UTC)

 On Mon, 17 Nov 2025, Martin Neitzel via gnats wrote:
 
 > (2) awk's RS has a special meaning when it "is NULL":  paragraphs
 > (separated by empty lines) become the records, lines the fields.
 > This was historically new with nawk ("the one true awk"), and
 > POSIX demands it, too:
 > [...]
 > An   RS=""   is the canonical way to set this within a script, and
 > I'd assume an   RS="\0"  to act not any different.
 >
 
 Yeah. You'd have to distinguish between `RS=""', `RS="\0"' and `RS="[\0]"', at least.
 Possibly also `RS="\0\0"', `RS="\0\0\0"', etc. Then there's assigning to RS from
 another variable. It doesn't look simple without a major rework of awk's innards.
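
For reference, the paragraph mode that `RS=""' selects can be seen with a small sketch. The sample input and the variable name `paragraphs` are made up for illustration; the behaviour shown is only the POSIX-specified empty-RS case, not anything about `RS="\0"':

```shell
# Paragraph mode: with RS="" the blank-line-separated blocks become the
# records, and newline acts as a field separator in addition to FS.
paragraphs=$(printf 'a b\nc\n\nd\ne f\n' |
    awk 'BEGIN { RS = "" } { printf "%d:%d:%s\n", NR, NF, $1 }')

# Each output line is "record number:field count:first field".
printf '%s\n' "$paragraphs"
```

Both paragraphs come out as single records with three fields each, which is why an empty RS cannot simply be treated as "split on NUL".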
 
 But there's another way to do this right now: use multi-char (i.e. regex)
 delimiters:
 
 ```
 $ find . -type f -exec printf '%s__DelimiTEr__' {} + |
      awk -vRS=__DelimiTEr__ '{ printf ">%s<\n", $0 }'
 
 $ printf '%s__DelimiTEr__' $'hello\n\nworld\n\n' '1' '?' '!' '$$' '*' |
      awk -vRS=__DelimiTEr__ '{ printf ">%s<\n", $0 }'
 ```
 
 Any unique string would do (SHA512 hashes, UUID strings, ...).
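
A minimal sketch of generating such a delimiter on the fly, assuming `/dev/urandom` and `od` are available and that the awk in use treats a multi-character RS as a regex (as in the examples above). The names `delim` and `out` are hypothetical. Hex digits and underscores are used so that the string contains no regex metacharacters:

```shell
# Build a random delimiter: 16 random bytes as 32 hex digits, wrapped in "__".
delim="__$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')__"

# Terminate each item with the delimiter and let awk split on it.
# Embedding $delim in the printf format is safe here because it holds
# only hex digits and underscores (no "%" or backslash).
out=$(printf "%s$delim" 'hello' 'a b' '*' |
    awk -v RS="$delim" '{ printf ">%s<\n", $0 }')

printf '%s\n' "$out"
```

A UUID would also work, since `-` is not special in a regex outside a bracket expression. Whatever is chosen must not be able to occur in the data itself.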
 
 -RVP
 

