NetBSD-Bugs archive
Re: bin/59766: awk does not handle RS="\0"
The following reply was made to PR bin/59766; it has been noted by GNATS.
From: RVP <rvp%SDF.ORG@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc:
Subject: Re: bin/59766: awk does not handle RS="\0"
Date: Tue, 18 Nov 2025 05:55:32 +0000 (UTC)
On Mon, 17 Nov 2025, Martin Neitzel via gnats wrote:
> (2) awk's RS has a special meaning when it "is NULL": paragraphs
> (separated by empty lines) become the records, lines the fields.
> This was historically new with nawk ("the one true awk"), and
> POSIX demands it, too:
> [...]
> An RS="" is the canonical way to set this within a script, and
> I'd assume an RS="\0" to act not any different.
>
Yeah. You'd have to distinguish between `RS=""', `RS="\0"' and `RS="[\0]"', at least,
and possibly also `RS="\0\0"', `RS="\0\0\0"', etc. Then there's assigning to RS from
another variable. It doesn't look simple without major reworking of awk's innards.
But there's another way to do this right now: use multi-character (i.e. regex)
delimiters:
```
$ find . -type f -exec printf '%s__DelimiTEr__' {} + |
    awk -vRS=__DelimiTEr__ '{ printf ">%s<\n", $0 }'

$ printf '%s__DelimiTEr__' $'hello\n\nworld\n\n' '1' '?' '!' '$$' '*' |
    awk -vRS=__DelimiTEr__ '{ printf ">%s<\n", $0 }'
```
Any string that won't occur in the data itself would do (SHA512 hashes, UUID strings, ...).
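
As a rough sketch of that idea, the delimiter could be generated per run rather than hard-coded. Everything below is an assumption for illustration: the `SEP` prefix, the `split_records` helper, and the use of od(1) on /dev/urandom (uuidgen(1) would do equally well where it is installed). It also relies on awk treating a multi-character RS as a regex, which POSIX leaves unspecified.

```shell
# Build a throwaway delimiter: "SEP" plus 32 random hex digits.
# Assumption: od(1), tr(1) and /dev/urandom are available.
delim="SEP$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')"

# Hypothetical helper: terminate each argument with the delimiter,
# then let awk split the stream back into one record per argument.
# The delimiter contains only letters and digits, so it has no
# special meaning as a regex or as a printf format.
split_records() {
    printf "%s$delim" "$@" |
        awk -v RS="$delim" '{ printf ">%s<\n", $0 }'
}

split_records 'hello
world' 'a b' '*'
```

Embedded newlines survive inside a record because the newline is no longer the record separator.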
-RVP