NetBSD-Bugs archive


Re: bin/57253: xargs wraps lines after ~4k characters



> In the case of xargs, POSIX actually requires it:
> 
> 	The generated command line length shall be the sum of the size
> 	in bytes of the utility name and each argument treated as strings,
> 	including a null byte terminator for each of these strings.
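
The length rule quoted above can be illustrated with a small sketch. The
function name command_line_length is hypothetical, and UTF-8 is assumed only
to turn Python strings into bytes; POSIX itself only speaks of "size in bytes".

```python
def command_line_length(utility, args):
    """Return the command line length as described in the quoted rule:
    the byte size of the utility name and of each argument, plus one
    null terminator byte per string."""
    strings = [utility, *args]
    # UTF-8 is an assumption made for this sketch.
    return sum(len(s.encode("utf-8")) + 1 for s in strings)


if __name__ == "__main__":
    # "echo\0" is 5 bytes, "hello\0" and "world\0" are 6 bytes each.
    print(command_line_length("echo", ["hello", "world"]))  # prints 17
```

Note that every extra argument costs one byte more than its visible length,
so many short arguments fill the budget faster than their text suggests.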

Just for my understanding: how would it break any compatibility to push the
limits beyond those set now? I cannot foresee how a shell script written
according to the POSIX manuals would break if the limits were extended. Such
a script could never have run successfully beyond the limit in force when it
was developed, so it never exceeds that limit anyway. The only side
effect I see is that scripts written beyond the limits would now accidentally
become functional. No harm here.

> If you chase enough definitions in XCU 1.2 (Utility Limits) you'll
> discover that the minimum value for LINE_MAX is 2048.  ARG_MAX is
> defined in the specification of <limits.h> in XSH 14.   That ends up
> being 4096 (minimum).

Meaning: a minimum is not something that must not be exceeded per se.

> Interfaces volume of POSIX.1-202x) shall not exceed {ARG_MAX}-2048
> bytes. Within this constraint, if neither the -n nor the -s option
> is specified, the default command line length shall be at least
> {LINE_MAX}.

However, if POSIX strictly demands the limits you mentioned, why is
"kern.argmax" not set to one of those limits but to a value beyond them?
I guess userland programs need to catch up.
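
The arithmetic behind the quoted limits can be sketched as follows. The
constants are the minimum values given in the quoted message; the helper
names are hypothetical, and os.sysconf("SC_ARG_MAX") is assumed to be
available, which holds on typical Unix-like systems but not everywhere.

```python
import os

POSIX_ARG_MAX_MINIMUM = 4096   # minimum ARG_MAX per the quoted message
POSIX_LINE_MAX_MINIMUM = 2048  # minimum LINE_MAX per the quoted message
XARGS_HEADROOM = 2048          # the "{ARG_MAX}-2048" in the quoted text


def xargs_ceiling(arg_max):
    """Upper bound xargs must keep arguments plus environment under."""
    return arg_max - XARGS_HEADROOM


def system_arg_max():
    """Return this system's ARG_MAX, or None if it cannot be queried."""
    try:
        value = os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        return None
    return value if value > 0 else None


if __name__ == "__main__":
    # With the smallest permitted ARG_MAX the ceiling equals LINE_MAX's minimum.
    print("portable ceiling:", xargs_ceiling(POSIX_ARG_MAX_MINIMUM))
    local = system_arg_max()
    if local is not None:
        print("local ARG_MAX:", local, "local ceiling:", xargs_ceiling(local))
```

On a system that offers only the minimum ARG_MAX, the ceiling of 2048 bytes
is exactly the minimum LINE_MAX, which is why a portable script cannot count
on more than that, even though a given system may allow far more.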

> If you want to write a portable script, you need to expect those limits
> on some systems.

Absolutely true. As we see, it is really messy to write portable shell
scripts, even with standard tools only, because those tools (POSIX compliant
or not) behave differently due to a botched standard (e.g. 'wc'). According
to my tests on openSUSE, FreeBSD, NetBSD and macOS, NetBSD has the narrowest
limits for processing one big line in the shell, as was mentioned earlier in
this thread. That is a pity, not for me but for the people who use NetBSD
(as their daily driver).

>  I doubt our xargs limits anything to 4K, I think we
> get quite close to the implementation limit (much more than you are
> entitled to rely upon) - but do note that what is in the environment
> counts, if you have lots of stuff there, xargs will make shorter commands
> (less args) than if you have less in the environment.   Both the number
> of environment variables, and the length of each of those (NAME=value)
> matters.

As rvp revealed, it is not xargs per se but /bin/echo, which emits a \n after
a certain chunk of data, until the buffer is empty.

> Both the number
> of environment variables, and the length of each of those (NAME=value)
> matters.

Unfortunately yes. They are not allocated dynamically up to the architecture
maximum, which would be desirable nowadays, but are fixed at some value for
historical reasons.
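
How the environment eats into the space left for arguments can be sketched
like this. It is a simplified model under stated assumptions: each variable
is counted as the bytes of NAME=value plus a null terminator, the helper
names are hypothetical, and a real kernel may count additional overhead
(for example pointer arrays) that is ignored here.

```python
def environment_size(env):
    """Bytes used by the environment: each NAME=value string plus its
    null terminator (simplified; real accounting may include more)."""
    return sum(
        len(f"{name}={value}".encode("utf-8")) + 1
        for name, value in env.items()
    )


def remaining_for_arguments(arg_max, env):
    """Space left for the command line once the 2048-byte headroom from
    the quoted text and the environment have been subtracted."""
    return max(0, arg_max - 2048 - environment_size(env))


if __name__ == "__main__":
    small_env = {"A": "b"}                    # "A=b\0" is 4 bytes
    large_env = {"A": "b", "PATH": "x" * 500}
    # 4096 is the minimum ARG_MAX from the quoted message.
    print(remaining_for_arguments(4096, small_env))
    print(remaining_for_arguments(4096, large_env))
```

Both more variables and longer values shrink the result, which matches the
quoted remark that xargs builds shorter commands when the environment is
large.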

Please consider my remarks about "wc" and this issue once again.

Sincerely
Marc.

On Thursday, 2 March 2023, 14:24:25 CET, Robert Elz wrote:
>     Date:        Thu,  2 Mar 2023 11:45:02 +0000 (UTC)
>     From:        Marc Daniel Fege <marc%fege.net@localhost>
>     Message-ID:  <20230302114502.386241A923B%mollari.NetBSD.org@localhost>
> 
>   |  Indeed, some limit will be there anyway. But do those limits need to be
>   |  (artificially) defined in the userland programs themselves to handle an
>   |  exception that would otherwise come up in the programming language or library?
> 
> In the case of xargs, POSIX actually requires it:
> 
> 	The generated command line length shall be the sum of the size
> 	in bytes of the utility name and each argument treated as strings,
> 	including a null byte terminator for each of these strings. The
> 	xargs utility shall limit the command line length
> 
> "shall" in POSIX means it is an absolute requirement - that must happen.
> 
> 	such that when the command line is invoked, the combined argument
> 	and environment lists (see the exec family of functions in the System
> 	Interfaces volume of POSIX.1-202x) shall not exceed {ARG_MAX}-2048
> 	bytes. Within this constraint, if neither the -n nor the -s option
> 	is specified, the default command line length shall be at least
> 	{LINE_MAX}.
> 
> If you chase enough definitions in XCU 1.2 (Utility Limits) you'll
> discover that the minimum value for LINE_MAX is 2048.  ARG_MAX is
> defined in the specification of <limits.h> in XSH 14.   That ends up
> being 4096 (minimum).
> 
> If you want to write a portable script, you need to expect those limits
> on some systems.   I doubt our xargs limits anything to 4K, I think we
> get quite close to the implementation limit (much more than you are
> entitled to rely upon) - but do note that what is in the environment
> counts, if you have lots of stuff there, xargs will make shorter commands
> (less args) than if you have less in the environment.   Both the number
> of environment variables, and the length of each of those (NAME=value)
> matters.
> 
> kre



