NetBSD-Bugs archive


Re: bin/57253: xargs wraps lines after ~4k characters



The following reply was made to PR bin/57253; it has been noted by GNATS.

From: Marc Daniel Fege <marc%fege.net@localhost>
To: gnats-bugs%netbsd.org@localhost, Robert Elz <kre%munnari.oz.au@localhost>
Cc: rvp%sdf.org@localhost, netbsd-bugs%netbsd.org@localhost
Subject: Re: bin/57253: xargs wraps lines after ~4k characters
Date: Thu, 02 Mar 2023 16:23:26 +0100

 > In the case of xargs, POSIX actually requires it:
 >
 > 	The generated command line length shall be the sum of the size
 > 	in bytes of the utility name and each argument treated as strings,
 > 	including a null byte terminator for each of these strings.
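 
 The length rule quoted above can be sketched in a few lines. This is only
 an illustration of the accounting, not how any particular xargs is
 implemented; the function name command_line_length is made up for this
 example.

```python
import os


def command_line_length(utility, args):
    """Bytes used by the utility name and each argument, counting one
    NUL terminator per string, as in the quoted POSIX wording."""
    strings = [utility, *args]
    return sum(len(os.fsencode(s)) + 1 for s in strings)


# "echo" (4+1) + "hello" (5+1) + "world" (5+1)
print(command_line_length("echo", ["hello", "world"]))  # 17
```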
 
 Just for understanding: how would pushing the limits beyond those set now
 break any compatibility? I cannot foresee how a shell script written
 according to the POSIX manuals would break if the limits were extended. A
 script that could not run successfully beyond the limit in force when it
 was developed would never exceed that limit anyway. The only side effect
 I see is that scripts written beyond the limits would now accidentally
 become functional. No harm here.
 
 > If you chase enough definitions in XCU 1.2 (Utility Limits) you'll
 > discover that the minimum value for LINE_MAX is 2048.  ARG_MAX is
 > defined in the specification of <limits.h> in XSH 14.   That ends up
 > being 4096 (minimum).
 
 Meaning: a minimum is not, per se, something that must not be exceeded.
 
 > Interfaces volume of POSIX.1-202x) shall not exceed {ARG_MAX}-2048
 > bytes. Within this constraint, if neither the -n nor the -s option
 > is specified, the default command line length shall be at least
 > {LINE_MAX}.
 
 However, if POSIX strictly demands the limits you mentioned, why is
 "kern.argmax" set not to one of those limits but beyond them?
 I guess userland programs need to catch up.
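 
 For reference, the limits under discussion can be queried at run time. The
 sketch below assumes a Unix-like system where Python's os.sysconf knows the
 names SC_ARG_MAX and SC_LINE_MAX; if a value is unavailable it falls back
 to the POSIX minimums quoted above (4096 and 2048). The helper names are
 made up for this example.

```python
import os

POSIX_ARG_MAX_MIN = 4096   # minimum ARG_MAX cited in the thread
POSIX_LINE_MAX_MIN = 2048  # minimum LINE_MAX cited in the thread


def limit(name, fallback):
    """Return a sysconf limit, or the fallback if it is unknown or indeterminate."""
    try:
        value = os.sysconf(name)
    except (ValueError, OSError):
        return fallback
    return value if value > 0 else fallback


def xargs_bounds():
    """Bounds from the quoted text: at most ARG_MAX - 2048 for arguments
    plus environment, and a default command line of at least LINE_MAX."""
    arg_max = limit("SC_ARG_MAX", POSIX_ARG_MAX_MIN)
    line_max = limit("SC_LINE_MAX", POSIX_LINE_MAX_MIN)
    return {"arg_max": arg_max, "line_max": line_max, "upper": arg_max - 2048}


print(xargs_bounds())
```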
 
 > If you want to write a portable script, you need to expect those limits
 > on some systems.
 
 Absolutely true. As we see, it is really messy to write portable shell
 scripts when even the tools, POSIX compliant or not, behave differently
 due to a botched standard (e.g. 'wc'). According to my tests on
 openSUSE, FreeBSD, NetBSD and macOS, NetBSD has the narrowest limits for
 processing one big line in the shell, as was mentioned earlier in this
 thread. That is a pity, not for me but for the people who use NetBSD (as
 their daily driver).
 
 >  I doubt our xargs limits anything to 4K, I think we
 > get quite close to the implementation limit (much more than you are
 > entitled to rely upon) - but do note that what is in the environment
 > counts, if you have lots of stuff there, xargs will make shorter commands
 > (less args) than if you have less in the environment.   Both the number
 > of environment variables, and the length of each of those (NAME=value)
 > matters.
 
 As rvp revealed, it is not xargs per se but /bin/echo that emits a \n
 after a certain chunk of data, until the buffer is empty.
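 
 One way to picture this: if the input is split into size-limited batches
 and each batch is handed to its own echo, every batch ends with a newline.
 The sketch below is an assumption about the mechanism, not NetBSD's xargs
 code; batch_arguments and max_bytes are names invented for this example.

```python
import os


def batch_arguments(utility, args, max_bytes):
    """Greedily group args so that utility + args (each with a NUL
    terminator) stays within max_bytes per batch. An argument that is too
    large on its own is still placed alone in a batch (simplification)."""
    base = len(os.fsencode(utility)) + 1
    batches, current, size = [], [], base
    for arg in args:
        cost = len(os.fsencode(arg)) + 1
        if current and size + cost > max_bytes:
            batches.append(current)
            current, size = [], base
        current.append(arg)
        size += cost
    if current:
        batches.append(current)
    return batches


# Each batch stands for one echo invocation, hence one output line.
for batch in batch_arguments("echo", ["aaa", "bbb", "ccc"], 13):
    print(" ".join(batch))
```

 With the values above, "aaa bbb" and "ccc" end up on separate lines, which
 looks like wrapping although nothing inside a single line was broken.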
 
 > Both the number
 > of environment variables, and the length of each of those (NAME=value)
 > matters.
 
 Unfortunately yes. They are not allocated dynamically up to the
 architecture maximum, which would be desirable nowadays, but are fixed at
 some value for historical reasons.
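 
 How much the environment eats into the budget can be estimated roughly as
 follows. This counts only the NAME=value strings plus their NUL
 terminators; a kernel may account for more (for example pointers), so
 treat the result as an approximation. The function names and the headroom
 parameter are made up for this example.

```python
import os


def environment_size(environ=None):
    """Approximate bytes used by the environment: NAME + '=' + value + NUL."""
    if environ is None:
        environ = os.environ
    return sum(
        len(os.fsencode(name)) + 1 + len(os.fsencode(value)) + 1
        for name, value in environ.items()
    )


def remaining_budget(arg_max, environ=None, headroom=2048):
    """Bytes left for arguments after the environment and the 2048-byte
    headroom from the quoted POSIX text are subtracted from arg_max."""
    return arg_max - headroom - environment_size(environ)


print(environment_size({"A": "b"}))        # 4, i.e. "A=b" plus NUL
print(remaining_budget(4096, {"A": "b"}))  # 2044
```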
 
 Please consider once again my remarks about "wc" and this issue.
 
 Sincerely
 Marc.
 
 On Thursday, 2 March 2023, 14:24:25 CET, Robert Elz wrote:
 >     Date:        Thu,  2 Mar 2023 11:45:02 +0000 (UTC)
 >     From:        Marc Daniel Fege <marc%fege.net@localhost>
 >     Message-ID:  <20230302114502.386241A923B%mollari.NetBSD.org@localhost>
 >
 >   |  Indeed, some limit will be there anyway. But do those limits need to be
 >   |  (artificially) defined in the userland programs themselves to handle an
 >   |  otherwise coming up exception of the programming language or library?
 >=20
 > In the case of xargs, POSIX actually requires it:
 >
 > 	The generated command line length shall be the sum of the size
 > 	in bytes of the utility name and each argument treated as strings,
 > 	including a null byte terminator for each of these strings. The
 > 	xargs utility shall limit the command line length
 >
 > "shall" in POSIX means it is an absolute requirement - that must happen.
 >
 > 	such that when the command line is invoked, the combined argument
 > 	and environment lists (see the exec family of functions in the System
 > 	Interfaces volume of POSIX.1-202x) shall not exceed {ARG_MAX}-2048
 > 	bytes. Within this constraint, if neither the -n nor the -s option
 > 	is specified, the default command line length shall be at least
 > 	{LINE_MAX}.
 >
 > If you chase enough definitions in XCU 1.2 (Utility Limits) you'll
 > discover that the minimum value for LINE_MAX is 2048.  ARG_MAX is
 > defined in the specification of <limits.h> in XSH 14.   That ends up
 > being 4096 (minimum).
 >
 > If you want to write a portable script, you need to expect those limits
 > on some systems.   I doubt our xargs limits anything to 4K, I think we
 > get quite close to the implementation limit (much more than you are
 > entitled to rely upon) - but do note that what is in the environment
 > counts, if you have lots of stuff there, xargs will make shorter commands
 > (less args) than if you have less in the environment.   Both the number
 > of environment variables, and the length of each of those (NAME=value)
 > matters.
 >
 > kre
 
 


