tech-security archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: strscpy



> Date: Sat, 30 May 2020 08:52:39 +0200
> From: Maxime Villard <max%m00nbsd.net@localhost>
> 
> Le 29/05/2020 à 21:34, Taylor R Campbell a écrit :
> > We shouldn't use the same name if we're implementing different
> > semantics.
> > 
> > Do we actually need it to return the length?
> 
> Not returning it would be widening the difference in semantics, but
> see below:

Yes, I am suggesting that if we change the semantics at all -- even
just to change the return values -- we should avoid reusing the name.

> > There are over 5000 strlcpy calls in tree.  At most 250 of them (many
> > of these false positives) use the return value at all.
> > 
> > I searched through the cases that might use the return value to find
> > how any of them use it.  Every case I found, except for the definition
> > of strlcat and a libevent regression test for strlcpy itself, uses the
> > return value at most to detect truncation and for no other purpose.
> > (sockaddr_format returns whatever strlcpy returns, but nothing uses
> > its return value.)

Correction -- I made an error in the grep patterns for that search
(which I didn't write down and have since forgotten); I searched again
and found few of instances of strlcpy that use the returned length
nontrivially, with the following grep pattern:

	git grep -w strlcpy | grep -v '^[^:]*:[ 	]*strlcpy' | grep -v '(void) *strlcpy' | grep -v 'if (strlcpy' | grep -Ev '^[^:]*/(configure|configure.ac|CMakeLists.txt|Makefile|Makefile.am|Makefile.in|config.h|ChangeLog(.[0-9]{4})?|NEWS|CommitLog):'

Some of them -- e.g., bin/pax/tar.c, ustar_rd; libexec/ftpd/conf.c,
format_path; usr.bin/ftp/ftp.c, gunique -- appear to use strlcpy under
the misapprehension that returns the length (in non-NUL bytes) of the
output string, not the length of the input string as strlcpy actually
returns.  (It could be that there is context from which one can prove
truncation is not possible in these cases, but it's not obvious in a
cursory audit.)

Of course, if strwhatevercpy is _guaranteed_ to return the length of
the output string, we can't use a sentinel value -- whether -1 or
SIZE_MAX or size or size+1 like kre suggested -- to indicate
truncation; an idiom like

	char dst[128];
	size_t n;

	n = strwhatevercpy(dst, src1, sizeof(dst));
	strwhatevercpy(dst + n, src2, sizeof(dst) - n);

would fail, and it would fail only in the possibly rare event of
truncation and potentially only with a one- or two-byte buffer
overrun.  On the other hand, the strlcpy man page already recommends
the slightly different idiom

	n = strlcpy(dst, src1, sizeof(dst));
	if (n >= sizeof(dst))
		goto fail;
	strlcpy(dst + n, src2, sizeof(dst) - n);

The same idiom would still work with strwhatevercpy as a drop-in
replacement for strlcpy if we used SIZE_MAX, size, or size+1 as
truncation indicator.

- Returning size (or size+1) might be safer than strlcpy.  But it has
  the (to my mind) counterintuitive property that the result is
  sometimes the length _without_ the NUL terminator and the result is
  sometimes the size _with_ the NUL terminator (or with the NUL
  terminator and one more, for size+1).

- Using SIZE_MAX avoids the casts or ssize_t needed by -1, and means
  if you try to use the length in the truncation case, you will almost
  certainly cause a fault that will abort the application or panic the
  kernel -- in contrast to a one- or two-byte buffer overrun of size
  or size+1 which is much less likely to be noticed.

All that said, I think we should go through an exercise of converting
various different uses of strlcpy to any proposed semantics to see if
it makes sense for them and/or fixes bugs and/or improves legibility.


Home | Main Index | Thread Index | Old Index