Subject: Re: CVS commit: src/sys
To: M. Warner Losh <imp@bsdimp.com>
From: Matthew Orgass <darkstar@city-net.com>
List: tech-kern
Date: 04/26/2004 02:52:20
On 2004-04-25 imp@bsdimp.com wrote:
> In message: <20040425044912.94DFEA655@zen.crufty.net>
>             sjg@crufty.net (Simon J. Gerraty) writes:
> : cp += snprintf(...)
>
> Maybe people shouldn't do that at all.  No need to have a cp +=
> safe_sprintf() because the code is still wrong in the buffer overflow
> case: if it returns 0, future uses of cp will be wrong.
>
> Better to say
>
>        snprintf()
>        if (cp + strlen(cp) > ep)
> 		/* handle badness */
>
> So what you propose would fix the 'negativge offset' issue, but not
> deeper issues.

  IMO, strlen should be entirely banished from the kernel.  There is no
reason why the kernel should ever need to sit and count bytes just to
figure out how long a string is.  Lots of things can go much faster if you
don't need to look for a single null character, but strlen is the worst
since you are not even doing anything with the data.

  But your point is certainly correct: in overflow case in a cp += ...
sequence the last character will be overwritten serveral times (since -1
is returned for size of 0).  Even if that doesn't cause serious problems
in most message output, it should be avoided.  There is another bug in the
proposed safe_snprintf (and my snprintf_seq): the terminating null
character would be included in the amount returned, so you would need to
either decrement the pointer between calls or have the function return the
size not counting the terminating null, which would make the obvious full
condition test not quite right.  With a string buffer structure, it would
be obvious what is going on.

  I like the fprintf idea as a general kernel addition that would make it
possible to print to generic targets to eliminate the need for temporary
buffers, but if it is added it should IMO be expected to be used somewhat
frequently, not just to fix a few improper snprintf calls, especially in
cases where the code is shared.

  It is... interesting to look at the examples individually.  The ahc and
ahd cases are the same debugging code.  The snprintf is completely
unnecessary, since it is just printed out immediately anyway.  The
snprintf is not in the FreeBSD version, so I would guess this was part of
private modifications to have it do something other than printf that never
got committed.  It origionally used sprintf(line+strlen(line), ...).

  The usb and pci code do the same thing, which is to copy the device
names to a string so the driver can print it out.  It looks like all uses
are printf or aprintf_*, so if the interface specified a pointer to a
printf-style function, snprintf and the extra stack usage would be
unnecessary.  Actually, all the aprintfs are aprintf_normal, so they could
also just aprintf_normal the code (which would look funny in a QUIET boot
if the driver otherwise used printf, but it should not be difficult to
modify the drivers to use aprintf if anyone cares).  The current code
should never overflow the buffer, although putting the buffer on the stack
could easily help overflow the kernel stack, particularly in case of
drivers that call config_found and might go configure other stuff before
returning (unless the compiler can reduce the stack after the printf call,
although I don't think it can since it was passed as an argument to the
devinfo function).

  This leaves the two cases in the mpt code.  They are passed to
mpt_print, which... calls printf, first printing the device name.  The
current code will never overflow, since snprintf is called only twice and
the first time is a short fixed length.  It's buffer is actually a static
variable inside the function.  This snprintf usage would be reasonable if
commented, although it would also not be too difficult to avoid the use of
the buffers, and I don't like the idea of using bss for temporary buffers
(though I will admit that the debug text and rodata use much more space.
mpt_debug is not conditionally compiled.  However, code tends to be
copied...).

  After writing the above I looked at the kernel kprintf code, and the
kernel snprintf actually does what snprintf(9) says it does, which is
return the number of bytes written.  Except that the null character is not
included unless the buffer overflows and -1 is returned if the passed size
is 0.  So the new code is mostly safe in the kernel, at least concerning
snprintf for nonvital message output, and will at worst truncate and
overwrite the last character a few times (due to the -1 return).  It is
still possible/likely that putting large buffers on the stack will
contribute to stack overflow, so I think they should be modified to call
printf or aprintf_normal directly (excpet possibly mpt, which can be left
alone).  A bitmask_printf would also be nice to avoid kernel stack usage
:).  Below are diffs for printf.3 and kprintf.9 (however not including the
"panic if buffer too small" idea, which does make sense in the kernel
IMO).

Matthew Orgass
darkstar@city-net.com

Index: lib/libc/stdio/printf.3
===================================================================
RCS file: /cvsroot/src/lib/libc/stdio/printf.3,v
retrieving revision 1.38
diff -u -r1.38 printf.3
--- lib/libc/stdio/printf.3	8 Sep 2003 17:54:32 -0000	1.38
+++ lib/libc/stdio/printf.3	26 Apr 2004 05:34:26 -0000
@@ -111,8 +111,12 @@
 are converted for output.
 .Pp
 These functions return
-the number of characters printed
-(not including the trailing
+the number of characters printed or, for
+.Fn snprintf
+and
+.Fn vsnprintf ,
+the number of characters that would have been written to a sufficiently sized
+buffer (in all cases not including the trailing
 .Ql \e0
 used to end output to strings).
 If an output error was encountered, these functions shall return a
@@ -790,6 +794,20 @@
 .Fn asprintf
 interfaces are not yet portable.
 .Pp
+Because the return value of
+.Fn snprintf
+may be larger than the amount of data actually written, it is not safe to write
+code of the form:
+.Bd -literal -offset indent
+char buffer[MAXLEN], *cp = buf, *ep = &buf[MAXLEN];
+cp = snprintf(buffer, MAXLEN, "%s", string1);
+cp += snprintf(buffer, ep - cp, "%s", string2);
+
+.Ed
+If the buffer is not large enough to hold the first string, ep - cp will be
+negative, which will be interpreted as a large integer length and will cause the
+buffer to overflow.
+.Pp
 It is important never to pass a string with user-supplied data as a
 format without using
 .Ql %s .

Index: share/man/man9/kprintf.9
===================================================================
RCS file: /cvsroot/src/share/man/man9/kprintf.9,v
retrieving revision 1.17
diff -u -r1.17 kprintf.9
--- share/man/man9/kprintf.9	16 Apr 2003 13:35:29 -0000	1.17
+++ share/man/man9/kprintf.9	26 Apr 2004 06:39:09 -0000
@@ -176,6 +176,11 @@
 .Fn vsnprintf
 functions return the number of characters placed in the buffer
 .Fa buf .
+If the buffer is too small, the return value is equal to
+.Fa size .
+If
+.Fa size
+is 0 then -1 is returned.
 .Sh SEE ALSO
 .Xr printf 1 ,
 .Xr printf 3 ,
@@ -215,6 +220,15 @@
 .Fn aprint_debug
 functions first appeared in
 .Bsx .
+.Sh CAVEATS
+Large buffers for
+.Fn snprintf
+should not be placed on the stack.
+.Pp
+Note that the return value of
+.Fn snprintf
+is different than the libc equivalent, but should still only be used for testing
+if the buffer was large enough.
 .Sh BUGS
 The
 .Fn uprintf