NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

kern/56121: struct kinfo_lwp is inconsistent in NUL-termination of its char arrays, ps assumes all are NUL-terminated!



>Number:         56121
>Category:       kern
>Synopsis:       struct kinfo_lwp is inconsistent in NUL-termination of its char arrays, ps assumes all are NUL-terminated!
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Apr 22 01:20:00 +0000 2021
>Originator:     Greg A. Woods
>Release:        NetBSD 9.99.81
>Organization:
Planix, Inc.; Kelowna, BC; Canada
>Environment:
System: NetBSD xentastic 9.99.81 NetBSD 9.99.81 (XEN3_DOM0) #14: Wed Apr 21 12:25:35 PDT 2021 woods@xentastic:/build/woods/xentastic/current-amd64-amd64-obj/build/src/sys/arch/amd64/compile/XEN3_DOM0 amd64
Architecture: x86_64
Machine: amd64
>Description:

	I wondered why the WCHAN column in ps(1) output was truncated,
	and thought I might widen it to the widest (struct
	kinfo_lwp).l_wmesg string and quickly found out this field is
	not currently a string, but just a fixed size character array,
	and is filled with strncpy() from an identically sized array in
	the kernel, which is itself not NUL terminated.

	I then noticed that ps(1) passes a pointer to l_wmesg to its
	internal strprintorsetwidth() which (when also called with
	WIDTHMODE) promptly passes that same pointer to strlen().

	I tried my idea to change ps(1) to print an unlimited-width
	WCHAN column, only to see garbage characters, proving there was
	something quite wrong.  (see the final diff hunk below)

	A the moment there's a guaranteed NUL byte at the end of struct
	kinfo_lwp (i.e. in l_name), courtesy strlcpy(), so nothing
	terrible happens.

	I would suggest that since the l_name field already has a
	padded-out width, and since it is already filled with strlcpy()
	and thus always NUL terminated, and since there are pad fields
	just before l_wmesg, that l_wmesg also be padded (replacing one
	of the pad fields), AND that it also be filled with strlcpy().

>How-To-Repeat:

	by reading code, and testing changes

>Fix:

--- sys/sys/sysctl.h.~1.231.~	2021-03-07 17:24:18.000000000 -0800
+++ sys/sys/sysctl.h	2021-04-21 17:41:09.797635650 -0700
@@ -373,7 +373,7 @@
  */
 #define	KI_NGROUPS	16
 #define	KI_MAXCOMLEN	24	/* extra for 8 byte alignment */
-#define	KI_WMESGLEN	8
+#define	KI_WMESGLEN	12	/* extra for NUL and 8 byte alignment */
 #define	KI_MAXLOGNAME	24	/* extra for 8 byte alignment */
 #define	KI_MAXEMULLEN	16
 #define	KI_LNAMELEN	20	/* extra 4 for alignment */
@@ -566,8 +566,7 @@
 	uint8_t l_usrpri;		/* U_CHAR: User-priority based on l_cpu and p_nice. */
 	int8_t	l_stat;			/* CHAR: S* process status. */
 	int8_t	l_pad1;			/* fill out to 4-byte boundary */
-	int32_t	l_pad2;			/* .. and then to an 8-byte boundary */
-	char	l_wmesg[KI_WMESGLEN];	/* wchan message */
+	char	l_wmesg[KI_WMESGLEN];	/* wchan message (with NUL) */
 	uint64_t l_wchan;		/* PTR: sleep address. */
 	uint64_t l_cpuid;		/* LONG: CPU id */
 	uint32_t l_rtime_sec;		/* STRUCT TIMEVAL: Real time. */
@@ -575,7 +574,7 @@
 	uint32_t l_cpticks;		/* INT: ticks during l_swtime */
 	uint32_t l_pctcpu;		/* FIXPT_T: cpu usage for ps */
 	uint32_t l_pid;			/* PID_T: process identifier */
-	char	l_name[KI_LNAMELEN];	/* CHAR[]: name, may be empty */
+	char	l_name[KI_LNAMELEN];	/* CHAR[]: name, may be empty (with NUL) */
 };
 
 /*
--- sys/kern/init_sysctl.c.~1.227.~	2021-03-07 17:23:04.000000000 -0800
+++ sys/kern/init_sysctl.c	2021-04-21 18:04:42.321587746 -0700
@@ -1576,7 +1576,7 @@
 	kl->l_priority = lwp_eprio(l);
 	kl->l_usrpri = l->l_priority;
 	if (l->l_wchan)
-		strncpy(kl->l_wmesg, l->l_wmesg, sizeof(kl->l_wmesg));
+		strlcpy(kl->l_wmesg, l->l_wmesg, sizeof(kl->l_wmesg));
 	COND_SET_VALUE(kl->l_wchan, PTRTOUINT64(l->l_wchan), allowaddr);
 	kl->l_cpuid = cpu_index(l->l_cpu);
 	bintime2timeval(&l->l_rtime, &tv);


This change isn't necessary (note it does not and should not be changed
to use strlcpy()!), but I think it is a much safer coding practice and
more consistent with related code:


--- kern_proc.c.~1.262.~	2021-03-07 17:23:06.000000000 -0800
+++ kern_proc.c	2021-04-21 18:01:35.169135140 -0700
@@ -2692,7 +2692,7 @@
 		KASSERT(l != NULL);
 		lwp_lock(l);
 		if (l->l_wchan)
-			strncpy(ep->e_wmesg, l->l_wmesg, WMESGLEN);
+			strncpy(ep->e_wmesg, l->l_wmesg, sizeof(ep->e_wmesg));
 		lwp_unlock(l);
 	}
 	ep->e_ppid = p->p_ppid;





This final diff hunk is what I was trying to achieve, and with the
kernel fixes it works, but without them it prints garbage in the wchan
field:  (note KI_LNAMELEN includes the NUL so maybe lname() should be
similarly fixed)


--- bin/ps/print.c.~1.132.~	2020-05-30 15:08:18.000000000 -0700
+++ bin/ps/print.c	2021-04-21 17:58:09.223885510 -0700
@@ -936,7 +936,8 @@
 	v = ve->var;
 	if (l->l_wmesg[0]) {
 		strprintorsetwidth(v, l->l_wmesg, mode);
-		v->width = min(v->width, KI_WMESGLEN);
+		if (termwidth != UNLIMITED)
+			v->width = min(v->width, 8); /* xxx or KI_WMESGLEN-1? */
 	} else {
 		if (mode == PRINTMODE)
 			(void)printf("%-*s", v->width, "-");



Home | Main Index | Thread Index | Old Index