Subject: Re: inn 2.1 dies
To: Jukka Marin <jmarin@pyy.jmp.fi>
From: Greg A. Woods <woods@most.weird.com>
List: port-sparc
Date: 01/23/1999 13:51:49
[ On Sat, January 23, 1999 at 12:16:50 (+0200), Jukka Marin wrote: ]
> Subject: Re: inn 2.1 dies
>
> innd is not setuid.. so I guess this patch wouldn't help. :-/

You never know until you try it!  (There can be many reasons for a core
dump to fail, as the patch indicates.)

> A weird problem.. sounds like some fatal error because innd isn't able
> to log any error messages.. but the only fatal things I can think of are
> running out of memory OR being killed because of using too much CPU time
> (and I haven't limited CPU usage, so..).

The system is supposed to log a notification for every process it kills
with killproc(), that is, whenever the kernel itself wants to send
SIGKILL to a process.  Unfortunately kills for exceeding resource limits
are not logged, nor are NFS kills due to modified text segments.  The
patches appended below fix those omissions.  In -current, with UVM,
running out of swap space will cause a SIGKILL, but it is preceded by a
kernel printf() which should eventually be logged, assuming syslogd
keeps running, and on a sparc it should even be in the message buffer
after a reboot.

(Which reminds me -- there are a lot of calls to psignal(p, 16) sprawled
throughout the kernel -- they should probably use the symbolic name
instead, i.e. psignal(p, SIGURG).)

Of course inn may just be silently calling exit() when you're not
looking.  You might want to peek through its code to see if this is
possible.

> Feels stupid to write a script to monitor whether innd is alive or not.
> (Besides, running a script that forks commands on SS1+ can eat quite a
> lot of valuable CPU time, sigh..)

Indeed (on both counts!).
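
If a watchdog does turn out to be necessary, it need not fork anything.
A minimal sketch, assuming you already know innd's pid (for instance
from its pid file, wherever your installation keeps it); is_alive() is
a hypothetical helper that probes with kill(2) and signal 0, which does
the existence and permission checks without delivering a signal:

```c
#include <sys/types.h>
#include <errno.h>
#include <signal.h>

/*
 * Hypothetical helper: return 1 if a process with this pid exists,
 * 0 if it does not, and -1 for a bad pid or any other error.
 */
int
is_alive(pid_t pid)
{
	if (pid <= 0)
		return -1;	/* 0 and negative pids name process groups */
	if (kill(pid, 0) == 0)
		return 1;
	if (errno == EPERM)
		return 1;	/* it exists, but we may not signal it */
	if (errno == ESRCH)
		return 0;	/* no such process */
	return -1;
}
```

Note that a zombie still counts as existing for this check, so the
parent has to reap the process before is_alive() will report it gone.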

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>

Index: kern/kern_synch.c
===================================================================
RCS file: /cvs/NetBSD/src/sys/kern/kern_synch.c,v
retrieving revision 1.1.1.4
diff -u -r1.1.1.4 kern_synch.c
--- kern/kern_synch.c	1998/11/16 21:30:05	1.1.1.4
+++ kern/kern_synch.c	1999/01/23 18:32:49
@@ -615,7 +615,7 @@
 	rlim = &p->p_rlimit[RLIMIT_CPU];
 	if (s >= rlim->rlim_cur) {
 		if (s >= rlim->rlim_max)
-			psignal(p, SIGKILL);
+			killproc(p, "exceeded RLIMIT_CPU");
 		else {
 			psignal(p, SIGXCPU);
 			if (rlim->rlim_cur < rlim->rlim_max)
Index: nfs/nfs_bio.c
===================================================================
RCS file: /cvs/NetBSD/src/sys/nfs/nfs_bio.c,v
retrieving revision 1.1.1.3
diff -u -r1.1.1.3 nfs_bio.c
--- nfs/nfs_bio.c	1998/11/16 21:34:50	1.1.1.3
+++ nfs/nfs_bio.c	1999/01/23 18:34:02
@@ -1044,8 +1044,7 @@
 			  np->n_lrev != np->n_brev) ||
 			 (!(nmp->nm_flag & NFSMNT_NQNFS) &&
 			  np->n_mtime != np->n_vattr->va_mtime.tv_sec))) {
-			uprintf("Process killed due to text file modification\n");
-			psignal(p, SIGKILL);
+			killproc(p, "process text file was modified");
 			p->p_holdcnt++;
 		}
 		break;