Subject: Re: "daily" reboots
To: None <phil@steelhead.cs.wwu.edu>
From: Jarle Greipsland <jarle@idt.unit.no>
List: current-users
Date: 01/14/1995 16:46:16
Phil Nelson <phil@steelhead.cs.wwu.edu> writes:
> I was wondering if anyone else is having the following problem:
>   NetBSD-1.0 + patches 0-6
>   i386/66, 8Megs mem, bt scsi
>   Reboot in the daily cron job every night.  The log in /var/log/daily.out
> shows everything up to fsck and nothing after that.   (Last line in
> the log is "checking file systems:" and no output from the fsck.)
>   Also, if I run the /etc/daily script by hand, it finishes just fine.
>   Any clues?

I've seen something similar.  It turned out that if I commented out the
'calendar' command from /etc/daily everything worked ok.  

The longer story is like this: After seeing these nightly reboots I
configured a kernel with DDB.  It invariably failed in getfsstat(),
referencing an invalid memory location.  Why?  I've set up YP on my
machine, and my /etc/passwd's '+' makes 'calendar' suck in approx 1000
nonlocal accounts.  And in order to check for a 'calendar' file in a user's
home directory, the system has to automount the user's home directory.
Now, these are distributed on a slew of hosts and disks, and amd has to
work pretty hard.  Amd also unmounts the disks if they aren't used for a
while.  My hypothesis is therefore that the getfsstat() system call doesn't
properly lock the mountlist when it does its stuff, and somehow amd manages
to unmount (or perform some other unwanted action on) a disk that
getfsstat() has already started processing (or got a next-pointer to, or
something like that).  I posted a message to current-users some time ago
about this problem, but since there seems to be "a whole lot of shakin'
going on" in the mount stuff right now I haven't pursued it, and besides I
didn't really mean to run calendar at my host anyway :-)
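
To make the hypothesis above concrete, here is a small user-space analogy
of the suspected race. It is only a sketch: it is not NetBSD kernel code,
and the names Mount, MountList, walk_unlocked and demo are made up for
illustration. The walker saves a next-pointer and then lets another actor
run, the way an unlocked list traversal might; the "unmount" in the middle
plays the role of amd.

```python
# User-space analogy only. Mount, MountList, walk_unlocked and demo are
# hypothetical names; none of this is taken from the NetBSD sources.

class Mount:
    def __init__(self, name):
        self.name = name
        self.next = None
        self.freed = False  # stands in for memory that has been released


class MountList:
    def __init__(self, names):
        # Build a singly linked list in the order given.
        self.head = None
        for name in reversed(names):
            node = Mount(name)
            node.next = self.head
            self.head = node

    def unmount(self, name):
        """Unlink the named entry and mark it as freed."""
        prev, node = None, self.head
        while node is not None and node.name != name:
            prev, node = node, node.next
        if node is None:
            return
        if prev is None:
            self.head = node.next
        else:
            prev.next = node.next
        node.next = None
        node.freed = True


def walk_unlocked(mounts):
    """Walk the list without a lock: save the next-pointer, then yield."""
    node = mounts.head
    while node is not None:
        if node.freed:
            raise RuntimeError("touched freed entry: " + node.name)
        nxt = node.next      # next-pointer fetched before giving up control
        yield node.name      # another actor may modify the list here
        node = nxt           # may now point at an entry that was unmounted


def demo():
    mounts = MountList(["/", "/usr", "/home/a", "/home/b"])
    seen = []
    try:
        for name in walk_unlocked(mounts):
            seen.append(name)
            if name == "/usr":
                mounts.unmount("/home/a")  # plays the role of amd
    except RuntimeError as err:
        return seen, str(err)
    return seen, None
```

Calling demo() shows the walker stepping onto the entry that was unmounted
behind its back. Holding a lock across the whole walk (or re-reading the
next-pointer under the lock) would avoid that in this toy model; whether
the real getfsstat() has this exact problem is still only my guess.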

You'll have to judge for yourself if this applies to your situation, but
commenting out calendar may be an OK test, at least for one night.  On the
other hand, the successful running of /etc/daily by hand may suggest that
you're really experiencing a different problem.

						-jarle
----
"Teaching them [the kids] how to use PC word processing programs has made me
 appreciate 'troff' more than ever."
				-- A. Tanenbaum, Distributed Operating Systems