Subject: Re: cron stops?
To: Bruce Anderson <brucea@shell.spacestar.net>
From: Gavan Fantom <gavan@coolfactor.org>
List: netbsd-users
Date: 12/01/2001 02:50:19
On 30 Nov 2001, Bruce Anderson wrote:

> >cron stays in this hung state ~forever - it certainly doesn't resolve
> >itself within 24 hours.
>
> This is normal for cron on most Unix systems, when a job
> is called and does not return before being called again.
> eg. it died.
> To prevent cron from hanging either fix the offending
> script or add "&" to the end of all of your crontab
> entries. eg.

This is the first thing I checked - no problem there.

On further investigation it seems that cron itself isn't the issue; it
looks more like a bug in the kernel. Network connections sending lots of
data (e.g. ftpd, smbd on the local network) were also hanging, almost at
random, during the same period in which cron was having problems.
A reboot has cleared this; both cron and the network servers are once
again working happily, although I do wonder what happened in the kernel
to produce this behaviour.

The only possible cause I can think of is that I ran "umount -f /sunsite"
after several days of not being able to unmount that filesystem (the usual
pile-up of processes stuck in the D state, waiting forever for a fileserver
that is never going to come back up), but I'm not sure whether that
happened before or after the problems started. Either way, it would be an
interesting path from a forced unmount to cron stopping at random and
network connections randomly hanging, as if there were a bug in the
kernel's time-handling code.
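
For anyone wanting to check for the stuck processes mentioned above, here
is a minimal sketch in Python. It only filters ps-style text for entries
whose state begins with "D" (uninterruptible wait). The function names and
the "ps -axo pid,stat,comm" invocation are assumptions for illustration and
were not part of the original diagnosis; keyword names can differ between
ps implementations.

```python
import subprocess


def stuck_processes(ps_output):
    """Return (pid, command) pairs whose state field starts with 'D'.

    Expects text with three whitespace-separated columns per line:
    PID, state flags, command. Header or malformed lines are skipped.
    """
    stuck = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # header row or a line we cannot parse
        pid, stat, command = fields
        if stat.startswith("D"):
            stuck.append((int(pid), command.strip()))
    return stuck


def current_stuck_processes():
    """Run ps (assumed invocation) and filter its output."""
    result = subprocess.run(
        ["ps", "-axo", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return stuck_processes(result.stdout)
```

Calling current_stuck_processes() before and after the forced unmount
would show whether the D-state processes actually went away, which might
help narrow down when the hangs began.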

Thoughts?

It's mostly a curiosity now, since I consider it unlikely that the same
problem will recur, but I'm still interested in knowing what happened.

-- 
Gillette - the best a man can forget