Subject: Re: server locking up
To: Mark Davies <mark@mcs.vuw.ac.nz>
From: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
List: current-users
Date: 07/06/2006 15:21:29
On Thu, Jul 06, 2006 at 11:27:09PM +1200, Mark Davies wrote:
> One of our file servers (running 3.99.11) has started locking up every couple 
> of days.  It gets into a state where any process will run fine until it tries 
> to access the disk at which point it stops responding.  Updating the kernel 
> to a current from a couple of weeks ago makes no difference.
> I've had zero luck in tracking down what's causing this.
> However I set up an external-mode watchdog to panic the machine if a loop of
> 	sleep 20; ls -l /a/local/directory > /dev/null ; wdogctl -t
> failed to tickle the watchdog for a minute, so I now have a core dump from 
> such a panic.  I'd like some suggestions on what to look for/how to poke at 
> this core dump to try to find what's happening.

The output of `ps -axl -N /netbsd -M core' is a good starting point:
it lists every process in the dump together with its wait channel.
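
If the listing is long, something like the following might help narrow
it down.  This is only a sketch: blocked_procs is a made-up helper name,
it assumes the long listing has a header line with a STAT column and no
empty fields before it, and the file names in the commented-out usage
line are the ones savecore(8) usually writes, so substitute your own
kernel and core.

```sh
#!/bin/sh
# Hypothetical helper: read `ps -l' style output on stdin and keep the
# header plus every process whose STAT field starts with "D"
# (uninterruptible wait, usually disk I/O).
# The STAT column is located by name from the header line.
blocked_procs() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++)
                if ($i == "STAT") col = i
            print
            next
        }
        col && $col ~ /^D/
    '
}

# Assumed file names; adjust to the kernel and core you actually have.
# ps -axl -N /var/crash/netbsd.0 -M /var/crash/netbsd.0.core | blocked_procs
```

The WCHAN column of whatever is left should show what the stuck
processes are sleeping on.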

-- 
Juergen Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)