Subject: server locking up
To: None <current-users@netbsd.org>
From: Mark Davies <mark@mcs.vuw.ac.nz>
List: current-users
Date: 07/06/2006 23:27:09
One of our file servers (running 3.99.11) has started locking up every couple
of days. It gets into a state where any process runs fine until it tries
to access the disk, at which point it stops responding. Updating to a -current
kernel from a couple of weeks ago makes no difference.
I've had zero luck tracking down what's causing this.
However, I set up an external-mode watchdog to panic the machine if a loop of
    sleep 20; ls -l /a/local/directory > /dev/null ; wdogctl -t
failed to tickle the watchdog for a minute, so I now have a core dump from
such a panic. I'd like some suggestions on what to look for, and how to poke
at this core dump, to find out what's happening.
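
The watchdog loop described above can be sketched as a small shell script.
This is only a minimal sketch under stated assumptions, not the exact script
in use: the names PROBE_DIR, INTERVAL, TICKLE, MAX_ROUNDS and watchdog_loop
are illustrative, and the watchdog timer is assumed to have been armed in
external mode with a 60 second period beforehand.

```sh
#!/bin/sh
# Sketch of the disk-probe watchdog loop. All variable and function names
# here are illustrative assumptions, not taken from the original setup.
PROBE_DIR=${PROBE_DIR:-/a/local/directory}  # directory on the suspect disk
INTERVAL=${INTERVAL:-20}                    # seconds between probes
TICKLE=${TICKLE:-"wdogctl -t"}              # command that tickles the timer
MAX_ROUNDS=${MAX_ROUNDS:-0}                 # 0 = loop forever (normal use)

watchdog_loop() {
    rounds=0
    while :; do
        sleep "$INTERVAL"
        # If disk access has wedged, ls blocks here, the tickle below is
        # never reached, and the external-mode watchdog eventually fires.
        ls -l "$PROBE_DIR" > /dev/null
        $TICKLE
        rounds=$((rounds + 1))
        # Optional bound on the number of rounds, useful for dry runs.
        if [ "$MAX_ROUNDS" -gt 0 ] && [ "$rounds" -ge "$MAX_ROUNDS" ]; then
            break
        fi
    done
}
```

Calling watchdog_loop with the defaults runs the probe every 20 seconds
indefinitely. Setting TICKLE to a harmless command and MAX_ROUNDS to a small
number gives a dry run that does not touch the real watchdog.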
cheers
mark