Subject: Re: daily crashes with 1.6.1
To: None <current-users@NetBSD.ORG>
From: Tim Middleton <x@Vex.Net>
List: current-users
Date: 07/06/2003 01:00:49
On July 4, 2003 05:57 pm, Greg A. Woods wrote:
> I'd start swapping systems (whole systems if possible -- just keep the
> disks; memory, motherboards, cables, and controllers otherwise), and see

This is our current plan of attack. We have another box down there and were 
going to make it into an NFS server... physically move the NFS-mounted drive 
there... and also put the other NFS-mounted partitions on a new drive there. 
We're going to run an older version of NetBSD there, one on which we never had 
problems. Hopefully this will at least show us whether NFS serving is the 
problem.

However, one of these partitions is mail. And I'm horribly afraid of NFS 
locking issues with multiple NFS clients reading and writing... does anyone 
have an opinion on whether my fears are unwarranted?

> It could even be the power supplies.  If those machines are like the

Power supplies. Ugh. I have power problems here at home, and bad/weak power 
supplies do indeed cause a raft of bizarre behavior to show up... it is a 
possibility perhaps. But again, it would be quite a coincidence for the power 
to start causing these problems just when we upgraded. I'll check with Ken as 
to what we have in there.

> Is this a real crash, a silent reboot, or a hard hang?

We weren't entirely sure, to tell the truth. No one had seen it happen, and we 
were remote (our top priority being to get the machine back up). And there have 
been no core dumps. However, as luck (?!) would have it I was down there just 
last night working on it when it hung...

Everything locked up completely. The box would not respond to ping, keyboard 
was dead, etc. There were no kernel messages on the monitor at all... just 
the usual miscellaneous syslog noise... and then... just frozen.

> If it's a hard hang can you get into ddb on the console?

I would have tried last night... but as I described above, everything just 
freezes. Which does begin to smell of hardware. But the coincidence of it 
still seems too much for me.

> Are the "wake-on-LAN" features all completely disabled in the BIOS for

I believe they are disabled completely. When we were having SCSI controller 
problems I went through the BIOS and made sure everything was off... I'm 
pretty sure it's still off.

Interesting theory though. (-: Anyhow, we are full-duplex to the switch. And as 
described, the whole box locks...

-- 
Tim Middleton | Cain Gang Ltd | Christianity didn't [...] for 20 centuries
x@veX.net     | www.Vex.Net   | [...] shit Hallmark before a live studio 
audience.