NetBSD-Users archive

Re: system goes unresponsive



I think I have confirmed my suspicion that the problem is related to
daily maintenance, which is scheduled at the default time of 0315.

Having carefully fsck'd all my file systems (300 GByte worth) and
cleared /tmp manually, I ran "sh -v /etc/daily" and more or less
immediately got the same result, at:
        rm -f $TMP $TMP2
fi

I've temporarily commented daily maintenance out of crontab to see
what happens tonight.
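
A note on the "sh -v" run: -v echoes script lines as the shell reads
them, and a compound command such as if ... fi is read in full before
any of it runs, so the last line shown marks the end of the command
being run, not necessarily the exact command that stalled.   Tracing
with -x prints each command just before it is executed.   A minimal
sketch follows; the function name trace_script and the default log
path /var/tmp/daily.trace are assumptions for illustration, not
anything from the system itself.

```shell
#!/bin/sh
# trace_script SCRIPT [LOGFILE]
# Hypothetical helper: run SCRIPT under "sh -x" so that every command is
# printed (prefixed by "+") immediately before it is executed.
# The default LOGFILE location is an assumption; pick any writable path.
trace_script() {
    script=$1
    log=${2:-/var/tmp/daily.trace}
    # Merge the trace (stderr) with normal output, show it on the
    # terminal and keep a copy in LOGFILE.
    sh -x "$script" 2>&1 | tee "$log"
}

# Example (run as root, on the console):
#   trace_script /etc/daily
```

If the machine stalls, the last "+" line left on the console is the
command that was started most recently.   The copy in the log file may
not reach the disk if the file system itself is what stops responding.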

I regularly get soft disk errors and DMA downgrades, but that is
nothing new: I always have, running a pair of 152GByte Maxtor 6L160P0
drives.

This system owes me nothing, but Christmas is coming and replacing my
core file store, mail server and name server at short notice is not
something I can easily do.   So any further suggestions as to how I
might make do meanwhile would be most welcome.

--
Steve Blinkhorn <steve%prd.co.uk@localhost>

You wrote:
> 
> On Mon, Dec 17, 2012 at 4:59 AM, Steve Blinkhorn <steve%prd.co.uk@localhost> 
> wrote:
> > I have an i386 machine running NetBSD 4.0.1 that has run consistently
> > for several years.   There have been no recent changes to any part of
> > the configuration: it provides the backbone of my local network, name
> > service, file service etc. etc.
> >
> > A few days ago it started crashing, or so I thought, overnight.   It
> > was almost unresponsive, but not quite.   It would issue a login
> > prompt on a virtual terminal on its console, and echo the login but
> > not issue the password prompt.
> >
> > Since we had been the subject of a substantial DoS attack from China,
> > I assumed that was the problem, added a couple of extra rules on my
> > router firewall, but next morning same problem.
> >
> > So I thought, maybe there's some form of attack that's causing the
> > system to run out of processes.   So last night I left top(1) running on
> > a virtual terminal.   This morning there was the same problem, but top
> > was still updating regularly and showing the system as essentially
> > 100% idle, with ample free memory and swap space, and only 75
> > processes (which is about baseline for this machine).
> >
> > I can't find any panic message in the logs, but from the absence of
> > the normal rhythm of log entries, it seems that the problem occurs
> > sometime around 0315, which strikes me as significant in terms of
> > daily housekeeping.
> >
> > Help in diagnosing this problem would be much appreciated.   I think
> > it's something very basic that I just haven't run into as a problem
> > before.
> 
> When it comes to older i386 hardware (I'm assuming older since you're
> using 4.x), I'm pessimistic.
> 
> 0315 is probably related to daily maintenance.
> 
> Machine hangs in NetBSD/i386 are overwhelmingly related to bad
> hardware in my experience.
> 
> Once it's getting stressed by daily maintenance, it's going bad.
> 
> Check the smart status on the hard drives (assuming you're using IDE drives):
> 
> atactl wdX smart status
> 
> Look for bad errors.
> 
> Check the dmesg. Do a memory test. It's probably related to memory or
> a hard drive. Or maybe a marginal power supply.
> 
> Andy
> 
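
On the point quoted above about locating the hang from a break in the
normal rhythm of log entries: the gap can be found mechanically rather
than by eye.   The sketch below assumes the traditional syslog line
format ("Mon DD HH:MM:SS host ...") and that all lines fall within one
month; the function name largest_log_gap is made up for illustration.

```shell
#!/bin/sh
# largest_log_gap: read syslog-style lines on stdin and report the
# longest silence between two consecutive entries.
# Assumptions: fields 2 and 3 are day-of-month and HH:MM:SS, and the
# input does not cross a month boundary.
largest_log_gap() {
    awk '
    {
        split($3, t, ":")
        # Seconds since the start of the month for this entry
        now = $2 * 86400 + t[1] * 3600 + t[2] * 60 + t[3]
        stamp = $1 " " $2 " " $3
        if (NR > 1 && now - prev > gap) {
            gap = now - prev
            from = prevstamp
            to = stamp
        }
        prev = now
        prevstamp = stamp
    }
    END {
        if (gap > 0)
            printf "%d seconds between %s and %s\n", gap, from, to
    }'
}

# Example:
#   largest_log_gap < /var/log/messages
```

The first timestamp reported is the last entry written before the
system went quiet, which narrows down what was running at the time.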


