Subject: Re: Freeze again.
To: None <jmarin@pyy.jmp.fi>
From: Brad Spencer <brad@anduin.eldar.org>
List: port-i386
Date: 02/03/1998 10:31:09
   On Tue, Feb 03, 1998 at 07:20:37AM +0200, Jukka Marin wrote:
   > Last night, my i386 system froze again.  I was still able to use the
   > xterms to remote machines without problems and the i386 was still routing
   > packets between Ethernet and PPP, but I couldn't run any commands locally
   > (I think nothing needing the local disks worked).

   BTW, Amanda (yeah, it works again ;) had been able to complete a backup
   before the freeze, copying about 2 GB of data to DAT from both the SCSI
   and IDE disks, so both SCSI and IDE _do_ work to some degree.

   The system (including swap) is on the SCSI disk, with some other partitions
   on the IDE disk:

   Filesystem           1024-blocks     Used    Avail Capacity  Mounted on
   /dev/sd0a                  63471    19998    40299    33%    /
   /dev/sd0e                1015790   560426   404574    58%    /usr
   /dev/sd0f                 254063    14478   226881     6%    /var
   /dev/sd0g                 508143   225779   256956    47%    /home
   /dev/sd0h                1729044   733399   909192    45%    /empty
   /dev/wd0a                  61855    13905    44857    24%    /altroot
   /dev/wd0e                 598399        9   568470     0%    /tmp
   /dev/wd0h                1508815  1290745   142629    90%    /store
   kernfs                         1        1        0   100%    /kern

   When I hit the reset button, the machine fsck'd all the partitions and
   only /altroot had errors on it - although I haven't written anything to
   that partition for 2-3 weeks (and the partition was last fsck'd when
   the machine froze on Sunday).  All other partitions were OK (no errors
   found by fsck), although they were not marked as clean (natural after
   a reset).

   I have 512 MB of swap and I usually have at least 450 MB of that free,
   so I really don't think the machine ran out of swap (at least it _never_
   did under 1.2).

   Again, when the machine was "frozen", no disk lights or the SCSI led were
   ON (well, I couldn't see the internal disks, but as the controller leds
   were off, I think they were off too).

   Maybe it's the new vm system?  The disk drivers?  I _never_ had this kind
   of a problem under 1.2 and now I have it almost every day.

     -jm




Hello...

And here I thought it was just the strange hardware setup I had.  I've pretty
much seen this exact thing, except worse.  Every time I tried to move to
1.3, filesystems started to get messed up.  On one particular occasion, a
filesystem was damaged so badly that fsck couldn't fix it, and it had to be
newfs'ed and restored from tape.

The problem seemed to be related to swapping.  The machine in question has
3 swap partitions and 2 SCSI controllers.  There was a filesystem messed
up on every disk that had a swap partition on it.



Brad Spencer - brad@anduin.eldar.org   http://anduin.eldar.org
[finger brad@anduin.eldar.org for PGP public key]