Subject: 5000/200 filesystem hang
To: None <port-pmax@NetBSD.ORG>
From: Manuel Bouyer <bouyer@lix.polytechnique.fr>
List: port-pmax
Date: 06/07/1996 13:51:09
Hello,
I have a problem with the -current kernel (with the new asc driver):
I/O operations hang. Other processes still run fine (under X, I can
still move the mouse and change the focus from one window to another,
oclock and xload still get updated, etc.), but all processes doing I/O
are hung. The NFS clients of this station (a Sparc running NetBSD and a
Sparc running SunOS 4.1.4) get 'nfs server not responding', but the network
is still running (open telnet connections stay alive). There are no messages
on the console.
This seems to be related to intense disk activity: a 'make -j4' kernel
compile, or a compile plus tar backup via NFS.

Has anyone else noticed this? Here is the dmesg output:

stty: TIOCGETD: Operation not supported
Copyright (c) 1982, 1986, 1989, 1991, 1993
	The Regents of the University of California.  All rights reserved.

NetBSD 1.2_ALPHA (ANTIOCHE) #1: Wed Jun  5 16:19:38 MET DST 1996
    bouyer@antioche.polytechnique.fr:/usr/src/src_current/sys/arch/pmax/compile/ANTIOCHE
real mem = 50331648
avail mem = 42803200
using 1228 buffers containing 5029888 bytes of memory
mainbus0 (root)
MIPS R3000 CPU Rev. 2.0 with MIPS R3010 FPC Rev. 2.0
tcmatch: tc cmp tc
tc0 at mainbus0: 25 MHz clock
tcattach: config_found for KN02    , addr 0xbfc00000
asicmatch: KN02     slot 7 offset 0x0 pri -1
(configuring KN02 system slot as asic)
asic0 at tc0 slot 7 offset 0x0asicattach: asic0

asicattach: entry 0, base addr bfc00000
 adding dc subslot 0 offset 200000 addr bfe00000
dc0 at asic0 offset 0x200000 priority 7
asicattach: entry 1, base addr bfc00000
 adding mc146818 subslot 0 offset 280000 addr bfe80000
clock0 at asic0 offset 0x280000 priority 0
asicattach: entry 2, base addr bfc00000
asicattach: done
tcattach: config_found for PMAD-AA , addr 0xbf800000
le0 at tc0 slot 6 offset 0x0: address 08:00:2b:16:47:c2
le0: 32 receive buffers, 8 transmit buffers
tc_intr_establish: slot 6 level 2 handler 0x80034aac sc 0xc0430e00 on
tcattach: config_found for PMAZ-AA , addr 0xbf400000
asc0 at tc0 slot 5 offset 0x0tc_intr_establish: slot 5 level 1 handler 0x800ed710 sc 0xc0432c00 on
: target 7
le1 at tc0 slot 2 offset 0x0: address 08:00:2b:1b:c0:bc
le1: 32 receive buffers, 8 transmit buffers
tc_intr_establish: slot 2 level 2 handler 0x80034aac sc 0xc0430c00 on
cfb0 at tc0 slot 0 offset 0x0 (1024x864x8) (console)
looking for non-PROM console driver
calling initcpu()
autconfiguration done, spl back to 0x0
call spl0
spl0 done
Beginning old-style SCSI device autoconfiguration
rz2 at asc0 drive 2 slave 0 DEC RZ55     (C) DEC rev 1000, 649040 512 byte blocks
rz3 at asc0 drive 3 slave 0 MAXTOR P0-12S rev HB18, 1999038 512 byte blocks
rconsattach: 1 raster consoles
init: copying out path `/sbin/init' 11

Also, messages like the following are always followed by an abrupt reboot
(without a panic):
asc_intr: data overrun: buflen 2048 dmalen 2048 tc 1994 fifo 4
asc: asc_intr: cmd 28 bn 637084 cnt 4
asc0 tgt 2 status 97 ss cc ir 8 cond 7:708 msg 0 resid 0
asc0 tgt 2 status 90 ss cc ir 20 cond 8:20 msg 12 resid 0
asc0 tgt -1 status 97 ss 9c ir c cond 16:700 msg 80 resid 0
asc0 tgt 3 status 91 ss c4 ir 10 cond 17:110 msg 12 resid 2048
asc0 tgt 3 status 93 ss cc ir 10 cond 1:310 msg 90 resid 0
asc0 tgt 3 status 97 ss cc ir 8 cond 7:708 msg 0 resid 0
asc0 tgt 3 status 90 ss cc ir 20 cond 8:20 msg 12 resid 0
asc0 tgt 3 status 0 ss 0 ir 28 cond 0:118 msg c0 resid 2048
asc0 tgt 3 status 91 ss c4 ir 18 cond 0:118 msg c2 resid 2048
asc0 tgt 3 status 93 ss cc ir 10 cond 1:310 msg 90 resid 0
asc0 tgt 3 status 97 ss cc ir 8 cond 7:708 msg 0 resid 0
asc0 tgt 3 status 90 ss cc ir 20 cond 8:20 msg 12 resid 0
asc0 tgt 3 status 0 ss 0 ir 28 cond 0:118 msg c0 resid 2048
asc0 tgt 3 status 97 ss 8c ir 18 cond 0:118 msg c2 resid 0
asc0 tgt 3 status 97 ss 8c ir 8 cond 9:708 msg 2 resid 0
asc0 tgt 3 status 97 ss 8c ir 10 cond 10:710 msg 12 resid 0
asc0 tgt 3 status 97 ss 8c ir 8 cond 9:708 msg 4 resid 0
asc0 tgt 3 status 90 ss cc ir 20 cond 15:20 msg 12 resid 0
asc0 tgt -1 status 97 ss 9c ir c cond 16:700 msg 80 resid 0
asc0 tgt 3 status 91 ss 44 ir 10 cond 17:110 msg 12 resid 2048
asc0 tgt 3 status 93 ss cc ir 10 cond 1:310 msg 90 resid 0
asc0 tgt 3 status 97 ss cc ir 8 cond 7:708 msg 0 resid 0
asc0 tgt 3 status 90 ss cc ir 20 cond 8:20 msg 12 resid 0
asc0 tgt 3 status 0 ss 0 ir 28 cond 0:118 msg c0 resid 2048
asc0 tgt 3 status 97 ss 8c ir 18 cond 0:118 msg c2 resid 0
asc0 tgt 3 status 97 ss 8c ir 8 cond 9:708 msg 2 resid 0
asc0 tgt 3 status 97 ss 8c ir 10 cond 10:710 msg 12 resid 0
asc0 tgt 3 status 97 ss 8c ir 8 cond 9:708 msg 4 resid 0
asc0 tgt 3 status 90 ss cc ir 20 cond 15:20 msg 12 resid 0
asc0 tgt -1 status 97 ss 9c ir c cond 16:700 msg 80 resid 0
asc0 tgt 3 status 91 ss 44 ir 10 cond 17:110 msg 12 resid 2048
asc0 tgt 3 status 91 ss 4c ir 10 cond 1:310 msg 90 resid 0

Any ideas?

--
Manuel Bouyer, LIX, Ecole Polytechnique
email: bouyer@lix.polytechnique.fr
--