Subject: kern/23554: STABLE system locks
To: None <gnats-bugs@gnats.netbsd.org>
From: None <kefren@netbastards.org>
List: netbsd-bugs
Date: 11/24/2003 10:15:10
>Number:         23554
>Category:       kern
>Synopsis:       system locks up in -STABLE (1.6.2_RC1)
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Nov 24 08:16:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     Mihai Chelaru
>Release:        NetBSD 1.6.2_RC1
>Organization:
	
None.
>Environment:
	
	
System: NetBSD xxx.xxx.xxx 1.6.2_RC1 NetBSD 1.6.2_RC1 (Kefren) #6: Sat Nov 22 15:24:03 EET 2003     root@xxx.xxx.xxx:/usr/src/sys/arch/i386/compile/Kefren i386
Architecture: i386
Machine: i386
>Description:
	About once per day the system locks up. The only thing I can do from the console is enter ddb. This machine acts as a web proxy and does NAT and IPsec (relatively high traffic).
	Unusual settings: kern.mbuf.nmbclusters was raised to 4096. (With 1024 I did not have lockups of this kind, only the usual mbufs exceeded message.)
	Here is the output from ddb:

	Stopped at      cpu_Debugger+0x4:       leave
db> ps
 PID             PPID       PGRP        UID S   FLAGS          COMMAND    WAIT
 4387            4376       4360          0 3  0x4084           chatel   netio
 4376            4373       4360          0 3  0x4084             bash    wait
 4375            3363        264          0 3  0x4184         sendmail  select
 4373            4372       4360          0 3  0x4084             bash    wait
 4372               1       4360          0 3    0x84          runospf nanosle
 4354             264        264          0 3    0x84             cron  piperd
 3391            3378       3378       1004 3  0x4084             mail  piperd
 3389            3378       3378       1004 3  0x4084            cvsup  select
 3378            3370       3378       1004 3  0x4084               sh    wait
 3370             264        264          0 3    0x84             cron  piperd
 3363             264        264          0 3    0x84             cron    wait
 270              187        187       1003 3   0x184            httpd   netio
 269                1        269          0 3  0x4086            getty   ttyin
 268                1        268          0 3  0x4086            getty   ttyin
 267                1        267          0 3  0x4086            getty   ttyin
 266                1        266          0 2  0x4086            getty
 264                1        264          0 3    0x84             cron nanosle
 259                1        259          0 3    0x84            inetd  select
 246                1        246          0 3    0x84            sshd2  select
 227                1        227          0 3    0x84             ntpd   pause
 215                1         10       1002 3    0x86    postNetServer  netcon
 214              199         10       1006 3    0x86         postgres   netio
 208              187        187       1003 3   0x184            httpd   lockf
 207              187        187       1003 3   0x184            httpd   lockf
 206              187        187       1003 3   0x184            httpd   lockf
 205              187        187       1003 3   0x184            httpd   lockf
 204              187        187       1003 3   0x184            httpd  select
 202              201         10       1006 3    0x86         postgres  select
 201              199         10       1006 3    0x86         postgres  select
 199                1         10       1006 3  0x4086         postgres  select
 187                1        187          0 3    0x84            httpd  select
 185              174        185      32767 3  0x4084          unlinkd  piperd
 184              174        184      32767 3  0x4084       squidGuard   netio
 183              174        183      32767 3  0x4084       squidGuard   netio
 182              174        182      32767 3  0x4084       squidGuard   netio
 181              174        181      32767 3  0x4084       squidGuard   netio
 180              174        180      32767 3  0x4084        dnsserver  select
 179              174        179      32767 3  0x4084        dnsserver  select
 178              174        178      32767 3  0x4084        dnsserver  select
 177              174        177      32767 3  0x4084        dnsserver  select
 176              174        176      32767 3  0x4084        dnsserver  select
 174              171        171          0 3  0x4184            squid  select
 171                1        171          0 3    0x84            squid    wait
 169                1        169          0 3    0x84            named  select
 167                1        167          0 3    0x84           racoon  select
 122                0          0          0 3 0x20204        acctwatch  actwat
 91                 1         91          0 3    0x84          syslogd  select
 9                  0          0          0 3 0x20204         aiodoned aiodone
 8                  0          0          0 3 0x20204          ioflush  syncer
 7                  0          0          0 3 0x20204           reaper  reaper
 6                  0          0          0 3 0x20204       pagedaemon pgdaemo
 5                  0          0          0 3 0x20204             pms0 pmsrese
 4                  0          0          0 3 0x20204        atapibus0  sccomp
 3                  0          0          0 3 0x20204         scsibus1  sccomp
 2                  0          0          0 3 0x20204         scsibus0  sccomp
 1                  0          1          0 3  0x4084             init    wait
 0                 -1          0          0 3 0x20204          swapper schedul
 4360            4354       4360          0 5  0x6000               sh
db> cont
Stopped at      cpu_Debugger+0x4:       leave
db> reboot
syncing disks... fatal page fault in supervisor mode
trap type 6 code 0 eip c01c496d cs 8 eflags 10202 cr2 fc cpl 0
panic: trap
Begin traceback...
trap() at trap+0x202
--- trap (number 6) ---
genfs_putpages(e4bd5440,c155e8c4,c01afc16,c155e8c4,e58f5c3c) at genfs_putpages+0x239
ffs_putpages(e4bd5440,c14e4500,c1309300,e4bd5450) at ffs_putpages+0x11d
ffs_full_fsync(e4bd5538,0,e4bd548c,c01c33f8,e58f5c3c) at ffs_full_fsync+0xc6
ffs_fsync(e4bd5538,10012,10,1) at ffs_fsync+0x3c
ffs_sync(c15b5e00,2,c1309f00,c031adc0) at ffs_sync+0x10a
sys_sync(c031adc0,0,0,c01bc160,0) at sys_sync+0x5a
vfs_shutdown(0,10,e4bd560c,c0181c49,74) at vfs_shutdown+0x6a
cpu_reboot(0,0,e4bd561c,c0180a75,c02a3d00) at cpu_reboot+0x3b
db_reboot_cmd(1,0,e4bd5670,e4bd5654,0) at db_reboot_cmd+0x51
db_command(c02e5a34,c02a3d00,e4bd571c,c0180369,c02a3f8b,e4bd5718,e4bd571c,c0180339) at db_command+0x214
db_command_loop(c022bc1c,e4bd5748,e4bd575c,c0235cb6) at db_command_loop+0x8b
db_trap(1,0,e4bd578c,c022bb46,1,0,c1b6a600,c1505098) at db_trap+0x11c
kdb_trap(1,0,e4bd57e4,c1b6a600) at kdb_trap+0x116
trap() at trap+0x177
--- trap (number 1) ---
cpu_Debugger(c14cdd40,4b0,c1652f00,c0258278,c14cdc80) at cpu_Debugger+0x4
comintr(c1304800) at comintr+0xf4
Xintr4() at Xintr4+0x7e
--- interrupt ---
ip_natout(c16bbb28,e4bd596c,e4bd596c,c16bbb00,c16bbb28) at ip_natout+0x562
fr_check(c16bbb28,14,c14d802c,1,e4bd5a38) at fr_check+0x5f7
gcc2_compiled.(0,e4bd5a38,c14d802c,2,3c) at gcc2_compiled.+0x72
pfil_run_hooks(c031f8a0,e4bd5abc,c14d802c,2,c155e948) at pfil_run_hooks+0x4c
ip_output(c16bbb00,0,c031f8c4,1,0,e4bd5c88,e4bd5cac,c01f0147,c16bbb00,0,3c,1) at ip_output+0x708
ip_forward(c16bbb00,0,33,1,c16bbb00) at ip_forward+0x200
ip_input(c16bbb00,c02636e5,c14cdf60,e4bbf56c) at ip_input+0x3d1
ipintr(10,c17b0010,c1530010,10,e4bbf56c) at ipintr+0x6b
Xsoftnet() at Xsoftnet+0x2c
--- interrupt ---
idle(e4bbf56c,3e9,c0197948,e4bbf56c) at idle+0x1b
bpendtsleep(c031d5ac,118,c02a6b80,3e9,0) at bpendtsleep
sys_poll(e4bbf56c,e4bd5f80,e4bd5f78,c0235283) at sys_poll+0x229
syscall_plain(1f,bfbf001f,4807001f,bfbf001f,804a808) at syscall_plain+0xa7
End traceback...

dumping to dev 4,1 offset 2100389
dump 1023 1022 1021 1020 1019 1018 1017 1016 1015 1014 1013 1012 1011 1010 1009
1008 1007 1006 1005 1004 1003 1002 1001 1000 999 998 997 996 995 994 993 992 991


	Here is dmesg:

NetBSD 1.6.2_RC1 (Kefren) #6: Sat Nov 22 15:24:03 EET 2003
    root@xxx.xxx.xxx:/usr/src/sys/arch/i386/compile/Kefren
cpu0: Intel Pentium 4 (686-class), 1993.81 MHz
cpu0: D-cache 8 KB 64b/line 4-way
cpu0: L2 cache 512 KB 64b/line 8-way
cpu0: features bfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu0: features bfebfbff<PGE,MCA,CMOV,FGPAT,PSE36,CFLUSH,DS,ACPI,MMX>
cpu0: features bfebfbff<FXSR,SSE,SSE2,SS,HTT,TM,B31>
total memory = 1023 MB
avail memory = 947 MB
using 6144 buffers containing 52508 KB of memory
BIOS32 rev. 0 found at 0xf0000
mainbus0 (root)
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: ServerWorks CMIC_LE Host (rev. 0x13)
pchb1 at pci0 dev 0 function 1
pchb1: ServerWorks CMIC_LE Host (rev. 0x00)
pci1 at pchb1 bus 128
pci1: no spaces enabled!
pchb2 at pci0 dev 0 function 2
pchb2: ServerWorks product 0x0000 (rev. 0x00)
pci2 at pchb2 bus 2
pci2: no spaces enabled!
ahc0 at pci0 dev 2 function 0
ahc0: interrupting at irq 3
ahc0: aic7899 Wide Channel A, SCSI Id=7, 16/255 SCBs
scsibus0 at ahc0: 16 targets, 8 luns per target
ahc1 at pci0 dev 2 function 1
ahc1: interrupting at irq 3
ahc1: aic7899 Wide Channel B, SCSI Id=7, 16/255 SCBs
scsibus1 at ahc1: 16 targets, 8 luns per target
vga1 at pci0 dev 3 function 0: ATI Technologies Rage XL (rev. 0x27)
pci_mem_find: void region
pci_mem_find: void region
pci_mem_find: void region
wsdisplay0 at vga1 kbdmux 1
wsmux1: connecting to wsdisplay0
bge0 at pci0 dev 4 function 0: Broadcom BCM5702X Gigabit Ethernet
bge0: interrupting at irq 5
bge0: ASIC BCM5703 A2, Ethernet address 00:0b:cd:1b:8a:3f
brgphy0 at bge0 phy 1: BCM5703 1000BASE-T media interface, rev. 2
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
Compaq product 0xa0f0 (miscellaneous system) at pci0 dev 5 function 0 not configured
pcib0 at pci0 dev 15 function 0
pcib0: ServerWorks CSB5 SouthBridge (rev. 0x93)
pciide0 at pci0 dev 15 function 1: ServerWorks CSB5 IDE Controller (rev. 0x93)
pciide0: bus-master DMA support present
pciide0: primary channel configured to compatibility mode
atapibus0 at pciide0 channel 0: 2 targets
cd0 at atapibus0 drive 0: <LTN486S, , YQSK> type 5 cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4
pciide0: primary channel interrupting at irq 14
cd0(pciide0:0:0): using PIO mode 4
pciide0: secondary channel wired to compatibility mode
pciide0: secondary channel interrupting at irq 15
ServerWorks OSB4/CSB5 USB (USB serial bus, interface 0x10, revision 0x05) at pci0 dev 15 function 2 not configured
pchb3 at pci0 dev 15 function 3
pchb3: ServerWorks product 0x0225 (rev. 0x00)
pchb4 at pci0 dev 17 function 0
pchb4: ServerWorks product 0x0101 (rev. 0x03)
pci3 at pchb4 bus 2
pci3: memory space enabled
pchb5 at pci0 dev 17 function 2
pchb5: ServerWorks product 0x0101 (rev. 0x03)
pci4 at pchb5 bus 5
pci4: memory space enabled
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0 mux 1
wskbd0: connecting to wsdisplay0
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
isapnp0: no ISA Plug 'n Play devices found
biomask efcd netmask efed ttymask ffef
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 0 lun 0: <COMPAQ, BD03685A24, HPB3> SCSI3 0/direct fixed
sd0: 34732 MB, 49855 cyl, 2 head, 713 sec, 512 bytes/sect x 71132000 sectors
sd0: sync (25.0ns offset 63), 16-bit (80.000MB/s) transfers, tagged queueing
sd1 at scsibus0 target 1 lun 0: <COMPAQ, BD03685A24, HPB3> SCSI3 0/direct fixed
sd1: 34732 MB, 49855 cyl, 2 head, 713 sec, 512 bytes/sect x 71132000 sectors
sd1: sync (25.0ns offset 63), 16-bit (80.000MB/s) transfers, tagged queueing
uk0 at scsibus0 target 15 lun 0: <COMPAQ, PROLIANT 4L6I, 1.78> SCSI2 3/processor fixed
uk0: async, 8-bit transfers
scsibus1: waiting 2 seconds for devices to settle...
IPsec: Initialized Security Association Processing.
boot device: sd0
root on sd0a dumps on sd0b
root file system type: ffs
stray interrupt 7
stray interrupt 7
stray interrupt 7
stray interrupt 7
stray interrupt 7; stopped logging
IP Filter: v3.4.29 initialized.  Default = pass all, Logging = enabled
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)
Accounting started

	The kernel config or any other information can be provided on request.

>How-To-Repeat:
	
>Fix:
	

>Release-Note:
>Audit-Trail:
>Unformatted: