Subject: kern/23953: New buffer memory management crashes alpha fileserver
To: None <gnats-bugs@gnats.netbsd.org>
From: None <nathanw@mit.edu>
List: netbsd-bugs
Date: 01/02/2004 13:35:04
>Number:         23953
>Category:       kern
>Synopsis:       A kernel from after 2003-12-29 crashes under disk access
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jan 02 18:36:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator:     Nathan J. Williams
>Release:        NetBSD 1.6ZG, 2004-1-1
>Organization:
	Massachvsetts Institvte of Technology
>Environment:
	
	
System: NetBSD daffy-duck.hmsputnam.org 1.6ZG NetBSD 1.6ZG (DAFFY-DUCK) #0: Thu Jan  1 19:53:36 EST 2004  nathanw@marvin-the-martian.stuartst.com:/home/nathanw/work/nbsd/sys/arch/alpha/compile/DAFFY-DUCK alpha
Architecture: alpha
Machine: alpha
>Description:
	A kernel compiled from 2004-1-1 sources, after the buffer
management changes, crashes frequently under disk access, as seen here:

panic: kernel diagnostic assertion "!UVM_ET_ISSUBMAP(first_entry)" failed: file "../../../../uvm/uvm_map.c", line 1872
Stopped in pid 8.1 (pagedaemon) at      netbsd:cpu_Debugger+0x4:        ret     zero,(ra)
db> t  
cpu_Debugger() at netbsd:cpu_Debugger+0x4
panic() at netbsd:panic+0x208
__assert() at netbsd:__assert+0x3c
uvm_unmap() at netbsd:uvm_unmap+0xf4
uvm_km_free() at netbsd:uvm_km_free+0x30
bufpool_page_free() at netbsd:bufpool_page_free+0x24
pool_allocator_free() at netbsd:pool_allocator_free+0x28
pool_reclaim() at netbsd:pool_reclaim+0x188
pool_drain() at netbsd:pool_drain+0x74
uvm_pageout() at netbsd:uvm_pageout+0x354
exception_return() at netbsd:exception_return
--- root of call graph ---

The crash usually happened under moderate disk activity, such as
copying a set of files via Samba, or scanning a directory of MP3s with
music server software, or running "make depend" in a kernel compile
directory.

Backing out to 2003-12-29 kernel sources avoided the problem. It's
possible that the problem is Alpha-specific, but I don't have any
particular evidence of that as yet.

The crash usually happened within a couple of minutes of booting. The
system is:

Digital AlphaPC 164 500 MHz, s/n
8192 byte page size, 1 processor.
total memory = 256 MB
(2472 KB reserved for PROM, 253 MB used by NetBSD)
avail memory = 244 MB
mainbus0 (root)
cpu0 at mainbus0: ID 0 (primary), 21164A-2
cpu0: VAX FP support, IEEE FP support, Primary Eligible
cpu0: Architecture extensions: 1<BWX>

with the following disk systems:

cmdide0 at pci0 dev 11 function 0
cmdide0: CMD Technology PCI0646 (rev. 0x01)
cmdide0: bus-master DMA support present
cmdide0: primary channel wired to compatibility mode
cmdide0: primary channel interrupting at isa irq 14
atabus0 at cmdide0 channel 0
cmdide0: secondary channel wired to compatibility mode
cmdide0: secondary channel interrupting at isa irq 15
atabus1 at cmdide0 channel 1

wd0 at atabus0 drive 0: <IBM-DJNA-372200>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 21557 MB, 43800 cyl, 16 head, 63 sec, 512 bytes/sect x 44150400 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
wd0(cmdide0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
atapibus0 at atabus1: 2 targets
cd0 at atapibus0 drive 1: <HL-DT-ST GCE-8160B, , 1.02> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2
wd1 at atabus1 drive 0: <Maxtor 94098H6>
wd1: drive supports 16-sector PIO transfers, LBA addressing
wd1: 39083 MB, 79408 cyl, 16 head, 63 sec, 512 bytes/sect x 80043264 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd1(cmdide0:1:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
cd0(cmdide0:1:1): using PIO mode 4, DMA mode 2 (using DMA data transfers)
root on wd0a dumps on wd0b


Crash dumps are available; however, the new GDB doesn't know how to
read an Alpha crash dump. I expect to fix that soon and will append
GDB-generated stack traces to this PR once that is done.


>How-To-Repeat:
	As described above.
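
	The workloads described above all amount to sustained file
reads and writes. A minimal sketch of a comparable load generator in
POSIX sh follows. It has not been confirmed to trigger the panic; the
function name "churn", the scratch directory, and the file counts and
sizes are all hypothetical choices made for illustration.

```shell
#!/bin/sh
# Hypothetical load generator; NOT confirmed to reproduce kern/23953.
# churn DIR ROUNDS: write a set of files under DIR, then copy and
# re-read them ROUNDS times to keep the buffer cache busy.
churn() {
	dir=$1
	rounds=$2
	mkdir -p "$dir/src" || return 1

	# Create 16 files of 64 KB each; the sizes are arbitrary.
	i=0
	while [ "$i" -lt 16 ]; do
		dd if=/dev/zero of="$dir/src/f$i" bs=1024 count=64 \
		    2>/dev/null || return 1
		i=$((i + 1))
	done

	# Repeatedly copy the set and read the copy back.
	r=0
	while [ "$r" -lt "$rounds" ]; do
		rm -rf "$dir/copy"
		cp -R "$dir/src" "$dir/copy" || return 1
		cat "$dir/copy"/* >/dev/null || return 1
		r=$((r + 1))
	done
}
```

	For example, "churn /var/tmp/churn 100" would run 100 copy and
read passes in a scratch directory. Larger files or more rounds may be
needed to approach the activity level of the workloads listed above.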

>Fix:
	Unknown.
>Release-Note:
>Audit-Trail:
>Unformatted: