Subject: port-sun3/4720: sun3 diskless still has "X binaries dump core" when trying to run X
To: None <gnats-bugs@gnats.netbsd.org>
From: None <woods@sometimes.weird.com>
List: netbsd-bugs
Date: 12/18/1997 19:54:01
>Number:         4720
>Category:       port-sun3
>Synopsis:       sun3 diskless still has "X binaries dump core" when trying to run X
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    gnats-admin (GNATS administrator)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Dec 18 17:05:00 1997
>Last-Modified:
>Originator:     Greg A. Woods
>Organization:
Planix, Inc; Toronto, Ontario; Canada
>Release:        NetBSD-current 1997/12/15
>Environment:

System: NetBSD sometimes 1.3_BETA NetBSD 1.3_BETA (MOUSETRAP) #5: Thu Dec 18 18:40:12 EST 1997 woods@sometimes:/var/usr.src/sys/arch/sun3/compile/MOUSETRAP sun3
Userland: NetBSD 1.3_ALPHA sometime around the end of November

>Description:

Diskless sun3s that swap to a file over NFS still cannot run X11 for
very long -- usually not even long enough to get an xterm open.

Programs start dumping core (normally only X binaries), and the kernel
prints the following message:

    /netbsd: vm_pager_unmap_pages: 0xe157b80(e168000/a96000) not owned

The numbers change, but the error is repeatable and has occurred with
every kernel since the most recent vm_swap and pmap fixes (without
which the diskless system is effectively useless).

Normally nothing seems to dump core once X has shut down.  After such an
error the system has run for several days, including the nightly cron
jobs, with no apparent difficulty.

The X11 binaries I have are the most recent from ftp.netbsd.org (X11R6.3).

X11 runs OK on the diskful machine now, though under heavy load the
system eventually gets itself tied in a knot, starts doing nasty things
(including corrupting filesystems), and finally crashes.  This seems to
happen regardless of whether X11 is running.

Occasionally X11 does start successfully, bringing up the default three
xterms and an xclock with twm as the window manager.  This happened most
recently just now (when I was hoping to get a core backtrace), and the
kernel printed the following messages:

Dec 18 19:28:46 very.weird.com /netbsd: swpg_alloc: swpager malloc failed
Dec 18 19:28:48 very.weird.com last message repeated 2 times
Dec 18 19:28:48 very.weird.com /netbsd: nfs send error 55 for server sometimes:/export/swap/very

So, no core backtrace this time, and I don't have time to try
reproducing the error just now.
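
For reference when reading the last log line above: in the BSD errno
numbering, error 55 is ENOBUFS ("No buffer space available").  A minimal
sketch for pulling such errors out of a saved log follows.  The function
names are hypothetical, the table lists only a handful of BSD errno
values, and other systems number these errors differently.

```shell
# Map a few BSD errno numbers to their symbolic names.
# Values follow the BSD <sys/errno.h> numbering only.
bsd_errno_name() {
    case "$1" in
        12) echo ENOMEM ;;
        32) echo EPIPE ;;
        55) echo ENOBUFS ;;
        60) echo ETIMEDOUT ;;
        65) echo EHOSTUNREACH ;;
        *)  echo "unknown($1)" ;;
    esac
}

# Print "<errno> <name>" for each "nfs send error N" line in a log file.
# $1 = path to the log file (hypothetical; use wherever syslog writes).
nfs_send_errors() {
    sed -n 's/.*nfs send error \([0-9][0-9]*\) for server.*/\1/p' "$1" |
    while read -r n; do
        echo "$n $(bsd_errno_name "$n")"
    done
}
```

Running nfs_send_errors against a file holding the messages quoted above
would report the error as ENOBUFS.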

>How-To-Repeat:

Boot a fresh system.
Log in as root.
Run startx.
Observe X11 shut down after the vm_pager_unmap_pages message.
Note xterm.core (and sometimes more *.core files) in /root.
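
The last two steps can be checked from a shell after X has exited.  This
is only a sketch: the function names are hypothetical, and it assumes
the kernel messages are logged to a file such as /var/log/messages.

```shell
# Count kernel "not owned" complaints from vm_pager_unmap_pages in a log.
# $1 = path to the log file (assumed, e.g. /var/log/messages).
count_unmap_errors() {
    grep -c 'vm_pager_unmap_pages: .* not owned' "$1"
}

# List core files left in a directory, one per line (nothing if none).
# $1 = directory to inspect, e.g. /root.
list_cores() {
    for f in "$1"/*.core; do
        [ -e "$f" ] && echo "$f"
    done
    return 0
}
```

For example, "count_unmap_errors /var/log/messages" followed by
"list_cores /root" shows whether the failure occurred and which
programs dumped core.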

Here's the kernel config, just in case:

# $NetBSD: GENERIC,v 1.42 1997/10/17 03:17:01 gwr Exp $

# GENERIC Sun3 (3/50, 3/60, 3/110, 3/160, 3/260)
# Supports root on: ie0, le0, sd*, ...

include "arch/sun3/conf/std.sun3"

# Machines to be supported by this kernel
options 	FPU_EMULATE
options 	HAVECACHE		# Sun3/260 VAC

# Needs to be set per system, i.e. change these as you see fit
maxusers	64

# Standard system options
options 	KTRACE		# system call tracing
options 	SYSVMSG		# System V message queues
options 	SYSVSEM		# System V semaphores
options 	SYSVSHM		# System V shared memory
#options 	SHMMAXPGS=1024	# 1024 pages is the default
#options 	LKM		# loadable kernel modules
#options 	INSECURE	# disable kernel security level
#options 	UCONSOLE	# Allow non-root TIOCCONS
options 	MAXUPRC=120	# defaults to CHILD_MAX (80)

# Which kernel debugger?  Uncomment either this:
options 	DDB
# ... or these for KGDB (gdb remote target)
#makeoptions DEBUG="-g"		# debugging symbols for gdb
#options 	KGDB
#options 	KGDBDEV=0x0C01	# ttya=0C00 ttyb=0C01

# Other debugging options
options 	DEBUG		# kernel debugging code
options 	DIAGNOSTIC	# extra kernel sanity checking
options 	KMEMSTATS	# kernel memory statistics (vmstat -m)
options 	PMAP_DEBUG
options 	SCSIDEBUG
options 	SCSIVERBOSE	# Verbose SCSI errors

# kernel printf message buffer size...
#options 	MSGBUFSIZE=integer

# Compatibility options
options 	COMPAT_SUNOS	# can run SunOS 4.1.1 executables
options 	COMPAT_43	# and 4.3BSD and ...
options 	COMPAT_10	# NetBSD 1.0
options 	COMPAT_11	# NetBSD 1.1
options 	COMPAT_12	# NetBSD 1.2

# Filesystem options
file-system	FFS		# Berkeley Fast Filesystem
file-system	NFS		# Sun NFS client support
file-system	CD9660		# ISO 9660 + Rock Ridge file system
file-system	FDESC		# /dev/fd/*
file-system	KERNFS		# /kern
file-system	NULLFS		# loopback file system
file-system	PROCFS		# /proc
file-system	UNION		# union file system
#file-system	MFS		# memory-based filesystem

options 	FIFO		# FIFOs; RECOMMENDED
options 	NFSSERVER	# nfs server support
options 	QUOTA		# FFS quotas

# Networking options
options 	INET		# IP protocol stack support
#options 	TCP_COMPAT_42	# compatibility with 4.2BSD TCP/IP
options 	GATEWAY		# IP packet forwarding
#options 	IPFORWSRCRT=0	# IP forwarding of source routed packets
#options  	MROUTING	# multicast routing support (req INET)
#options 	ISO,TPIP	# OSI networking
#options 	EON		# OSI tunneling over IP
#options 	CCITT,LLC,HDLC	# X.25
#options 	NETATALKA	# Appletalk stack
options 	PFIL_HOOKS	# pfil(9) packet filter hooks.
#options 	PPP_FILTER	# pcap(3) PPP filter hooks.

# options for pseudo-device ipfilter
options 	IPFILTER_LOG	# ipfilter logging
#options 	IPFILTER_DEFAULT_BLOCK	# ipfilter starts blocked

# Work-around for root on slow servers (insurance...)
options 	NFS_BOOT_RWSIZE=1024

config		netbsd root on ? type ?

#
# Serial ports
#
zstty0	at zsc1 channel 0	# ttya
zstty1	at zsc1 channel 1	# ttyb

kbd0	at zsc0 channel 0	# keyboard
ms0	at zsc0 channel 1	# mouse

#
# Network devices
#

# Intel Ethernet (onboard, or VME)
ie0 at obio0 addr   0x0C0000 level 3
ie1 at vmes0 addr 0xffe88000 level 3 vect 0x75

# Lance Ethernet (only onboard)
le0 at obio0 addr   0x120000 level 3

#
# Disk and tape devices
#

# Sun3 "si" SCSI controller (NCR 5380)
# This driver has several flags which may be enabled using
# the "flags" directive.  Valid flags are:
#
# 0x000ff	Set (1<<target) to disable disconnect/reselect
# 0x0ff00	Set (1<<(target+8)) to disable parity checking
# 0x10000	Set this bit to disable DMA interrupts (poll)
# 0x20000	Set this bit to disable DMA entirely (use PIO)
#
# For example: "flags 0x1000f" would disable DMA interrupts,
# and disable disconnect/reselect for targets 0-3
# XXX HACK: use 0x80 to disable all flags as the default when
# XXX set to zero will result in 0xf being used instead!!!!
si0 at obio0 addr   0x140000 level 2 flags 0x80
si0 at vmes0 addr 0xff200000 level 2 vect 0x40 flags 0x80
si1 at vmes0 addr 0xff204000 level 2 vect 0x41 flags 0x80

# Xylogics 450/451 controllers
#xyc0 at vmes0 addr 0xffffee40 level 2 vect 0x48
#xyc1 at vmes0 addr 0xffffee48 level 2 vect 0x49
#xy* at xyc? drive ?

# Xylogics 7053 controllers
#xdc0 at vmel0 addr 0xffffee80 level 2 vect 0x44
#xdc1 at vmel0 addr 0xffffee90 level 2 vect 0x45
#xd* at xdc? drive ?

# Xylogics 472 tape controllers?

#
# Frame buffer devices
#

# The default cgfour address depends on the machine:
# 3/60: obmem 0xFF200000 .. 0xFF9fffff
# 3/110: different? (not tested)
cgfour0 at obmem0 addr ?

# 3/60 P4 accelerated 8-bit color frame buffer
# XXX NOTICE: not yet implemented
#cgsix0 at obmem0 addr ?

# 3/60 P4 24-bit color frame buffer
# cgeight0 at obmem0 addr ?

# The default bwtwo address depends on the machine:
# 3/50: obmem   0x100000
# else: obmem 0xff000000
bwtwo0 at obmem0 addr ?
# 3/60 P4 color frame buffer overlay plane, or P4 monochrome frame buffer
bwtwo1 at obmem0 addr 0xff300000
# 3/60 plug-in color frame buffer overlay plane
bwtwo1 at obmem0 addr 0xff400000

# Sun-3 color board, or CG5 8-bit VME frame buffer.
cgtwo0 at vmes0 addr 0xff400000 level 4 vect 0xA8

# Support for the CG9 24-bit VME frame buffer.
# cgnine0 at vmel0 addr 0x08000000

#
# Sun3/E stuff
#
#sebuf0 at vmes0 addr 0xff300000 level 2 vect 0x74
#sebuf1 at vmes0 addr 0xff340000 level 2 vect 0x76
#si* at sebuf?
#ie* at sebuf?

#
# SCSI infrastructure
#
scsibus* at scsi?

sd* at scsibus? target ? lun ?		# SCSI disks
st* at scsibus? target ? lun ?		# SCSI tapes
cd* at scsibus? target ? lun ?		# SCSI CD-ROMs
ch* at scsibus? target ? lun ?		# SCSI changer devices
ss* at scsibus? target ? lun ?		# SCSI scanners
uk* at scsibus? target ? lun ?		# unknown SCSI devices

# Memory-disk drivers
pseudo-device	md		2

# Misc.
pseudo-device	loop		1	# network loopback
pseudo-device	bpfilter	8	# packet filter
pseudo-device	sl		2	# CSLIP
pseudo-device	ppp		2	# PPP
pseudo-device	tun		2	# network tunneling over tty
pseudo-device	ipfilter		# ip filter

pseudo-device	pty		128	# pseudo-terminals
pseudo-device	vnd		4	# paging to files
#pseudo-device	ccd		4	# concatenated disks

----------------
and here are the startup messages logged:

Dec 18 19:03:38 very.weird.com /netbsd: Kernel rebooting...
Dec 18 19:03:39 very.weird.com /netbsd: NetBSD 1.3_BETA (MOUSETRAP) #5: Thu Dec 18 18:40:12 EST 1997
Dec 18 19:03:39 very.weird.com /netbsd:     woods@sometimes:/var/usr.src/sys/arch/sun3/compile/MOUSETRAP
Dec 18 19:03:39 very.weird.com /netbsd: Model: Sun 3/60 (hostid 1700d16f)
Dec 18 19:03:39 very.weird.com /netbsd: fpu: mc68881
Dec 18 19:03:39 very.weird.com /netbsd: real  mem = 12288K (0xc00000)
Dec 18 19:03:39 very.weird.com /netbsd: avail mem = 10152K (0x9ea000)
Dec 18 19:03:45 very.weird.com /netbsd: using 89 buffers containing 729088 bytes of memory
Dec 18 19:03:45 very.weird.com /netbsd: mainbus0 (root)
Dec 18 19:03:45 very.weird.com /netbsd: obio0 at mainbus0
Dec 18 19:03:45 very.weird.com /netbsd: zsc0 at obio0 addr 0x0 level 6: (softpri 3)
Dec 18 19:03:45 very.weird.com /netbsd: kbd0 at zsc0 channel 0 (console)
Dec 18 19:03:45 very.weird.com /netbsd: ms0 at zsc0 channel 1
Dec 18 19:03:45 very.weird.com /netbsd: zsc1 at obio0 addr 0x20000 level 6: (softpri 3)
Dec 18 19:03:45 very.weird.com /netbsd: zstty0 at zsc1 channel 0
Dec 18 19:03:45 very.weird.com /netbsd: zstty1 at zsc1 channel 1
Dec 18 19:03:46 very.weird.com /netbsd: zsc1: enabling zs interrupts
Dec 18 19:03:46 very.weird.com /netbsd: eeprom0 at obio0 addr 0x40000
Dec 18 19:03:46 very.weird.com /netbsd: clock0 at obio0 addr 0x60000 level 5
Dec 18 19:03:46 very.weird.com /netbsd: memerr0 at obio0 addr 0x80000 level 7: (Parity memory)
Dec 18 19:03:46 very.weird.com /netbsd: intreg0 at obio0 addr 0xa0000
Dec 18 19:03:46 very.weird.com /netbsd: le0 at obio0 addr 0x120000 level 3: address 08:00:20:06:7d:89
Dec 18 19:03:46 very.weird.com /netbsd: le0: 8 receive buffers, 2 transmit buffers
Dec 18 19:03:47 very.weird.com /netbsd: si0 at obio0 addr 0x140000 level 2: options=0x80
Dec 18 19:03:47 very.weird.com /netbsd: scsibus0 at si0: 8 targets
Dec 18 19:03:47 very.weird.com /netbsd: scsipi_inqmatch: 26/5/1 <SONY    , CD-ROM CDU-8012 , >
Dec 18 19:03:47 very.weird.com /netbsd: scsipi_inqmatch: 2/5/1 <, , >
Dec 18 19:03:47 very.weird.com /netbsd: cd0 at scsibus0 targ 6 lun 0: <SONY, CD-ROM CDU-8012, 3.1a> SCSI2 5/cdrom removable
Dec 18 19:03:47 very.weird.com /netbsd: obmem0 at mainbus0
Dec 18 19:03:47 very.weird.com /netbsd: bwtwo0 at obmem0 addr 0xff000000 (1600x1280)
Dec 18 19:03:47 very.weird.com /netbsd: enabling interrupts
Dec 18 19:03:48 very.weird.com /netbsd: boot device: le0
Dec 18 19:03:48 very.weird.com /netbsd: mountroot: trying ffs...
Dec 18 19:03:48 very.weird.com /netbsd: mountroot: trying nfs...
Dec 18 19:03:48 very.weird.com /netbsd: nfs_boot: trying RARP (and RPC/bootparam)
Dec 18 19:03:48 very.weird.com /netbsd: nfs_boot: client_addr=0xcc5cfe03
Dec 18 19:03:48 very.weird.com /netbsd: nfs_boot: server_addr=0xcc5cfe06
Dec 18 19:03:48 very.weird.com /netbsd: nfs_boot: hostname=very.weird.com
Dec 18 19:03:48 very.weird.com /netbsd: root on sometimes:/export/root/very
Dec 18 19:03:49 very.weird.com /netbsd: root time: 0x347e06e8
Dec 18 19:03:49 very.weird.com /netbsd: WARNING: clock gained 21 days -- CHECK AND RESET THE DATE!
Dec 18 19:03:49 very.weird.com /netbsd: root file system type: nfs
Dec 18 19:03:49 very.weird.com /netbsd: init: copying out path `/sbin/init' 11

Swap is a fully allocated 32MB file exported over NFS and enabled with
the following entry from /etc/fstab:

	sometimes:/export/swap/very none swap sw,nfsmntpt=/swap
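
"Fully allocated" here means every block of the file has actually been
written, as opposed to a sparse file.  A file like that can be created
on the server along the following lines.  This is a sketch only: the
helper name is hypothetical, and the exact command originally used is
not recorded in this report.

```shell
# Create a fully allocated (non-sparse) file by writing real zeros.
# $1 = path of the swap file, $2 = size in megabytes.
make_swap_file() {
    dd if=/dev/zero of="$1" bs=1048576 count="$2" 2>/dev/null &&
    chmod 600 "$1"
}
```

On the server this would be invoked as, for example,
"make_swap_file /export/swap/very 32".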

>Fix:

unknown
>Audit-Trail:
>Unformatted: