Subject: port-i386/502: hard hang overnight while recompiling world.
To: None <gnats-admin@sun-lamp.cs.berkeley.edu>
From: None <dan@anarres.mame.mu.oz.au>
List: netbsd-bugs
Date: 09/28/1994 16:50:05
>Number:         502
>Category:       port-i386
>Synopsis:       hard hang overnight while building world
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    gnats-admin (GNATS administrator)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Sep 28 16:50:05 1994
>Originator:     Daniel Carosone
>Organization:
"	"
>Release:        
>Environment:
	
System: NetBSD blah 1.0_BETA NetBSD 1.0_BETA (_blah_) #1: Wed Sep 28 07:16:10 PDT 1994 dan@blah:/home/c/l/NetBSD/src/sys/arch/i386/compile/_blah_ i386

Hardware: i486DX-33, ISA, 4MB RAM, generic I/O controller, 2x Quantum
80MB IDE disks, SMC Elite-16 Ethernet, Trident SVGA.

This is a fresh installation, done yesterday from the disk sets made
available by Brian Moore <ziff@eecs.umich.edu>, with a kernel I
recompiled myself.

Filesystem      1K-blocks     Used    Avail Capacity  Mounted on
/dev/wd0a           11549     8322     2649    76%    /
/dev/wd0e           50754    38923     9293    81%    /usr
mfs:16              11719        1    11132     0%    /tmp
fdesc                   1        1        0   100%    /dev
procfs                  4        4        0   100%    /proc
kernfs                  1        1        0   100%    /kern
/dev/wd1c           79568        1    75588     0%    /home/e
anarres:/home/a    214137   184899    18179    91%    /home/a
oink:/home/c       196119    36845   139662    21%    /home/c
oink:/home/f       406854   236552   149959    61%    /home/f

/usr/src is a symlink into /home/f. oink is a SPARC 1+ running
NetBSD; anarres runs Linux. The sources are from about a week ago, and
oink is built from those. Swap space on /dev/wd0b is ~17MB.

Kernel config:

# architecture type and name of kernel; REQUIRED
machine		"i386"
ident		BLAH

# different CPU types; you must have at least the correct one; REQUIRED
cpu		"I486_CPU"

# floating point emulation
# options		MATH_EMULATE

# make the kernel a little faster; will break on some machines
#options	DUMMY_NOPS

# temporary kluge while adding support for non-contiguous physical memory
options		MACHINE_NONCONTIG

# time zone RTC is expected to be set in; REQUIRED
timezone	0 dst

# estimated number of users
maxusers	6

# paging of processes, and caching vnodes and devices; REQUIRED
options		SWAPPAGER
options		VNODEPAGER,DEVPAGER

# system call tracing, a la ktrace(1)
options		KTRACE

# FIFOs; RECOMMENDED
options		FIFO

# System V-like message queues
#options	SYSVMSG

# System V-like semaphores
#options	SYSVSEM

# System V-like memory sharing
#options	SYSVSHM
#options	SHMMAXPGS=1024		# 1024 pages is the default

# generic SCSI system
#options		SCSI

# UFS
options		FFS

# quotas in UFS
#options		QUOTA

# memory file system (shares memory and swap space)
options		MFS

# Sun's Network File System
options		NFSSERVER
options		NFSCLIENT

# ISO 9660 file system, with Rock Ridge
#options		"CD9660"

# MS-DOS file system
#options		MSDOSFS

# /dev/fd
options		FDESC

# kernel file system
options		KERNFS

# process file system
options		PROCFS

# various types of networks and protocols
#options	IMP	 
options		INET
#options		NS
#options		ISO,TPIP,EON
#options		CCITT,LLC,HDLC

# packet forwarding
#options		GATEWAY

# kernel debugger
#options		DDB

# Allows user to create an i386 LDT (Used by Wine to run Windows programs)
#options		"USER_LDT"

# NetBSD 0.8 and 0.9 compatibility
options		"COMPAT_NOMID"
options		"COMPAT_09"

options		"COMPAT_43"
options		"TCP_COMPAT_42"

config		netbsd	root on wd0 swap on wd0 and sd0

#buses
controller	isa0

#console
device		pc0	at isa? port "IO_KBD" irq 1

#serial ports
device		com0	at isa? port "IO_COM1" irq 4
device		com1	at isa? port "IO_COM2" irq 3
#device		com2	at isa? port "IO_COM3" irq 5
#device		com3	at isa? port "IO_COM4" irq 9

#parallel ports
device		lpt0	at isa? port "IO_LPT1" irq 7
device		lpt1	at isa? port "IO_LPT2"
device		lpt2	at isa? port "IO_LPT3"

#non-scsi disk controllers
controller	wdc0	at isa? port "IO_WD1" irq 14
disk		wd0	at wdc0 drive ?
disk		wd1	at wdc0 drive ?

#non-scsi floppy controllers
controller	fdc0	at isa? port "IO_FD1" irq 6 drq 2
disk		fd0	at fdc0 drive ?
disk		fd1	at fdc0 drive ?

#ethernet
device ed0 at isa? port 0x300 irq 10 iomem 0xcc000

#math co-processor
device		npx0	at isa? port "IO_NPX" irq 13

# pseudo-terminals; REQUIRED for remote logins and many other things
pseudo-device pty	16

# loopback; RECOMMENDED
pseudo-device loop

# ethernet; REQUIRED if using any ethernet device
pseudo-device ether #XXX

# used by kernel for logging messages; gateway to syslogd
pseudo-device log

# packet filter
pseudo-device bpfilter	2

# compressed SLIP
#pseudo-device sl

# point-to-point protocol
#pseudo-device ppp

# vn virtual filesystem device
pseudo-device vn 2

# speaker queue
pseudo-device speaker

# tablet line discipline
#pseudo-device tb

#pseudo-device tun	missing header files

# /dev/audio
#pseudo-device audio


>Description:

After successfully rebuilding the kernel, I decided to leave the
machine compiling the userland overnight. This morning, the build
window was stuck building getpwent.so (i.e., not very far into the
build process). The build was being run over rsh from anarres,
as I have only one monitor for the two machines. I swapped cables, and
there was nothing unusual on the console, but the machine was hung
hard. After a reboot, fsck didn't complain about much: an unreferenced
file or two and the free block count.

The SPARC was also doing a rebuild at the same time; it's still
going happily.

I doubt it's a hardware problem: prior to yesterday the machine
had been running Linux flawlessly.

The only other thing that looks unusual is that during boot I get a
couple of messages about pc0 errors. The first comes right after
the `C' in the Copyright message and says something about a timeout
setting the LEDs; the second is captured below. (Oh, and the timezone
is still wrong :-)

Sep 28 14:44:07 blah /netbsd: NetBSD 1.0_BETA (_blah_) #1: Wed Sep 28 07:16:10 PDT 1994
Sep 28 14:44:08 blah /netbsd:     dan@blah:/home/c/l/NetBSD/src/sys/arch/i386/compile/_blah_
Sep 28 14:44:08 blah /netbsd: CPU: i486DX (486-class CPU)
Sep 28 14:44:08 blah /netbsd: real mem  = 3801088
Sep 28 14:44:08 blah /netbsd: avail mem = 2699264
Sep 28 14:44:08 blah /netbsd: using 72 buffers containing 294912 bytes of memory
Sep 28 14:44:09 blah /netbsd: pcprobe: reset error 3
Sep 28 14:44:10 blah /netbsd: pc0 at isa0 port 0x60-0x6f irq 1: color
Sep 28 14:44:10 blah /netbsd: com0 at isa0 port 0x3f8-0x3ff irq 4: ns82450 or ns16450, no fifo
Sep 28 14:44:10 blah /netbsd: com1 at isa0 port 0x2f8-0x2ff irq 3: ns82450 or ns16450, no fifo
Sep 28 14:44:11 blah /netbsd: lpt0 at isa0 port 0x378-0x37f irq 7
Sep 28 14:44:11 blah /netbsd: wdc0 at isa0 port 0x1f0-0x1f7 irq 14
Sep 28 14:44:11 blah /netbsd: wd0 at wdc0 drive 0: 80MB 965 cyl, 10 head, 17 sec <QUANTUM P80A 980-80-94xx>
Sep 28 14:44:11 blah /netbsd: wd1 at wdc0 drive 1: (unknown size) <Unknown Type>
                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(I don't know why it says this; the disks are identical. Linux gave the same result.)

Sep 28 14:44:12 blah /netbsd: fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
Sep 28 14:44:12 blah /netbsd: fd0 at fdc0 drive 0: 1.2MB 80 cyl, 2 head, 15 sec
Sep 28 14:44:12 blah /netbsd: ed0 at isa0 port 0x300-0x31f iomem 0xcc000-0xcffff irq 10: address 00:00:c0:0a:85:5d, type WD8013EPC (16-bit) bnc
Sep 28 14:44:12 blah /netbsd: npx0 at isa0 port 0xf0-0xff: using exception 16
Sep 28 14:44:13 blah /netbsd: biomask 4040 netmask 400 ttymask 1a
Sep 28 14:44:07 blah savecore: no core dump
Sep 28 14:44:19 blah lpd[97]: restarted
Sep 28 14:44:25 blah init: kernel security level changed from 0 to 1

>How-To-Repeat:

I'm going to restart the build and see if it happens again. Is there
anything useful I can *do* with the machine if/when it hangs to
get more information?
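
One thing I could try, as a sketch only: the config above has the kernel
debugger commented out, so a rebuilt kernel with it enabled might let me
break into the debugger from the console when the hang occurs. Whether the
console still responds in this particular hang is an assumption I haven't
tested. Only the DDB line changes from the config above:

```
# kernel debugger
options		DDB
```

If the debugger can be entered, a stack trace from there would at least
show where the kernel is stuck.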

>Fix:
>Audit-Trail:
>Unformatted: