Subject: Re: mapped device panic (fix)
To: None <port-sun3@NetBSD.ORG>
From: Greg A. Woods <woods@most.weird.com>
List: port-sun3
Date: 11/20/1997 02:52:03
[ On Wed, November 19, 1997 at 20:33:24 (GMT), Gordon W. Ross wrote: ]
> Subject: mapped device panic (fix)
>
> Here is the final fix (I hope).  This is what will appear in
> the release, as soon as the releng people do their thing.

I downloaded the Nov 19 sys.tar.gz from netbsd.org (since that's all the
space I've got available) and applied the pmap.88 patch Gordon posted.
(Kernel build of MOUSETRAP on a 2-spindle 3/260 w/32MB:  2h34m)

I booted the new kernel on my diskless workstation and all seemed fine.

The kernel no longer panics when XsunMono starts (and it seems to work
A-OK in fact), but I can't seem to run some useful binaries, such as
xterm.  I got a complaint about an unknown library from ld.so, and sure
enough when I ran 'ldd' there was an "extra" entry appended to the end
which has random garbage for a name.  Then I did some more checking and
it seemed quite a few X binaries had this problem, but none in /usr/bin
did.

Initially re-running ldconfig seemed to make no difference so I checked
the binaries from the server (still running the older 11/17 kernel), and
they seemed fine from there.

Then suddenly they seemed fine on the new kernel.  But not for real.
Trying to run startx again left me with an xterm.core.  Now I get a
'xterm: signal 11' from ldd and even another xterm.core!

There seems to be some serious bitrot in there somewhere.

Unfortunately this is not a "GENERIC" kernel and given the hour of day
in this timezone I won't be able to test a GENERIC until tomorrow....

Then just for fun I tried running xdm, found my read-only /usr mount
caused it to fail and so I rebooted with a read-write /usr.  This time
the ldd on xterm seemed OK so I ran startx again only to have it give me
the MMU fault trap again:  (no savecore from the workstation, sorry!)

Nov 20 02:38:30 very.weird.com /netbsd: vm_fault(0xe140000, 0x0, 0x3, 0) -> 0x2
Nov 20 02:38:30 very.weird.com /netbsd: trap type=0x8, code=0x105, v=0x34
Nov 20 02:38:30 very.weird.com /netbsd: kernel: MMU fault trap
Nov 20 02:38:31 very.weird.com /netbsd: pid = 2, pc = 0E027C5E, ps = 2410, sfc = 1, dfc = 1
Nov 20 02:38:31 very.weird.com /netbsd: Registers:
Nov 20 02:38:31 very.weird.com /netbsd:              0        1        2        3        4        5        6        7
Nov 20 02:38:31 very.weird.com /netbsd: dreg: 0000BEEF 00002204 0E62F960 00002204 0E62F900 00000000 00000003 0000000F
Nov 20 02:38:31 very.weird.com /netbsd: areg: 00000000 FFFFDEAD 0E62CB40 0E62D900 0E5A7C00 0EECDD30 0EECDC40 0DFFFFFC
Nov 20 02:38:31 very.weird.com /netbsd: 
Nov 20 02:38:32 very.weird.com /netbsd: Kernel stack (0EECDB40):
Nov 20 02:38:32 very.weird.com /netbsd: ECDB40: 0E0A9ECC 0EECDB94 00000080 0E62F960 00002204 0E62F900 00000000 00000003
Nov 20 02:38:32 very.weird.com /netbsd: ECDB60: 0000000F 0E62CB40 0E62D900 0E5A7C00 0EECDD30 00000003 00000000 00000000
Nov 20 02:38:32 very.weird.com /netbsd: ECDB80: 0EECDC40 0E0040E4 00000008 00000105 00000034 0000BEEF 00002204 0E62F960
Nov 20 02:38:32 very.weird.com /netbsd: ECDBA0: 00002204 0E62F900 00000000 00000003 0000000F 00000000 FFFFDEAD 0E62CB40
Nov 20 02:38:33 very.weird.com /netbsd: ECDBC0: 0E62D900 0E5A7C00 0EECDD30 0EECDC40 0DFFFFFC 00000000 24100E02 7C5EB008
Nov 20 02:38:33 very.weird.com /netbsd: ECDBE0: 3E280105 216A40C0 00000034 0E62CB74 00000007 206A0034 0E027C66 0E027C64
Nov 20 02:38:34 very.weird.com /netbsd: ECDC00: 0E027C62 00000007 0034FF0D 000F1486 00000007 00000000 00002400 00002400
Nov 20 02:38:34 very.weird.com /netbsd: ECDC20: 801E001F 0E62CB74 00000000 0000BEEF 00000000 0E62F960 00000000 0E62F600
Nov 20 02:38:34 very.weird.com /netbsd: ECDC40: 0EECDCBC 0E0722A0 0E62CB40 00000020 00002000 00000000 00000014 0023E000
Nov 20 02:38:34 very.weird.com /netbsd: ECDC60: 00002000 0EE44830 0E62FF00 0EECDDBC 0E0E5E02 00003473 E81E0000 00000000
Nov 20 02:38:34 very.weird.com /netbsd: ECDC80: 2034F480 00000000 00002204 0EEC0E62 F9640E62 F9003E59 77B30E62 F8800034
Nov 20 02:38:34 very.weird.com /netbsd: ECDCA0: 00000000 00180000 00000000 00000014 0023E000 00002000 0EE44830 0EECDD44
Nov 20 02:38:34 very.weird.com /netbsd: ECDCC0: 0E07DF12 0E5A4800 0E62FF00 00000007 0E57C000 0E5A5500 0EECDD38 0EECDD34
Nov 20 02:38:35 very.weird.com /netbsd: ECDCE0: 0EECDD30 0E5A5500 0E57C000 00000002 00000004 0023E000 00000009 0E5A5A84
Nov 20 02:38:35 very.weird.com /netbsd: ECDD00: 00000000 0EECDDB4 0EECDDBC 0E5A7C00 0EECDD34 0EECDD30 00000002 00000200
Nov 20 02:38:35 very.weird.com /netbsd: ECDD20: 00000000 0EECDD44 0E09C364 0E7776F0 0E62F964 0E62F900 0E62F900 0E62BB80
Nov 20 02:38:35 very.weird.com /netbsd: panic: MMU fault
Nov 20 02:38:35 very.weird.com /netbsd: syncing disks... 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 giving up
Nov 20 02:38:35 very.weird.com /netbsd: Kernel rebooting...
Nov 20 02:38:36 very.weird.com /netbsd: NetBSD 1.3_ALPHA (MOUSETRAP) #0: Thu Nov 20 01:27:27 EST 1997
Nov 20 02:38:36 very.weird.com /netbsd:     woods@sometimes:/var/usr.src/sys/arch/sun3/compile/MOUSETRAP
Nov 20 02:38:36 very.weird.com /netbsd: Model: Sun 3/60 (hostid 0x1700d16f)
Nov 20 02:38:36 very.weird.com /netbsd: fpu: mc68881
Nov 20 02:38:36 very.weird.com /netbsd: real mem = 12582912
Nov 20 02:38:36 very.weird.com /netbsd: avail mem = 10436608
Nov 20 02:38:36 very.weird.com /netbsd: using 89 buffers containing 729088 bytes of memory
Nov 20 02:38:36 very.weird.com /netbsd: mainbus0 (root)
Nov 20 02:38:37 very.weird.com /netbsd: obio0 at mainbus0
Nov 20 02:38:37 very.weird.com /netbsd: zsc0 at obio0 addr 0x0 level 6: (softpri 3)
Nov 20 02:38:37 very.weird.com /netbsd: kbd0 at zsc0 channel 0 (console)
Nov 20 02:38:38 very.weird.com /netbsd: ms0 at zsc0 channel 1
Nov 20 02:38:38 very.weird.com /netbsd: zsc1 at obio0 addr 0x20000 level 6: (softpri 3)
Nov 20 02:38:38 very.weird.com /netbsd: zstty0 at zsc1 channel 0
Nov 20 02:38:38 very.weird.com /netbsd: zstty1 at zsc1 channel 1
Nov 20 02:38:38 very.weird.com /netbsd: zsc1: enabling zs interrupts
Nov 20 02:38:39 very.weird.com /netbsd: eeprom0 at obio0 addr 0x40000
Nov 20 02:38:39 very.weird.com /netbsd: clock0 at obio0 addr 0x60000 level 5
Nov 20 02:38:39 very.weird.com /netbsd: memerr0 at obio0 addr 0x80000 level 7: (Parity memory)
Nov 20 02:38:40 very.weird.com /netbsd: intreg0 at obio0 addr 0xa0000
Nov 20 02:38:40 very.weird.com /netbsd: le0 at obio0 addr 0x120000 level 3: address 08:00:20:06:7d:89
Nov 20 02:38:40 very.weird.com /netbsd: le0: 8 receive buffers, 2 transmit buffers
Nov 20 02:38:41 very.weird.com /netbsd: si0 at obio0 addr 0x140000 level 2: options=0xf
Nov 20 02:38:41 very.weird.com /netbsd: scsibus0 at si0: 8 targets
Nov 20 02:38:41 very.weird.com /netbsd: cd0 at scsibus0 targ 6 lun 0: <SONY, CD-ROM CDU-8012, 3.1a> SCSI2 5/cdrom removable
Nov 20 02:38:41 very.weird.com /netbsd: obmem0 at mainbus0
Nov 20 02:38:41 very.weird.com /netbsd: bwtwo0 at obmem0 addr 0xff000000 (1600x1280)
Nov 20 02:38:41 very.weird.com /netbsd: enabling interrupts
Nov 20 02:38:42 very.weird.com /netbsd: boot device: le0
Nov 20 02:38:42 very.weird.com /netbsd: nfs_boot: trying RARP (and RPC/bootparam)
Nov 20 02:38:42 very.weird.com /netbsd: nfs_boot: client_addr=0xcc5cfe03
Nov 20 02:38:42 very.weird.com /netbsd: nfs_boot: server_addr=0xcc5cfe06
Nov 20 02:38:43 very.weird.com /netbsd: nfs_boot: hostname=very.weird.com
Nov 20 02:38:44 very.weird.com /netbsd: nfs_boot: timeout...
Nov 20 02:38:44 very.weird.com last message repeated 2 times
Nov 20 02:38:44 very.weird.com /netbsd: root on sometimes:/export/root/very
Nov 20 02:38:45 very.weird.com /netbsd: root file system type: nfs


More weirdnesses:

I had wanted to try enabling all the SCSI options, so the workstation
being a 3/60 I set the si flags:

	si0 at obio0 addr   0x140000 level 2 flags 0x0

However as you can see above it booted with:

Nov 20 02:38:41 very.weird.com /netbsd: si0 at obio0 addr 0x140000 level 2: options=0xf

Now unless the driver printf uses a different interpretation of
"options" than the config file uses for "flags", it would seem the
config file is ignored.  Not that this matters on this system which has
only a CD-ROM on target 6.....

AH!  I think I see the bug.  It is impossible to set the options to 0:

si_obio_attach(parent, self, args)
        struct device   *parent, *self;
        void            *args;
{
        struct si_softc *sc = (struct si_softc *) self;
        struct ncr5380_softc *ncr_sc = &sc->ncr_sc;
        struct cfdata *cf = self->dv_cfdata;
        struct confargs *ca = args;

        /* Get options from config flags if specified. */
        if (cf->cf_flags)
                sc->sc_options = cf->cf_flags;
        else
                sc->sc_options = si_obio_options;

        printf(": options=0x%x\n", sc->sc_options);

Some innocuous bit must be set in order to not get the default.  How
about 0x8 (i.e. the host adapter)?
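
To make the problem concrete, here is a minimal stand-alone sketch of the
same test, outside the kernel.  The names si_pick_options() and
SI_DEFAULT_OPTIONS are made up for illustration and are not in the driver;
the default of 0xf is taken from the boot message above, and I'm assuming
nothing about what the individual bits mean.

```c
/*
 * Hypothetical stand-in for the driver's compiled-in default
 * (si_obio_options in the excerpt above); 0xf matches the value
 * printed at boot.
 */
#define SI_DEFAULT_OPTIONS 0xf

/*
 * Same test as in si_obio_attach(): a flags word of zero is treated
 * as "not specified", so an explicit "flags 0x0" in the config file
 * cannot be told apart from no flags at all.
 */
int
si_pick_options(int cf_flags)
{
        if (cf_flags)
                return cf_flags;
        return SI_DEFAULT_OPTIONS;
}
```

With this logic si_pick_options(0x0) returns 0xf (the default), while
si_pick_options(0x8) returns 0x8, which is why any non-zero bit is enough
to override the default.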

And a minor annoyance:

The console on bwtwo doesn't get initialized with the right screen size
in the tty driver.  In theory it should pull this from the eeprom, which
I've set to the correct value of 48, but it gets set to 34 instead.
I'll try to remember to send-pr this too.

Here are the diffs for my kernel.  I don't think I've done anything
exceptionally silly here, but you never know!  ;-)

In any case I'll build and test a GENERIC too....

02:28 [512] # diff GENERIC MOUSETRAP 
13c13
< maxusers      4
---
> maxusers      64
34,35c34,35
< #options      DIAGNOSTIC      # extra kernel sanity checking
< #options      KMEMSTATS       # kernel memory statistics (vmstat -m)
---
> options       DIAGNOSTIC      # extra kernel sanity checking
> options       KMEMSTATS       # kernel memory statistics (vmstat -m)
56c56
< file-system   MFS             # memory-based filesystem
---
> #file-system  MFS             # memory-based filesystem
64c64
< options       TCP_COMPAT_42   # compatibility with 4.2BSD TCP/IP
---
> #options      TCP_COMPAT_42   # compatibility with 4.2BSD TCP/IP
111,115c111,113
< # XXX: Disable disconnect/reselect on disks for now...
< # XXX: Disable DMA interrupts for now on the obio...
< si0 at obio0 addr   0x140000 level 2 flags 0x1000f
< si0 at vmes0 addr 0xff200000 level 2 vect 0x40 flags 0xf
< si1 at vmes0 addr 0xff204000 level 2 vect 0x41 flags 0xf
---
> si0 at obio0 addr   0x140000 level 2 flags 0x0
> si0 at vmes0 addr 0xff200000 level 2 vect 0x40 flags 0x0
> si1 at vmes0 addr 0xff204000 level 2 vect 0x41 flags 0x0
118,120c116,118
< xyc0 at vmes0 addr 0xffffee40 level 2 vect 0x48
< xyc1 at vmes0 addr 0xffffee48 level 2 vect 0x49
< xy* at xyc? drive ?
---
> #xyc0 at vmes0 addr 0xffffee40 level 2 vect 0x48
> #xyc1 at vmes0 addr 0xffffee48 level 2 vect 0x49
> #xy* at xyc? drive ?
123,125c121,123
< xdc0 at vmel0 addr 0xffffee80 level 2 vect 0x44
< xdc1 at vmel0 addr 0xffffee90 level 2 vect 0x45
< xd* at xdc? drive ?
---
> #xdc0 at vmel0 addr 0xffffee80 level 2 vect 0x44
> #xdc1 at vmel0 addr 0xffffee90 level 2 vect 0x45
> #xd* at xdc? drive ?
162,165c160,163
< sebuf0 at vmes0 addr 0xff300000 level 2 vect 0x74
< sebuf1 at vmes0 addr 0xff340000 level 2 vect 0x76
< si* at sebuf?
< ie* at sebuf?
---
> #sebuf0 at vmes0 addr 0xff300000 level 2 vect 0x74
> #sebuf1 at vmes0 addr 0xff340000 level 2 vect 0x76
> #si* at sebuf?
> #ie* at sebuf?
188c186
< #pseudo-device        ipfilter                # ip filter
---
> pseudo-device ipfilter                # ip filter
190,191c188,189
< pseudo-device pty             64      # pseudo-terminals
< #pseudo-device        vnd             4       # paging to files
---
> pseudo-device pty             128     # pseudo-terminals
> pseudo-device vnd             4       # paging to files


-- 
							Greg A. Woods

+1 416 443-1734      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>