Subject: NetBSD macppc 3.0 - large bus_dmamem_alloc() , eventual panic
To: None <port-macppc@netbsd.org>
From: Guy Erb <guyerb@mac.com>
List: port-macppc
Date: 06/07/2006 16:37:40
Hi

I am developing a driver for a CardBus network device to be used for
diagnostic purposes. The driver and associated user-land application
work correctly on a NetBSD 3.0 i386 installation, but I am having issues
on powerpc (15" PowerBook G4 667MHz with DVI, 1 GB RAM).

During attach(), the driver allocates a large 2MB buffer via
bus_dmamem_alloc(). The user-land application maps this buffer with
mmap() and can then set up tx/rx packets and descriptor rings for
testing.
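
For reference, the user-land side can be sketched roughly as below. This
is a minimal sketch, not the actual application: the helper names
diag_map_buffer() and diag_unmap_buffer(), the DIAG_BUF_LEN macro and
the device path in the usage note are hypothetical, and it assumes the
driver exports the whole 2MB buffer at offset 0 of its device node.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* 2MB, matching the buffer size described above (assumed constant). */
#define DIAG_BUF_LEN (2u * 1024u * 1024u)

/*
 * Map `len` bytes of the object at `path` read/write and shared, so
 * that stores made by the application land in the underlying object.
 * Returns the mapping, or NULL on failure.
 */
void *
diag_map_buffer(const char *path, size_t len)
{
	void *p;
	int fd;

	fd = open(path, O_RDWR);
	if (fd == -1)
		return NULL;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping remains valid after close() */

	return (p == MAP_FAILED) ? NULL : p;
}

/* Undo diag_map_buffer(). Returns 0 on success, -1 on failure. */
int
diag_unmap_buffer(void *p, size_t len)
{
	return munmap(p, len);
}
```

A caller would do something like
diag_map_buffer("/dev/diag0", DIAG_BUF_LEN), where /dev/diag0 is a
made-up node name, and then lay out descriptors and packets inside the
returned region.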

On macppc, the attach() routine succeeds, but the system usually
(though not always) panics soon afterwards as booting continues.

If the machine doesn't panic during boot (perhaps one in ten boots),
then it will panic later at seemingly random points while the
user-land application is busy banging away at this memory buffer.

The allocation of this large buffer follows the pattern I see in other
platform-independent drivers, i.e.,

    bus_dmamem_alloc()
    bus_dmamem_map()
    bus_dmamap_create()
    bus_dmamap_load()

I believe I have narrowed it down: the successful bus_dmamem_alloc()
of the 2MB buffer appears to be sufficient, on its own, to destabilize
the system.

        if ((error = bus_dmamem_alloc(sc->sc_dmat, sc->sc_dlen, PAGE_SIZE,
                             0, &sc->sc_dseg, 1, &sc->sc_dnseg, 0)) != 0) {
                printf("%s: unable to allocate control data, error = %d",
                       sc->sc_dev->dv_xname, error);
                goto fail0;
        }
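
To give a sense of scale for this call, here is a small sketch of the
arithmetic. It assumes sc_dlen is the 2MB mentioned above and that
PAGE_SIZE is 4096 bytes on this machine; both values and the helper
name pages_needed() are assumptions made for illustration. Because
nsegs is 1, the request is for a single physically contiguous segment,
which under these assumptions means 512 contiguous pages.

```c
#include <stddef.h>

/* Assumed page size; the driver itself uses the kernel's PAGE_SIZE. */
#define ASSUMED_PAGE_SIZE 4096u

/* Number of whole pages needed to hold `len` bytes, rounding up. */
size_t
pages_needed(size_t len, size_t page_size)
{
	return (len + page_size - 1) / page_size;
}
```

With these assumptions, pages_needed(2 * 1024 * 1024, ASSUMED_PAGE_SIZE)
evaluates to 512.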

Whether the panic happens at boot or at runtime, the backtrace
generally looks something like the following:

panic: trap
Stopped at netbsd:cpu_Debugger+0x10:  lwz      r0, r1, 0x14
db> t
0x0065dcd0:     at  panic+0x19c
0x0065dd60:     at  trap+0xe8
0x0065dde0:     kernel MCHK trap by splraise+0xc: srr1=0x149030
                r1=0x65dea0 cr=0x20009032 xer=0 ctr=0x3c02f4
0x0065dea0:     at ADBDevTable+0xffa1d4b4
0x0065deb0:     at callout_schedule+0x68
0x0065dee0:     at pffasttimeo+0x9c
0x0065df00:     at softclock+0x248
0x0065df40:     at hardclock+0x314
0x0065df70:     at decr_intr+0xfc
0x0065dfa0:     at trapstart+0xbd8
0xd6457e00:     at Idle+0x24
0xd6457e10:     at mi_switch+0x194
0xd6457e40:     at ltsleep+0x3fc
0xd6457e80:     at sys_nanosleep+0x148
0xd6457ed0:     at syscall_plain+0xe0
0xd6457f40:     user SC trap #240 by 0xefd7ddc: srr1=0xf032
                r1=0xffffcfc0 cr=0x24000028 xer=0x20000000 ctr=0xefdbd5bc


Thanks for your consideration.


regards
Guy