tech-kern archive


Re: USB printing panic



Hello Eduardo,

All of the below should be taken with the caveat that I'm an amateur at best and I've never looked at the NetBSD kernel (or that of any other OS, for that matter) before. So it's likely to be completely wrong, or worse.

On 2/9/2011 4:08 PM, Eduardo Horvath wrote:
> On Wed, 9 Feb 2011, Bill Green wrote:

>> cpu0: data fault: pc=14becc8 addr=0
>> kernel trap 30: data access exception
>> Stopped in pid 0.5 (system) at  netbsd:usbd_setup_xfer+0x8:     ldub    [%o0 + 0x70], %g3
>
> This one is definitely a NULL pointer dereference in the kernel, probably
> in usbd_setup_xfer.

In usbdi.c there are several functions (usbd_setup_xfer, usbd_transfer, others) which take pointers to structures and don't check whether they are null before using them.
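
To illustrate what a check on the callee side could look like, here is a minimal user-space sketch. It is not the usbdi.c code: the struct, the status codes and the function name are all made up for the example, and the real functions may not return a status at all.

```c
#include <stddef.h>

/* Hypothetical status codes, used only in this sketch. */
enum sketch_status { SKETCH_OK = 0, SKETCH_INVAL = 1 };

/* Hypothetical stand-in for a transfer object; NOT the real usbd_xfer layout. */
struct sketch_transfer {
    unsigned char flags;
    size_t length;
};

/*
 * Validates the pointer before touching any field, and reports a bad
 * argument to the caller instead of faulting on a NULL dereference.
 */
static enum sketch_status
sketch_checked_setup(struct sketch_transfer *xfer, size_t length)
{
    if (xfer == NULL)
        return SKETCH_INVAL;
    xfer->flags = 0;
    xfer->length = length;
    return SKETCH_OK;
}
```

Whether such a check belongs in the callee or in each caller is a design question; this sketch only shows the shape of the callee-side option.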

In ulpt.c, ulpt_tick calls usbd_setup_xfer and usbd_transfer, passing them a usbd_xfer_handle taken from the struct ulpt_softc that it receives a pointer to as its argument.

The following appears to be happening in my case: after rastertoqpdl crashes, the USB transfer is never finished (from the perspective of the printer, which will eventually print a sheet with a timeout error). ulptclose is called, which sets sc->sc_out_xfer (the handle that eventually gets passed to usbd_setup_xfer and friends) to NULL, but leaves sc itself (the struct ulpt_softc that ulpt_tick uses) in place.

ulpt_tick sometimes (I haven't found where) gets called after ulptclose, and only checks whether sc is null, and NOT sc->sc_out_xfer.

I've added a test to ulpt_tick to check if sc->sc_out_xfer is null, and haven't been able to panic the system since. But I'm not sure whether anything else is making calls to the usbd_* functions with similar possible problems, or what the best way to fix this would be. Perhaps one could set the struct ulpt_softc itself to NULL in ulptclose, if other functions in ulpt.c follow the same assumptions? But, as I mentioned, there seem to be a lot of functions in usbdi.c that assume they are getting usable pointers, and these functions get used in a lot of other drivers besides the ulpt code.
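
The shape of that test, and the close-then-tick sequence it guards against, can be modelled in user space roughly as follows. This is only a sketch under assumptions: the sketch_* types and functions are hypothetical stand-ins, not the ulpt.c or usbdi.c code, and only the field name sc_out_xfer is taken from the driver.

```c
#include <stddef.h>

/* Hypothetical stand-in for the transfer handle; NOT the real usbd_xfer. */
struct sketch_xfer {
    int setup_count;    /* how many times a transfer was set up */
};

/* Hypothetical stand-in for struct ulpt_softc; only the field discussed above. */
struct sketch_softc {
    struct sketch_xfer *sc_out_xfer;
};

/* Models a callee that dereferences its argument without a NULL check. */
static void
sketch_setup_xfer(struct sketch_xfer *xfer)
{
    xfer->setup_count++;
}

/* Models the close path: the handle is cleared, the softc stays valid. */
static void
sketch_close(struct sketch_softc *sc)
{
    sc->sc_out_xfer = NULL;
}

/*
 * Models the tick callback with the extra test.
 * Returns 1 if a transfer was set up, 0 if the tick was skipped.
 */
static int
sketch_tick(struct sketch_softc *sc)
{
    if (sc == NULL)
        return 0;
    if (sc->sc_out_xfer == NULL)    /* the added test */
        return 0;
    sketch_setup_xfer(sc->sc_out_xfer);
    return 1;
}
```

Without the second test, a tick arriving after sketch_close would hand NULL to sketch_setup_xfer, which is the pattern I believe is behind the panic.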


>> panic: kernel fault
>> Stopped in pid 0.5 (system) at  netbsd:cpu_Debugger+0x4:        nop
>> db>  bt
>> data_access_fault(b5cbaa0, 30, 1476388, 0, 70, 400) at
>
> Definitely a kernel problem but don't know the specifics.  You need to
> dump the trapframe.

I think this is the same bug detailed above. I'm not exactly sure what you mean by needing to dump the trapframe unless it is what I've provided below.

#0  dumpsys () at ../../../../arch/sparc64/sparc64/machdep.c:755
#1  0x00000000014abeb8 in cpu_reboot (howto=256, user_boot_string=0x0)
    at ../../../../arch/sparc64/sparc64/machdep.c:623
#2 0x00000000010c7a28 in db_sync_cmd (addr=190633464, have_addr=false, count=-1, modif=0xb5cd4d8 "")
    at ../../../../ddb/db_command.c:1304
#3 0x00000000010c821c in db_command (last_cmdp=0x180f678) at ../../../../ddb/db_command.c:926
#4 0x00000000010c8514 in db_command_loop () at ../../../../ddb/db_command.c:583
#5 0x00000000010cbc90 in db_trap (type=<value optimized out>, code=0) at ../../../../ddb/db_trap.c:101
#6 0x00000000014bbc6c in kdb_trap (type=48, tf=0xb5cd9e0) at ../../../../arch/sparc64/sparc64/db_interface.c:498
#7 0x00000000014b8604 in data_access_fault (tf=0xb5cd9e0, type=48, pc=21757420, addr=0, sfva=0, sfsr=8390665)
    at ../../../../arch/sparc64/sparc64/trap.c:1200
#8  0x0000000001008b24 in Ldatafault_internal ()
#9  0x0000000001008b24 in Ldatafault_internal ()
Previous frame identical to this frame (corrupt stack?)

(gdb) bt full
[...]
#7 0x00000000014b8604 in data_access_fault (tf=0xb5cd9e0, type=48, pc=21757420, addr=0, sfva=0, sfsr=8390665)
    at ../../../../arch/sparc64/sparc64/trap.c:1200
        l = (struct lwp *) 0xb30ef80
        p = (struct proc *) 0x1823c98
        vm = (struct vmspace *) 0xe0018000
        va = 0
        rv = 0
        access_type = 1
        onfault = 0
        sticks = 128
        ksi = {ksi_flags = 1, ksi_list = {cqe_next = 0xb5cd131, cqe_prev = 0x108360c}, ksi_info = {_signo = 0,
            _code = 16352, _errno = 510, _pad = 33555456, _reason = {_rt = {_pid = 0, _uid = 16360, _value = {sival_int = 0, sival_ptr = 0x0}}, _child = {_pid = 0, _uid = 16360, _status = 0, _utime = 0, _stime = 0}, _fault = { _addr = 0x3fe8, _trap = 0}, _poll = {_band = 16360, _fd = 0}}}, ksi_lid = 0}
        lastdouble = 0
[...]

(gdb) frame 7
#7 0x00000000014b8604 in data_access_fault (tf=0xb5cd9e0, type=48, pc=21757420, addr=0, sfva=0, sfsr=8390665)
    at ../../../../arch/sparc64/sparc64/trap.c:1200
1200                                    DEBUGGER(type, tf);
(gdb) print *tf
$1 = {tf_tstate = 17666409988, tf_pc = 21757420, tf_npc = 21757424, tf_fault = 0, tf_kstack = 0, tf_y = 0, tf_tt = 48, tf_pil = 0 '\0', tf_oldpil = 0 '\0', tf_global = {0, 4294967296, 29442048, 0, 1, 29442048, 2504691800080896, 387520}, tf_out = {25980824, 5, 0, 25980824, 190635096, 0, 190632721, 20291036}, tf_local = {0, 3758194688, -1, 0, 3758194688, -1, 10, 0}, tf_in = {6, 0, 191390672, 208871424, 8192, 0, 190632913, 21837236}}
(gdb)
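
gdb prints the trapframe fields above in decimal, while the ddb output quoted earlier uses hex, so comparing the two takes a conversion. Inside gdb, print/x does that directly. As a standalone illustration, a small helper like the one below does the same; the name sketch_to_hex is made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Formats a value as 0x-prefixed lowercase hex into buf and returns buf.
 * unsigned long long is used so 64-bit fields such as tf_tstate fit.
 */
static char *
sketch_to_hex(unsigned long long value, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "0x%llx", value);
    return buf;
}
```

With the values from the dump, tf_tt = 48 comes out as 0x30, which appears to correspond to the "kernel trap 30" in the panic message, and tf_pc = 21757420 and tf_npc = 21757424 come out as 0x14bfdec and 0x14bfdf0.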






