tech-kern: Re: pool problems, TAILQ, and more...

Subject: Re: pool problems, TAILQ, and more...
To: Chuck Silvers <chuq@chuq.com>
From: Brian C. Grayson <bgrayson@orac.ece.utexas.edu>
List: tech-kern
Date: 04/02/2000 22:49:45

On Thu, Mar 30, 2000 at 10:13:27PM -0800, Chuck Silvers wrote:
> ok, try the attached patch and let me know if you find anything.

  With Chuck's fault-if-touching-freed-memory patches, I now get
panics at two different points:

uvm_fault(0xc0639d80, 0xc090c000, 0, 1) -> 5
(gdb) #0  0x0 in ?? ()
#1  0xc07d9000 in ?? ()
#2  0xc03f8809 in cpu_reboot ()
#3  0xc01c325b in db_sync_cmd ()
#4  0xc01c2eab in db_command ()
#5  0xc01c303f in db_command_loop ()
#6  0xc01c6432 in db_trap ()
#7  0xc03f5240 in kdb_trap ()
#8  0xc0404080 in trap ()
#9  0xc0100eed in calltrap ()
#10 0xc041c615 in wddone (v=0xc0892800) at ../../../../dev/ata/wd.c:634
#11 0xc0420a88 in wdc_ata_bio_done ()
#12 0xc0420889 in wdc_ata_bio_intr ()
#13 0xc019b871 in wdcintr (arg=0xc0882d20) at ../../../../dev/ic/wdc.c:675
#14 0xc0450851 in pciide_compat_intr ()
#15 0xc0101cd4 in Xintr15 ()

uvm_fault(0xc0639c40, 0xc090b000, 0, 3) -> 5
(gdb) #0  0x0 in ?? ()
#1  0xc07d9000 in ?? ()
#2  0xc03f877d in cpu_reboot ()
#3  0xc01c325b in db_sync_cmd ()
#4  0xc01c2eab in db_command ()
#5  0xc01c303f in db_command_loop ()
#6  0xc01c6432 in db_trap ()
#7  0xc03f51b4 in kdb_trap ()
#8  0xc0403ff4 in trap ()
#9  0xc0100eed in calltrap ()
#10 0xc03e58a2 in sw_reg_strategy (sdp=0xc08acc00, bp=0xc08b3000, bn=0)
    at ../../../../uvm/uvm_swap.c:1371
#11 0xc03e4bf9 in swstrategy (bp=0xc08b3000) at ../../../../uvm/uvm_swap.c:1196
#12 0xc0238e57 in spec_strategy ()
#13 0xc03e96cd in VOP_STRATEGY (bp=0xc08b3000)
    at ../../../../sys/vnode_if.h:1124
#14 0xc03e83e1 in uvm_swap_io (pps=0xc4664da0, startslot=1, npages=16, flags=4)
    at ../../../../uvm/uvm_swap.c:1863
#15 0xc03e7934 in uvm_swap_put (swslot=1, ppsp=0xc4664da0, npages=16, flags=0)
    at ../../../../uvm/uvm_swap.c:1704
#16 0xc03dbe96 in uvm_pager_put (uobj=0x0, pg=0xc07e2408, ppsp_ptr=0xc4664ea8,
    npages=0xc4664ea4, flags=144, start=1, stop=0)
    at ../../../../uvm/uvm_pager.c:506
#17 0xc03de6a8 in uvmpd_scan_inactive (pglst=0xc066dc18)
    at ../../../../uvm/uvm_pdaemon.c:737
#18 0xc03def63 in uvmpd_scan () at ../../../../uvm/uvm_pdaemon.c:998
#19 0xc03ddb79 in uvm_pageout () at ../../../../uvm/uvm_pdaemon.c:295
#20 0xc01e1df8 in start_pagedaemon ()
#21 0xc010032b in proc_trampoline ()
#22 0xe000ffe7 in ?? ()
#23 0xfc064cf in ?? ()

  Here is what it looks like to my untrained eyes, with the
assistance of a zillion printf()s:

  It appears that we enter sw_reg_strategy() and start a
multi-resid operation.  We do a splbio() and start the work for
the first resid.  Then we allocate for the next resid.  Before
that allocation finishes, the block I/O for the first resid
completes.  When we return from the interrupt handler, accesses
to the _latest_ resid cause a fault.

  One thing I don't understand:  why do we call splbio() inside
the loop?  If there are multiple resids, we call splbio()
multiple times, so the last pass leaves s holding the wrong
level, and the splx(s) outside the loop does not restore the
right level.
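
  To make the splbio() concern concrete, here is a minimal
userspace sketch.  It is not the code from uvm_swap.c.  The names
model_splbio(), model_splx(), unbalanced_loop() and
balanced_loop() and the two level values are hypothetical, and it
assumes there is no matching splx() inside the loop, which is what
the question supposes.

```c
#include <assert.h>

/* Toy userspace model of spl levels; NOT the kernel implementation. */
enum { IPL_NONE = 0, IPL_BIO = 1 };     /* hypothetical level values */

static int cur_level = IPL_NONE;

/* Raise to IPL_BIO and return the previous level, like splbio(). */
static int model_splbio(void)
{
    int old = cur_level;

    if (cur_level < IPL_BIO)
        cur_level = IPL_BIO;
    return old;
}

/* Restore a previously saved level, like splx(). */
static void model_splx(int s)
{
    cur_level = s;
}

/* Pattern from the question: raise on every pass, restore once at the end. */
int unbalanced_loop(int nchunks)
{
    int s = IPL_NONE;

    assert(nchunks >= 1);
    cur_level = IPL_NONE;
    for (int i = 0; i < nchunks; i++)
        s = model_splbio();     /* second pass overwrites s with IPL_BIO */
    model_splx(s);              /* "restores" whatever the last pass saved */
    return cur_level;
}

/* Balanced variant: each raise is paired with a restore in the same pass. */
int balanced_loop(int nchunks)
{
    assert(nchunks >= 1);
    cur_level = IPL_NONE;
    for (int i = 0; i < nchunks; i++) {
        int s = model_splbio();
        /* ... start the I/O for this chunk ... */
        model_splx(s);
    }
    return cur_level;
}
```

  In this model unbalanced_loop(1) ends back at IPL_NONE, but
unbalanced_loop(2) ends at IPL_BIO, because the second splbio()
saves the already-raised level into s.  balanced_loop() ends at
IPL_NONE for any number of chunks.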

  Any more ideas?  :)

  Brian Grayson