Current-Users archive
Re: Filesystem tests crashing host
On Sat, Apr 16, 2011 at 11:54:26AM -0700, Paul Goyette wrote:
> (Resending, this time with subject line and cc's)
>
> >> I still say revert rmind's changes of 2011.04.11.22.31.43, because
> >> that's when the failures started. My logs show six test runs between
> >> christos' change to kern_descrip.c (at 2011.04.10.15.45.33) and rmind's
> >> changes, and none of those test runs panicked; after rmind's changes,
> >> every single test run has panicked.
> >
> >Problem is not diagnosed. It cannot be reproduced on real hardware,
> >and I do not see how f_ops can become invalid when using semaphore.
> >Even if we assume that it can - the semaphore code should actually be
> >*used* in the first place. However, it seems that neither failing
> >ATF tests, nor ATF itself are using semaphores. Can somebody prove
> >me wrong on this?
> >
> >Perhaps a simple printf("f_type = %d\n", fp->f_type) would hint what
> >type of descriptor is actually failing. Also, a wild guess - can one
> >reproduce the problem with the following changes reverted:
>
> I'm working on building kernels with each commit backed out - it will take
> a while.
>
> However, I have been able to dump the file structure:
>
> f_offset 0000 0000 0000 0000
> f_cred ffff 8000 098e db40
> f_ops ffff ffff 80ef fac0
> f_data ffff 8000 09de 28c0
> f_list.next ffff 8000 0ab7 f4c0
> f_list.prev ffff ffff 80cb 2728
> f_lock 0000 0000 0000 0000
> f_flag 0000 0003
> f_marker 0000 0000
> f_type 0000 0008
> f_advice 0000 0000
> f_count 0000 0000
> f_msgcount 0000 0000
> f_unpcount 0000 0000
> f_unplist.next 0000 0000 0000 0000
>
> Note that both the f_ops and f_list.prev pointers seem to be corrupt, and
> that the type of this structure is semaphore = 8.

I added some instrumentation too. With a printf in each of ksem_sysinit(),
ksem_sysfini() and do_ksem_init()/do_ksem_open() I get:
fs/nfs/t_mountd (91/400): 1 test cases
mountdhup:
ksem_sysinit ops 0xcb0d4bc0
fp 0xcaf91780 ops 0xcb0d4bc0
ksem_sysfini ops 0xcb0d4bc0
uvm_fault(0xc0b0f380, 0xcb0d4000, 1) -> 0xe
Backtrace:
closef(caf91780, ...) <== the file with the ksem above!
fd_free()
exit1()
sigexit()
postsig()
lwp_userret()
syscall()
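The fault address is the page containing the ops vector of the now unloaded
ksem module. The file still points at that vector (ops 0xcb0d4bc0) after
ksem_sysfini() has run, so closef() dereferences f_ops in an unmapped page
when the process exits -- page fault -- boom.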
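
To make the lifetime problem concrete, here is a minimal userland sketch of
one possible guard: the module counts the files that still point at its ops
vector and refuses to unload while that count is non-zero. All names here
(struct module, file_open(), module_unload() and so on) are hypothetical
stand-ins, not the NetBSD kernel structures or the ksem code, and this is
not meant as the fix that should go into the tree.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the NetBSD definitions. */
struct fileops {
	int (*fo_close)(void *data);
};

struct module {
	const struct fileops *ops;	/* ops vector that lives in module memory */
	unsigned int open_files;	/* files whose f_ops still point at ops */
	int loaded;
};

struct file {
	const struct fileops *f_ops;
	void *f_data;
	struct module *f_owner;		/* module that provided f_ops */
};

static int
sem_close(void *data)
{
	(void)data;
	return 0;
}

static const struct fileops sem_ops = { .fo_close = sem_close };

void
module_load(struct module *m)
{
	m->ops = &sem_ops;
	m->open_files = 0;
	m->loaded = 1;
}

/* Hand out the ops vector and remember that a file now depends on it. */
int
file_open(struct module *m, struct file *fp)
{
	if (!m->loaded)
		return ENXIO;
	fp->f_ops = m->ops;
	fp->f_data = NULL;
	fp->f_owner = m;
	m->open_files++;
	return 0;
}

/* Call through f_ops, then drop the reference on the owning module. */
int
file_close(struct file *fp)
{
	int error;

	assert(fp->f_ops != NULL);
	error = fp->f_ops->fo_close(fp->f_data);
	fp->f_owner->open_files--;
	fp->f_ops = NULL;
	return error;
}

/* Refuse to go away while any file could still call through ops. */
int
module_unload(struct module *m)
{
	if (m->open_files != 0)
		return EBUSY;
	m->ops = NULL;
	m->loaded = 0;
	return 0;
}
```

With this guard the sequence from the log above (load, open, unload, close)
would fail at the unload step with EBUSY instead of leaving the file with a
pointer into unmapped memory; the unload succeeds only after the last file
has been closed.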
--
Juergen Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig
(Germany)