> I still say revert rmind's changes of 2011.04.11.22.31.43, because
> that's when the failures started. My logs show six test runs
> between christos' change to kern_descrip.c (at 2011.04.10.15.45.33)
> and rmind's changes, and none of those test runs panicked; after
> rmind's changes, every single test run has panicked.

Problem is not diagnosed. It cannot be reproduced on real hardware, and I do not see how f_ops can become invalid when a semaphore is used. Even if we assume that it can, the semaphore code would have to actually be *used* in the first place. However, it seems that neither the failing ATF tests nor ATF itself use semaphores. Can somebody prove me wrong on this?

Perhaps a simple printf("f_type = %d\n", fp->f_type) would hint at what type of descriptor is actually failing.

Also, a wild guess - can one reproduce the problem with the following changes reverted:
I'm working on building kernels with each commit backed out - it will take a while.
However, I have been able to dump the file structure:

    f_offset        0000 0000 0000 0000
    f_cred          ffff 8000 098e db40
    f_ops           ffff ffff 80ef fac0
    f_data          ffff 8000 09de 28c0
    f_list.next     ffff 8000 0ab7 f4c0
    f_list.prev     ffff ffff 80cb 2728
    f_lock          0000 0000 0000 0000
    f_flag          0000 0003
    f_marker        0000 0000
    f_type          0000 0008
    f_advice        0000 0000
    f_count         0000 0000
    f_msgcount      0000 0000
    f_unpcount      0000 0000
    f_unplist.next  0000 0000 0000 0000

Note that both the f_ops and f_list.prev pointers seem to be corrupt, and that the type of this structure is semaphore = 8.
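For anyone reading along, the f_type value in the dump can be turned into a name with a small lookup like the one below. This is only a sketch: the helper names dtype_name and format_f_type are hypothetical, and the table of DTYPE_* values is an assumption recalled from NetBSD's <sys/file.h> of that period. Only "8 = semaphore" is confirmed by the dump itself, so check the header in your own tree before relying on the other entries.

```c
#include <stdio.h>
#include <string.h>

/*
 * ASSUMPTION: descriptor type numbers as recalled from NetBSD's
 * <sys/file.h>; only 8 == semaphore is confirmed by the dump above.
 */
static const char *const dtype_names[] = {
    [1] = "DTYPE_VNODE",
    [2] = "DTYPE_SOCKET",
    [3] = "DTYPE_PIPE",
    [4] = "DTYPE_KQUEUE",
    [5] = "DTYPE_MISC",
    [6] = "DTYPE_CRYPTO",
    [7] = "DTYPE_MQUEUE",
    [8] = "DTYPE_SEM",
};

/* Hypothetical helper: map an f_type value to a name, or "unknown". */
const char *
dtype_name(int f_type)
{
    size_t n = sizeof(dtype_names) / sizeof(dtype_names[0]);

    if (f_type < 0 || (size_t)f_type >= n || dtype_names[f_type] == NULL)
        return "unknown";
    return dtype_names[f_type];
}

/* Hypothetical helper: format the same line the suggested printf would emit, plus the name. */
void
format_f_type(char *buf, size_t len, int f_type)
{
    snprintf(buf, len, "f_type = %d (%s)", f_type, dtype_name(f_type));
}
```

In a debugging kernel, the equivalent would be to extend the suggested printf so that it prints fp->f_type together with the looked-up name at the point where the bad f_ops is detected.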
-------------------------------------------------------------------------
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com    |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |                          | pgoyette at netbsd.org  |
-------------------------------------------------------------------------