Current-Users archive


Re: Asking for path to modules (was: How do I keep testing current-amd64 without so much trouble?)



On Fri, 26 Dec 2008, Christos Zoulas wrote:

On Dec 26,  7:09am, paul%whooppee.com@localhost (Paul Goyette) wrote:
-- Subject: Re: Asking for path to modules (was: How do I keep testing curren

| On Thu, 25 Dec 2008, Christos Zoulas wrote:
|
| > Try a lockdebug kernel. I get spinouts when I do that.
|
| The crash happens shortly after (within 1 to 5 seconds) the prompt for
| root device is printed, and does not matter if I've typed a character or
| not.  I'm unable to get a crash dump, since both "sync" and "reboot
| 0x100" hang after printing the message "syncing disks...".
|
| Information from the spinout, manually annotated with kernel addresses
| decoded using GDB.
|
| lock address: ffffffff80594c40     type:                   spin
|                == kernel_lock
| initialized:  ffffffff801a6097
|                == main + 0x27
| shared holds:                0     exclusive:                 1
| shares wanted:               0     exclusive:                 1
| current cpu:                 0     last held:                 3
| current lwp:  ffff80004b2aa000     last held:  ffffffff80552d60
|                                                 == lwp0
| last locked:  ffffffff801f9d0d     unlocked:   ffffffff801ae68b
|                == sleepq_block + 0x18d      == intr_biglock_wrapper + 0x2b
| curcpu holds:                0
|
| So, it seems that something is holding onto the kernel's biglock and
| sleeping, while the init process wants the lock and isn't willing to
| wait for it?
|
| Question:  Does cngetsn() take out the kernel's biglock?  It does not
| appear to do so at first glance.

I don't know, but I think that the simplest solution for LOCKDEBUG
kernels is to avoid the issue by not allowing spinouts while we
are sleeping for input, using a global variable. It is ugly, but I
don't have a better idea right now.
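
For illustration only, the global-variable idea above might look roughly
like the following user-space sketch. Every name in it
(spinout_suppressed, SPINOUT_LIMIT, console_wait_begin(),
console_wait_end(), spinout_should_panic()) is hypothetical and is not
taken from the NetBSD sources; the real LOCKDEBUG spinout check may be
structured quite differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bound on failed acquire attempts before a spinout is reported. */
#define SPINOUT_LIMIT 1000u

/* Hypothetical global: true while the console is blocked waiting for input. */
static volatile bool spinout_suppressed = false;

/* Mark the start of a console wait; spinout reports are suppressed from here. */
static void console_wait_begin(void)
{
    spinout_suppressed = true;
}

/* Mark the end of a console wait; spinout reports are enabled again. */
static void console_wait_end(void)
{
    assert(spinout_suppressed);   /* begin/end calls must be paired */
    spinout_suppressed = false;
}

/*
 * Decide whether a spinning lock acquire should be reported as a spinout.
 * 'spins' is the number of failed attempts so far.
 */
static bool spinout_should_panic(unsigned int spins)
{
    if (spinout_suppressed)
        return false;             /* waiting for input: never report */
    return spins >= SPINOUT_LIMIT;
}
```

In this sketch the input routine would bracket its wait with
console_wait_begin() and console_wait_end(), and the spin loop would
consult spinout_should_panic() instead of comparing against the limit
directly. As the reply below points out, this only hides the report; it
does not explain who is holding the lock.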

That would take care of the LOCKDEBUG kernel panic, but wouldn't deal with the real problem, at least not _my_ real problem:

        something (lastheld == lwp0 ?) appears to be holding onto
        kernel_lock while cngetsn() is trying to wait for input and
        wants the lock.

What I really don't understand is why the addition of a single printf() and a single cngetsn() call very early in life (module_init() is one of the earliest things to be called by main()) leaves things in such a confused state by the time we get around to calling vfs_mountroot().

I'm going to see if some extra printf()s can help me make some forward progress.


-------------------------------------------------------------------------
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:      |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
|                  |                          | pgoyette at netbsd.org  |
-------------------------------------------------------------------------

