Subject: Re: Massive lossage with -current as of tonight?
To: None <current-users@netbsd.org>
From: Scott Ellis <scotte@warped.com>
List: current-users
Date: 11/04/2006 14:58:16
Eric Haszlakiewicz wrote:
> On Fri, Nov 03, 2006 at 11:25:57PM -0800, Scott Ellis wrote:
>> Well, after cvs updating and doing a complete rebuild (so using -current 
>> as of ~10pm PST Nov 3rd), I get the same behavior as before: Various 
>> programs appear to hang when booting multi-user.
>>
>> Going back to libc.so.12.147 "fixes" things (mostly), but sshd still 
>> fails, and now I see the new, even more exciting behavior that prevents 
>> logging in:
[snip]
> 	Given that you can fix your problem by reverting libc, and I
> haven't updated anything beyond the ipf binaries, we might have
> separate issues here.

Well, I'm starting to suspect some of the kauth changes here.

Booting a -current (Nov 4th, cvs updated moments ago) kernel works fine 
with the October 26th userland.

Updating to the Nov 4th userland breaks just as it did when originally 
reported (for example, named hangs showing "load: 0.95  cmd: named 795 
[piperd] 0.00u 0.00s 0% 1808k", though it can still be ^C'ed).  My gut 
tells me it is really sh that's hanging, since we're running through 
rc.local and the rc.d/ scripts at this point.  But I digress...

Reverting to the Oct 26th binaries while keeping the Nov 4th /lib and 
/usr/lib "mostly" works.  The system behaves more or less as expected, 
except for some weird permission problems.  For example, during boot I 
see:
raidctl: unable to open device file: raid0

And trying to run atactl (either the Oct 24th or the Nov 4th binary) yields:
atactl: wd0: Operation not permitted

A ktrace of this shows:
499      1 atactl   NAMI  "/dev/rwd0d"
499      1 atactl   RET   open -1 errno 1 Operation not permitted

Using the Oct 24th libc (and other libs), this works fine.
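
To take atactl itself out of the picture, the failing open could be 
reproduced directly.  What follows is only a minimal sketch in Python, 
assuming an interpreter is installed on the box; try_open is a made-up 
helper name, and the O_RDONLY default is a guess, since the flags atactl 
actually passes to open have not been checked:

```python
import errno
import os
import sys


def try_open(path, flags=os.O_RDONLY):
    """Open path and report the outcome as (errno, message).

    The default flags are an assumption; pass os.O_RDWR to test a
    read-write open instead.
    """
    try:
        fd = os.open(path, flags)
    except OSError as e:
        return e.errno, os.strerror(e.errno)
    os.close(fd)
    return 0, "ok"


if __name__ == "__main__":
    # /dev/rwd0d is the device node seen in the ktrace output.
    target = sys.argv[1] if len(sys.argv) > 1 else "/dev/rwd0d"
    code, message = try_open(target)
    name = errno.errorcode.get(code, "none")
    print("%s: errno %d (%s) %s" % (target, code, name, message))
```

Running this as root against /dev/rwd0d, once with each set of libs 
installed, should show whether the EPERM comes and goes with the 
libraries alone.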

I'm quickly running out of clues.  Can anyone suggest what additional 
debugging to collect, or what steps to take to root-cause this?  My 
build machine is only an Athlon64 3400+, so rebuilding userland for 
every day between Oct 24th and Nov 4th seems prohibitively slow.
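
One way to cut the number of rebuilds is to bisect the date range 
instead of walking it day by day.  The sketch below is only an 
illustration: first_bad_date and the is_bad callback are hypothetical 
names, and it assumes that Oct 24th is known good, Nov 4th is known bad, 
and that once the breakage appears it stays in every later date.  In 
practice is_bad would check out and build the tree (or just the 
libraries) for the given date and report whether the problem shows up.

```python
from datetime import date, timedelta


def first_bad_date(good, bad, is_bad):
    """Bisect between a known-good and a known-bad date.

    Returns (first_bad, tested): the earliest date for which is_bad()
    returned True, and the list of dates that had to be tested.
    """
    tested = []
    while (bad - good).days > 1:
        # Pick the midpoint of the remaining window.
        mid = good + timedelta(days=(bad - good).days // 2)
        tested.append(mid)
        if is_bad(mid):
            bad = mid      # breakage is at mid or earlier
        else:
            good = mid     # breakage is after mid
    return bad, tested


if __name__ == "__main__":
    # Stand-in for "build that date's sources and test them"; the
    # Oct 30th cutoff here is invented purely for the demonstration.
    def pretend_is_bad(day):
        return day >= date(2006, 10, 30)

    culprit, builds = first_bad_date(date(2006, 10, 24),
                                     date(2006, 11, 4),
                                     pretend_is_bad)
    print("first bad date:", culprit)
    print("builds needed:", len(builds))
```

For the 11-day window here, this needs at most four builds rather than 
one per day.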

	ScottE