Current-Users archive


Re: 5.0beta won't boot on a Dell Dimension 8400



On Tue, 2 Dec 2008 11:07:25 +0000
Andrew Doran <ad%netbsd.org@localhost> wrote:

> On Mon, Dec 01, 2008 at 11:01:45PM -0500, Steven M. Bellovin wrote:
> 
> > On Sun, 30 Nov 2008 09:50:04 +0000
> > Andrew Doran <ad%netbsd.org@localhost> wrote:
> > 
> > > On Sat, Nov 29, 2008 at 03:59:06PM -0500, Steven M. Bellovin
> > > wrote:
> > > 
> > > > NStopped in pid 0.2 (system) at
> > > > netbsd:sse2_idlezero_page+0x18: jnz
> > > > netbsd:sse2_idlezero_page+0x46
> > > > 
> > > > Is there some kernel option or patch I could try to make the
> > > > kernel go to the backtrace function immediately?  
> > > 
> > > Look for this line in arch/i386/i386/trap.c:
> > > 
> > >   printf ("NMI ... going to debugger\n");
> > > 
> > > Please change it to:
> > > 
> > >   printf ("NMI ... going to debugger cr4=%lx\n",
> > > (long)rcr4());
> > > 
> > > It may be that SSE is not enabled for some reason.
> > > 
> > cr4=690
> > 
> > Btw -- both machines have the same BIOS version, A03 from 2004.
> > A09 is available; it's about two years newer.  I wonder if
> > installing it would make life better or worse...
> 
> SSE is enabled. If it were my machine, the next thing I would try is
> updating the BIOS
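
For reference, the cr4=690 value quoted above (printed with %lx, so
hexadecimal) can be decoded by hand: 0x690 = 0x400 + 0x200 + 0x80 + 0x10,
i.e. OSXMMEXCPT, OSFXSR, PGE and PSE.  OSFXSR is the bit that lets the OS
use SSE, which matches the "SSE is enabled" reading.  A minimal sketch in
C, assuming the bit positions documented in the Intel architecture
manuals; the helper names cr4_sse_enabled() and cr4_describe() are
hypothetical and are not taken from the NetBSD kernel:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* CR4 bit names and positions (bits 0..10), as documented by Intel. */
struct cr4_bit {
	uint32_t mask;
	const char *name;
};

static const struct cr4_bit cr4_bits[] = {
	{ 1u << 0,  "VME" },
	{ 1u << 1,  "PVI" },
	{ 1u << 2,  "TSD" },
	{ 1u << 3,  "DE" },
	{ 1u << 4,  "PSE" },
	{ 1u << 5,  "PAE" },
	{ 1u << 6,  "MCE" },
	{ 1u << 7,  "PGE" },
	{ 1u << 8,  "PCE" },
	{ 1u << 9,  "OSFXSR" },     /* OS supports FXSAVE/FXRSTOR; enables SSE */
	{ 1u << 10, "OSXMMEXCPT" }, /* OS handles unmasked SIMD FP exceptions */
};

/* Hypothetical helper: nonzero if the OSFXSR bit (bit 9) is set. */
int
cr4_sse_enabled(uint32_t cr4)
{
	return (cr4 & (1u << 9)) != 0;
}

/*
 * Hypothetical helper: write the names of the set bits (space separated)
 * into buf and return how many names were written in full.  Stops early
 * if buf is too small, in which case the last name may be truncated.
 */
int
cr4_describe(uint32_t cr4, char *buf, size_t len)
{
	size_t used = 0;
	int count = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (size_t i = 0; i < sizeof(cr4_bits) / sizeof(cr4_bits[0]); i++) {
		if ((cr4 & cr4_bits[i].mask) == 0)
			continue;
		int n = snprintf(buf + used, len - used, "%s%s",
		    count ? " " : "", cr4_bits[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;
		used += (size_t)n;
		count++;
	}
	return count;
}
```

Calling cr4_describe(0x690, buf, sizeof(buf)) with a 64-byte buffer
should leave "PSE PGE OSFXSR OSXMMEXCPT" in buf, and
cr4_sse_enabled(0x690) should return nonzero.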

I updated the BIOS to A09, the latest one available from Dell.  It
changed the nature but not the fact of the failure.  

At boot time, I now get

        piixide0:0:0 lost interrupt
                type: ata tc_bcount: 512 tc_skip: 0

followed by misidentifying the drive:

        wd0 at atabus1 drive 0: <st506>

and reporting it as about 67 MB.  (It's actually a <WDC WD2500JD-75HBC0>,
about 232 GB.)

That, in turn, is followed by a more or less infinite series of

wd0d: device timeout reading fsbn (wd0 bn 0; cn 0 tn 0 sn 0), retrying
ahcisata0 port 0: device present, speed: 1.5Gb/s

*But* -- I found a workaround that may be a useful clue.  I reverted
ahcisata_core.c to revision 1.17, and the system booted just fine with
5.99.6.  I suspect some subtle timing issue in 1.18, which would explain
why it always works on another, nominally identical machine but fails
on this one.

> and then perhaps compare output of DEBUG_MEMLOAD
> between the machines, and maybe disable load of some segments.
> 
I suspect that that is no longer worth trying.


                --Steve Bellovin, http://www.cs.columbia.edu/~smb

