port-arm26: Re: R140, now mounting root over NFS

Subject: Re: R140, now mounting root over NFS
To: None <port-arm26@netbsd.org>
From: Kjetil B. Thomassen <kjetil@thomassen.priv.no>
List: port-arm26
Date: 12/10/2000 14:10:38

On Sat 09 Dec, Ben Harris wrote:
> On Sun, 3 Dec 2000, Kjetil B. Thomassen wrote:
> 
> > I made myself a test kernel where I added DDB and stripped out some of
> > the things I didn't think I needed. The config file has been attached
> > to this email.
> > 
> > The last output before it hung is:
> > root file system type: nfs
> > init: copying out path '/sbin/init' 11
> > 
> > I broke into DDB with CTRL-ALT-Esc
> 
> Is it possible for you to try typing "x/i 0,8" at DDB here?  This will
> dump the exception vectors, and on my system, I find that the vector at
> 0x8 (the SWI vector) has been corrupted to "andeq r0,r0,r0" (ie all
> zeroes).  It looks like something in the kernel is writing through a NULL
> pointer.  That'll be fun to debug.

The vectors are not corrupted on my R140. They look fine to me.

In an earlier email you also said:
On Mon 04 Dec, Ben Harris wrote:
> On Sun, 3 Dec 2000, Kjetil B. Thomassen wrote:
> 
> > The last output before it hung is:
> > root file system type: nfs
> > init: copying out path '/sbin/init' 11
> 
> Odd.  My system was panicking at that point until I put in a small change
> (not committed because I don't think it's correct) which caused it to hang
> instead.  I think that UBC currently has a problem with systems with >8k
> page size.  I'll raise this on tech-kern if I get a chance, but at least
> one place where the problem is obvious is in (from memory) ubc_fault(),
> where it does an integer divide of UBC_WINSIZE (8192) by PAGE_SIZE (32768)
> and uses the result as the number of pages to fault in (or something like
> that).
> 
> > It looks as though there may be a problem with I/O, so this is
> > something that needs to be looked into.
> 
> I think it's a VM system problem.  I'll look into it when I get a chance,
> but that may not be till the weekend.  It takes me quite a lot of
> concentration to understand that area of the kernel.
> 
> > Is there anything I can do to try to trace this further?
> 
> options UVMHIST tends to be quite useful for this kind of thing.  Look in
> sys/uvm/uvm_stat.c for stuff you can do with it.  UBC keeps its own
> history, which ISTR you can get at by typing "call uvmhist_dump(ubchist)"
> at db>.

Yes, I got several pages of output, but I couldn't understand much of
it. The stuff in uvm_stat.c is also over my head, so I think I need to
understand more of this before I can do anything more.
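
The integer divide you describe can at least be seen in isolation. This
is only a minimal sketch, assuming the two values quoted above
(UBC_WINSIZE 8192, PAGE_SIZE 32768). The helper names are made up for
illustration, and this is not the actual ubc_fault() code:

```c
#include <stdio.h>

/* Values as quoted in this thread. The macro names mirror the kernel's,
 * but this is a standalone illustration, not kernel source. */
#define UBC_WINSIZE 8192
#define PAGE_SIZE   32768

/* Hypothetical helper: plain integer division, which truncates toward
 * zero, so a window smaller than one page yields 0 pages. */
static int
npages_truncating(int winsize, int pagesize)
{
	return winsize / pagesize;
}

/* Hypothetical helper: round up instead, so any non-empty window
 * covers at least one page. */
static int
npages_rounded_up(int winsize, int pagesize)
{
	return (winsize + pagesize - 1) / pagesize;
}

/* Print both results for the values above. */
static void
show_page_counts(void)
{
	printf("truncating: %d\n", npages_truncating(UBC_WINSIZE, PAGE_SIZE));
	printf("rounded up: %d\n", npages_rounded_up(UBC_WINSIZE, PAGE_SIZE));
}
```

With an 8k window and 32k pages the truncating version gives 0 pages and
the rounded-up version gives 1. Whether rounding up is the right fix for
UBC is a separate question; the sketch only shows why the count comes
out as zero.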

I used sources from some time yesterday, but it did not get any further
than it has before. The R140 is otherwise up and running: it has mounted
the root file system and answers when I ping it, with a delay of around
5 ms.

Is there anything else I can do in DDB to try to track this down?

TIA!

Kjetil B.
mailto:kjetil@thomassen.priv.no
http://www.thomassen.priv.no/