Subject: Memory clobber [was netbsd 1.5 (softdep?) crash]
To: None <collver@softhome.net, current-users@netbsd.org, jmaier@midamerica.net>
From: Michael South <msouth@scruz.net>
List: port-i386
Date: 01/07/2001 13:58:55
Something is Rotten in Denmark.  Houston, We Have a Problem.  Our Memory
is Being Clobbered.  Action must be Taken!  Something Must be Done!!
etc..

Too many reports of weird memory problems on recent kernels.  I'm going
back through the archives to pull them together.  So far I have:

- John Maier, getblk: block invariant failed.  1.5 beta 2, Compaq
  Armada.

- Myself, various memory-ish faults, frequently involving uvm_fault.
  1.5Q. Sony Vaio laptop.  With and without softdeps compiled in.
  1.5 beta 1 OK.

- Ben Collver, vput ref cnt and uvm_fault.


Common denominators:

* Most appear to be post-1.5, although John's is from 1.5 beta 2.

* With and without softdeps.

* i386 only

* Workarounds involve things which would change memory footprint
  (setting REALEXTMEM, disabling INET6)


Ben, exactly which version of 1.5 are you running?  What's the hardware?


I've been able to pretty easily reproduce the problem on my machine.
Does anyone have suggestions about extra kprintfs or asserts to track
this down?
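
As a starting point for the kind of assert meant here, below is a rough
userland sketch in C of a guard-zone check: bracket a suspect buffer with a
known byte pattern and verify the pattern at interesting points.  Everything
in it (struct guarded_buf, GUARD_BYTE, GUARD_LEN, the guard_* functions) is
made up for illustration and is not taken from the NetBSD sources; a kernel
version would presumably use KASSERT and printf rather than assert().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical guard pattern; any value unlikely to appear by accident. */
#define GUARD_BYTE 0xA5
#define GUARD_LEN  16

/* Illustrative buffer bracketed by guard zones (not a NetBSD structure). */
struct guarded_buf {
    uint8_t head[GUARD_LEN];
    uint8_t data[64];
    uint8_t tail[GUARD_LEN];
};

/* Fill both guard zones with the pattern and zero the payload. */
void guard_init(struct guarded_buf *b)
{
    memset(b->head, GUARD_BYTE, GUARD_LEN);
    memset(b->data, 0, sizeof(b->data));
    memset(b->tail, GUARD_BYTE, GUARD_LEN);
}

/*
 * Return -1 if both guard zones are intact, otherwise the byte offset
 * (from the start of the struct) of the first clobbered guard byte.
 */
long guard_check(const struct guarded_buf *b)
{
    size_t i;

    for (i = 0; i < GUARD_LEN; i++)
        if (b->head[i] != GUARD_BYTE)
            return (long)(offsetof(struct guarded_buf, head) + i);
    for (i = 0; i < GUARD_LEN; i++)
        if (b->tail[i] != GUARD_BYTE)
            return (long)(offsetof(struct guarded_buf, tail) + i);
    return -1;
}

/* Drop this at suspect points; it aborts as soon as a guard byte changes. */
void guard_assert(const struct guarded_buf *b)
{
    assert(guard_check(b) < 0);
}
```

The offset returned by guard_check() is the useful part: if the clobber
always lands at the same offset, that points at a stray write to a fixed
address, which would fit the workarounds that change the memory footprint.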

Mike


collver@softhome.net wrote:
> 
> On Sun, Jan 07, 2001 at 08:29:55AM -0800, collver@softhome.net wrote:
> > I was building Perl on a newly installed NetBSD 1.5 system with softdeps
> > enabled.  During the build, the kernel panicked and I am including the
> > message in this mail.  Will someone advise me?
> >
> > vput: bad ref count: tag 1 type VREG, usecount 0, writecount 1, refcount 7,
> >    tag VT_UFS, ino 3503760, on dev 0, 0 flags 0x6, effnlink 1, nlink 1
> >    mode 0100644, owner 0, group 0, size 56244, lock type vnlock: EXCL (count 1) by pid 8086
> > panic: vput: ref cnt
> > Stopped in miniperl at cpu_Debugger+0x4: leave
> > db>
> 
> I tried to build Perl again and this time got a different crash.
> 
> uvm_fault(0xc02c2ba0, 0xc8aaa000, 0, 1) -> 1
> kernel: page fault trap, code = 0
> Stopped in sh at        pmap_page_remove+0x114: movl            0(%edx, %eax, 4), %eax
> db>
> 
> This is the first time I've seen NetBSD 1.5 crashing like this, I suspect
> the hardware.  Do these error messages contain any hints about which
> hardware components I should suspect?  My first guess would be the memory,
> however this PC has run several other operating systems without crashing.
> 
> Ben
> --
> Code softly and carry a big debugger.