Subject: port-i386/11859: Memory paging corruptions under 1.5P w/o REALEXTMEM
To: None <gnats-bugs@gnats.netbsd.org>
From: Michael South <msouth@scruz.net>
List: netbsd-bugs
Date: 12/30/2000 21:44:16
>Number:         11859
>Category:       port-i386
>Synopsis:       Mem paging corruptions: 1.5P + Vaio Z505RX - REALEXTMEM
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    port-i386-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Dec 30 21:42:01 PST 2000
>Closed-Date:
>Last-Modified:
>Originator:     Michael South
>Release:        2000-12-29
>Organization:
>Environment:
Sony Vaio Z505RX.  128 MB RAM.  1.5P kernel.

>Description:
The machine works perfectly on 1.5.1_Alpha.  Under 1.5P it quickly
dies when asked to do anything strenuous, unless option REALEXTMEM
is set to a value that is not too large: 28672 (28 MB) works,
49152 (48 MB) doesn't.
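
For reference, a minimal sketch of the kernel config line for the
working case.  It assumes the usual i386 "options" syntax and a value
given in kilobytes (28672 KB = 28 MB):

    options 	REALEXTMEM=28672	# size of extended memory, in KB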

Death usually manifests as one or two userland processes dying
with seg faults, bus errors, etc.  Then the kernel croaks, usually
too far gone for a sync or core dump.  Some examples (all without
REALEXTMEM):

(System is booting, dies during rc processing:)
Building databases...
(grep dies w/ sig 11)
Seg fault
uvm_fault (0xc03465a0, 0xedfc3000, 0, 1) -> 1
kernel: page fault trap, code=0
stopped in pid 92 (sh) at ufs_close+0xa  movl 0x4(%eax),%eax
>How-To-Repeat:
Boot in single-user mode, then run:

mount /usr; mount /var
cd /usr/src
make includes

The system dies within a couple of seconds.

>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
 >sync
 syncing disks... kernel: protection fault trap, code=0
 stopped in pid 92 (st) at xspllower+0x12  andl ipending,%eax
 
 Another example:
 
 Building databases...
 (grep sig10)
 Bus error
 uvm_fault(0xcae02f0, 0x89804000, 0, 1) -> 1
 kernel: page fault trap, code=0
 stopped in pid 92 (sh) at ufs_access + 0x7c   pushl 0xcc(%ecx)
 >show map
 MAP 0xc02462c8: [0x56570cec -> 0x8458b53]
   #ent = 1099629638, sz=-793214200, ref= 285212672,
   version= 214993248, flags=0x8b0f7401
 page faulted, in DDB
 >show page
 PAGE 0xc02462c8:
   flags= 8dc3 <BUSY,WANTED,FAKE,RDONLY,ZERO>,
   pqflags= 83e5 <FREE,ACTIVE,AOBJ>
   vers= 118, wire_count= 35157, pa= 0x8458b53
   uobject= 0xf3d59be8, uanon= 0x607bff50, offset= 0xc95f5e5bf4658dff
   loan_count= 1448545516
   [page ownership tracking disabled]
 >show uvmexp
 pagesize= 4096 (0x1000), pagemask= 0xfff, pageshift= 12
 31495 VM pages: 916 active, 0 inactive, 23 wired, 28172 free,
   136 anon, 780 vnode, 0 vtext
 freemin= 64, free-target= 85, inactive-target= 0, wired-max= 10498,
 faults= 4182, traps= 4450, intrs= 2111, ctxswitch= 924
 softint= 534, syscalls= 4003, swapins= 0, swapouts= 0
 fault counts:
   (noram, nanonon, pgwait, pgrele all 0)
   ok relocks (total)= 258(258), anget (retrys)= 1832(0), amapcopy= 875
   neighbor anon/obj pg= 1341/4084, gets (lock/unlock)= 1524/258
   cases: anon= 1075, anoncow= 757, obj= 1479, prcopy= 45, przero= 656
   daemon and swap counts:
     (woke, revs, scans, obscans, anscans, busy, freed, reactive,
     deactivate, pageouts, pending, nswget all 0)
     noswapdev= 1, nanon= 66942, nanonneeded= 66942, nfreeanon= 66806
     swpages= 37499, swpginuse= 0, swpgonly= 0, paging= 0
 kernel pointers:
   objs (kern/kmem/mb) = 0xc0346520/0xc0346630/0xc0346648