current-users: Re: i386 snapshot panics all the time

Subject: Re: i386 snapshot panics all the time
To: None <thorpej@zembu.com>
From: Kazushi (Jam) Marukawa <jam@pobox.com>
List: current-users
Date: 06/06/2000 09:55:59
Hi,

I'm using a Cyrix MII 300 and having the same trouble.  I
haven't seen any messages about this issue in June, so I'm
sending this message to ask what the current status is and
to report my experience.

   On May 29, 12:38, Jason R Thorpe wrote:
   > Subject: Re: i386 snapshot panics all the time
   > Please try the following for me:
   > 
   > Boot directly into DDB ("boot netbsd -d"), and:
   > 
   > db> w vm_page_zero_enable 0
   > db> c
   > 
   > Then edit sys/arch/i386/include/pmap.h, and look for the line:
   > 
   > #define PMAP_PAGEIDLEZERO(pa)   pmap_zero_page_uncached((pa))
   > 
   > Comment it out.  Build a new kernel; the sources I used to build the
   > snapshot are on ftp.netbsd.org with the snapshot, now.  Install it and
   > boot it.
   > 
   > Hammer on it as usual.  Please report if the system is more stable (I
   > suspect it will be), and also report if system performance is on the
   > order of as good as with the uncached zero hook.

I edited the NetBSD sources as described above and built a
new kernel.  My source tree was checked out a few hours ago
through anoncvs.  Then I started compiling the whole
userland under this new kernel.  NetBSD crashed again.

kernel: page fault trap, code=0
Stopped at      pmap_zero_page+0x39:    repe stosl      %es:(%edi)
db> t
pmap_zero_page(57b6000,0,7fffffff,c9e8ec98,c9e8ec98) at pmap_zero_page+0x39
uvm_pageidlezero(e000ffe7,c9e8ec98,0,7fffffff,c019c803) at uvm_pageidlezero+0x8c

idle(c9e8ec98) at idle+0x1b
bpendtsleep(c2cd8bec,11,c03e4ef5,0) at bpendtsleep
biowait(c2cd8bec,f097,c9b14acc,c922ac00,c0499fa0) at biowait+0x2f
bwrite(c2cd8bec,0,c9b149f8,20,c9eb1c88) at bwrite+0x10a
ffs_update(c9eb1cb8,c9eb1e20,c9eb1edc,c9eb1e0c,c9eb1cb8) at ffs_update+0x24c
ufs_makeinode(8180,c9dcd1a4,c9eb1eec,c9eb1f00,c9eb1e84) at ufs_makeinode+0x242
ufs_create(c9eb1e0c,bfbfb5e4,c9eb1f88,602,c08f3800) at ufs_create+0x2a
vn_open(c9eb1edc,602,180,c9eb1f88,c9e8ec98) at vn_open+0xe8
sys_open(c9e8ec98,c9eb1f88,c9eb1f80,0,0) at sys_open+0xca
syscall() at syscall+0x234
--- syscall (number 5) ---
0x48119577:
db> print $ecx
     400
db> print $edi
c0602000
db> c

    I continued the kernel, and the compilation went on.
    After a while, NetBSD crashed again.  However, the time
    between the two crashes seemed a little longer than with
    the old, unedited kernel.  I'm not sure, but it feels
    that way.

kernel: page fault trap, code=0
Stopped at      pmap_zero_page+0x39:    repe stosl      %es:(%edi)
db> t
pmap_zero_page(47f6000,0,7fffffff,c9e8ec98,c9e8ec98) at pmap_zero_page+0x39
uvm_pageidlezero(e000ffe7,c9e8ec98,0,7fffffff,c019c803) at uvm_pageidlezero+0x8c

idle(c9e8ec98) at idle+0x1b
bpendtsleep(c2cbed4c,11,c03e4e3a,0,0) at bpendtsleep
getblk(c9d8b2a4,71,2000,0,0) at getblk+0x92
cluster_read(c9d8b2a4,134b8c,0,71,2000) at cluster_read+0x3f
ffs_read(c9ec1e9c,c9ec1f88,2000,c9e2eea4,c9ec1e9c) at ffs_read+0x254
vn_read(c9e2eea4,c9e2eec0,c9ec1ee8,c08d5400,1) at vn_read+0xba
dofileread(c9e8ec98,3,c9e2eea4,8055000,2000) at dofileread+0x93
sys_read(c9e8ec98,c9ec1f88,c9ec1f80,0,48151f60) at sys_read+0x67
syscall() at syscall+0x234
--- syscall (number 3) ---
0x4813c2c3:
db> print $ecx
     400
db> print $edi
c0602000
db> w vm_page_zero_enable 0
vm_page_zero_enable                    0x1 =          0
db> c

    This time, I set vm_page_zero_enable to zero, hoping it
    would help.  The kernel crashed again, with a different
    message: "uvm_fault(0xc04f14e0, 0xc0602000, 0, 3) -> 1".

db> c
uvm_fault(0xc04f14e0, 0xc0602000, 0, 3) -> 1
kernel: page fault trap, code=0
Stopped in sh at        pmap_zero_page+0x39:    repe stosl      %es:(%edi)
db> t
pmap_zero_page(2402000,0,948a000,c04f1480,8) at pmap_zero_page+0x39
uvm_pagealloc_strat(c04f1480,948a000,0,0,2) at uvm_pagealloc_strat+0x2c2
uao_get(c04f1480,948a000,0,c9ececa8,c9ececa0) at uao_get+0x101
uvm_fault(c04f14e0,c948a000,0,3,0) at uvm_fault+0x830
trap() at trap+0x431
--- trap (number 6) ---
sys_execve(c9eb67ec,c9ecef88,c9ecef80,0,80b20b8) at sys_execve+0x160
syscall() at syscall+0x234
--- syscall (number 59) ---
0x806d4d7:
db> print ecx
Symbol not found
db> print $ecx
     400
db> print $edi
c0602000
db> c

    Then I continued.  The compilation went on for a while,
    and then the machine hung.  It still replied to ping,
    but I could not do any interactive work, so I rebooted
    the machine from the kernel debugger.  That is all of
    my experience.

    Now I'm running the edited and recompiled kernel without
    setting vm_page_zero_enable to zero.  It crashes
    sometimes, but I can continue it whenever I am at home.
    I hope someone can solve this problem soon.

   On May 29, 22:28, Jukka Marin wrote:
   > Subject: Re: i386 snapshot panics all the time
   > On Mon, May 29, 2000 at 12:15:13PM -0700, Jason R Thorpe wrote:
   > >  > Not sure of the relevance, but my problem box is also a Cyrix:
   > >  > 
   > >  > cpu0: family 6 model 0 step 0
   > >  > cpu0: Cyrix 6x86MX (686-class)
   > > 
   > > It probably is.
   > 
   > Hey, all we need is a bunch of good monkeys to type "continue<ENTER>"
   > whenever the system panics! :-)
   > 
   >   -jm

Yes.  I'm doing the same monkey business to compile the
whole userland again and again.  However, is this really
safe?


   On May 30, 20:10, Manuel Bouyer wrote:
   > Subject: Re: i386 snapshot panics all the time
   > Just an idea: could peoples with this vm_page_zero() problem try to disable
   > the internal cache and see how it goes ?
   > I've an older cyrix CPU which has problems under -current with the internal
   > cache enabled ... it doesn't panic in the same way though.

I don't have a VGA card installed, so I cannot do that now.
I'll do it if nobody can solve this problem, but I don't
like this approach.  Is there any way to get the old, stable
pmap.c?

Thanks,
-- Kazushi
Absent, adj.:
	Exposed to the attacks of friends and acquaintances; defamed;
slandered.