Re: memory corruption [Re: try KMGUARD]

To: Mindaugas Rasiukevicius <rmind%NetBSD.org@localhost>
Subject: Re: memory corruption [Re: try KMGUARD]
From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
Date: Mon, 12 Mar 2012 22:16:07 +0100

On Mon, Mar 12, 2012 at 05:00:59PM +0100, Manuel Bouyer wrote:
> Here's another panic analysis. This one is quite interesting because,
> if I got it right, the corruption occurs in the kernel's bss and not
> in memory allocated by kmem.

Another one, also involving memory in bss:
uvm_fault(0xffffffff80e658e0, 0xfffffe82bfcd9000, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff80718057 cs 8 rflags 10246 cr2  fffffe82bfcd9c08 
cpl 6 rsp fffffe810c81b2d0
kernel: page fault trap, code=0
Stopped in pid 17923.1 (tar) at netbsd:pool_cache_get_paddr+0x6d: movl 8(%rax),%edx
db{0}> tr
pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x6d
vmem_alloc() at netbsd:vmem_alloc+0xbe
uvm_km_kmem_alloc() at netbsd:uvm_km_kmem_alloc+0x47
kmem_intr_alloc() at netbsd:kmem_intr_alloc+0x5f
kmem_intr_zalloc() at netbsd:kmem_intr_zalloc+0xf
puffs_msgmem_alloc() at netbsd:puffs_msgmem_alloc+0x105
puffs_vnop_strategy() at netbsd:puffs_vnop_strategy+0x104
VOP_STRATEGY() at netbsd:VOP_STRATEGY+0x55
genfs_do_io() at netbsd:genfs_do_io+0x1b9
genfs_gop_write() at netbsd:genfs_gop_write+0x55
genfs_do_putpages() at netbsd:genfs_do_putpages+0x712
VOP_PUTPAGES() at netbsd:VOP_PUTPAGES+0x5c
flushvncache() at netbsd:flushvncache+0x6c
puffs_vnop_inactive() at netbsd:puffs_vnop_inactive+0xae
VOP_INACTIVE() at netbsd:VOP_INACTIVE+0x55
vrelel() at netbsd:vrelel+0x239
vn_close() at netbsd:vn_close+0x42
closef() at netbsd:closef+0x54
fd_close() at netbsd:fd_close+0x1b6
syscall() at netbsd:syscall+0xac
db{0}> sh reg
ds          c128
es          1c1d
fs          b360
gs          1c02
rdi         6
rsi         1
rbp         fffffe810c81b320
rbx         ffffffff80e5e980    static_qc_pools+0x700
rdx         0
rcx         c
rax         fffffe82bfcd9c00
r8          fffffe81bc2dce48
r9          1
r10         0
r11         0
r12         ffffffff80e5ebc0    static_qc_pools+0x940
r13         0
r14         0
r15         fffffe810c81b2e8
rip         ffffffff80718057    pool_cache_get_paddr+0x6d
cs          8
rflags      10246

pool_cache_get_paddr+0x6d is subr_pool.c:2468
        if (__predict_true(pcg->pcg_avail > 0)) {

pcg is corrupted (it's in %rax, fffffe82bfcd9c00). If I read the assembly
right, pc is in %rbx, so it's ffffffff80e5e980.
db{0}> sh pool ffffffff80e5e980
POOL CACHE kva-12288: size 12288, align 4096, ioff 0, roflags 0x00000e00
        alloc 0xffffffff80e5c430
        minitems 0, minpages 0, maxpages 4294967295, npages 16
        itemsperpage 10, nitems 2, nout 158, hardlimit 4294967295
        nget 158, nfail 0, nput 0
        npagealloc 16, npagefree 0, hiwat 16, nidle 0
        cpu layer hits 50 misses 163
        cache layer hits 5 misses 158
        cache layer entry uncontended 163 contended 0
        cache layer empty groups 0 full groups 0

cc = pc->pc_cpu[0] is:
db{0}> x/Lx 0xffffffff80e5ec00
netbsd:static_qc_pools+0x980:   ffffffff80e5ebc0
(this is pc->pc_cpu0, it's OK).

pcg = cc->cc_current is:
db{0}> x/Lx 0xffffffff80e5ebd0
netbsd:static_qc_pools+0x950:   fffffe82bfcd9c00

This is wrong: it is the same bad pointer that ended up in %rax, and it
is stored inside static_qc_pools. So once again, something wrote to the
bss ...

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference