Current-Users archive


Re: memory corruption [Re: try KMGUARD]



On Mon, Mar 12, 2012 at 10:16:07PM +0100, Manuel Bouyer wrote:
> On Mon, Mar 12, 2012 at 05:00:59PM +0100, Manuel Bouyer wrote:
> > Here's another panic analysis. This one is quite interesting because,
> > if I got it right, the corruption occurs in the kernel's bss and not
> > in memory allocated by kmem.
> 
> Another one, also involving memory in bss:

And another one. It looks like all the corruptions occur in bss for me ...

panic: kernel diagnostic assertion "cc->cc_cache == pc" failed: file 
"/dsk/l1/misc/bouyer/quota2/src/sys/kern/subr_pool.c", line 2466 
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff80256715 cs 8 rflags 246 cr2  7f7fe9200000 cpl 6 
rsp fffffe810bcaf7b0
Stopped in pid 12958.6 (t_copy) at      netbsd:breakpoint+0x5:  leave
db{5}> tr
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x1f2
kern_assert() at netbsd:kern_assert+0x48
pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x195
amap_alloc1() at netbsd:amap_alloc1+0x3d
amap_alloc() at netbsd:amap_alloc+0x43
amap_copy() at netbsd:amap_copy+0x2fc
uvm_fault_internal() at netbsd:uvm_fault_internal+0xfae
trap() at netbsd:trap+0x59c
--- trap (number 6) ---

(gdb) l *(amap_alloc1+0x3d)
0xffffffff807f8b6d is in amap_alloc1 
(/dsk/l1/misc/bouyer/quota2/src/sys/uvm/uvm_amap.c:175).
170             const bool nowait = (flags & UVM_FLAG_NOWAIT) != 0;
171             const km_flag_t kmflags = nowait ? KM_NOSLEEP : KM_SLEEP;
172             struct vm_amap *amap;
173             int totalslots;
174
175             amap = pool_cache_get(&uvm_amap_cache, nowait ? PR_NOWAIT : PR_WAITOK);
176             if (amap == NULL) {
177                     return NULL;
178             }
179             totalslots = amap_roundup_slots(slots + padslots);

Here pool_cache_get_paddr() is called with pc = &uvm_amap_cache.
db{5}> print uvm_amap_cache
ffffffff80e7ce40
db{5}> x/Lx 0xffffffff80e7d0e8
netbsd:uvm_amap_cache+0x2a8:    fffffe810afa4581 #uvm_amap_cache.pc_cpus[5]
This is obviously wrong for a pool_cache_cpu_t * pointer: the address is odd.
db{5}> x/Lx 0xfffffe810afa45a1
fffffe810afa45a1:       ffffffff80e7ce
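
The truncated-looking value above is what a 64-bit little-endian load
returns when it starts one byte into the cc_cache field. A minimal sketch
of that, assuming the byte following cc_cache is zero (the ddb output
suggests so, but it was not dumped separately); read_u64 is a hypothetical
helper and not a kernel function:

```python
import struct

CACHE_PTR = 0xffffffff80e7ce40      # &uvm_amap_cache, from "print uvm_amap_cache"
CC_CACHE_ADDR = 0xfffffe810afa45a0  # address of cc_cache if pc_cpus[5] were intact


def read_u64(memory: bytes, base: int, addr: int) -> int:
    """Little-endian 64-bit load from a buffer whose first byte is at `base`."""
    return struct.unpack_from("<Q", memory, addr - base)[0]


# Model 16 bytes of memory: cc_cache, then 8 bytes assumed to be zero.
memory = struct.pack("<QQ", CACHE_PTR, 0)

aligned = read_u64(memory, CC_CACHE_ADDR, CC_CACHE_ADDR)
shifted = read_u64(memory, CC_CACHE_ADDR, CC_CACHE_ADDR + 1)

print(f"{aligned:x}")  # load at ...45a0
print(f"{shifted:x}")  # load at ...45a1, the same pointer shifted right by one byte
```

So the assertion fails because cc->cc_cache is read through a pointer that
is off by one byte, not because the cc_cache field itself was overwritten.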

Here are the pc_cpus for all 8 cpus:
db{5}> x/Lx 0xffffffff80e7d0c0,8
netbsd:uvm_amap_cache+0x280:    ffffffff80e7d080        fffffe810af56f80
netbsd:uvm_amap_cache+0x290:    fffffe810af57d00        fffffe810af5aa80
netbsd:uvm_amap_cache+0x2a0:    fffffe810afa0800        fffffe810afa4581
netbsd:uvm_amap_cache+0x2b0:    fffffe810af6a300        fffffe810afde080
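
For reference, the address arithmetic behind this dump can be checked with
a short script. This is only a sketch: the 0x280 offset and the 8-byte
pointer size are read off the ddb output above, and the 0x80 alignment is
an assumption inferred from the seven intact entries, not taken from the
kernel source.

```python
CACHE_BASE = 0xffffffff80e7ce40  # &uvm_amap_cache
PC_CPUS_OFFSET = 0x280           # offset of pc_cpus[0], from the ddb symbol output
POINTER_SIZE = 8                 # amd64
OBSERVED_ALIGNMENT = 0x80        # assumption: all intact entries are multiples of 0x80

# pc_cpus[0..7] as dumped above
PC_CPUS = [
    0xffffffff80e7d080, 0xfffffe810af56f80,
    0xfffffe810af57d00, 0xfffffe810af5aa80,
    0xfffffe810afa0800, 0xfffffe810afa4581,
    0xfffffe810af6a300, 0xfffffe810afde080,
]


def slot_address(index: int) -> int:
    """Address of the pc_cpus[index] slot inside uvm_amap_cache."""
    return CACHE_BASE + PC_CPUS_OFFSET + index * POINTER_SIZE


def misaligned(pointers, alignment):
    """Return (index, pointer) pairs that are not a multiple of `alignment`."""
    return [(i, p) for i, p in enumerate(pointers) if p % alignment != 0]


print(hex(slot_address(5)))
for index, pointer in misaligned(PC_CPUS, OBSERVED_ALIGNMENT):
    print(f"pc_cpus[{index}] = {pointer:x} is not aligned")
```

Only the entry for cpu 5 is reported, and slot_address(5) matches the
address examined earlier with x/Lx.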

And the cc_cache for each of them (except the corrupted pc_cpus[5]):
db{5}> x/Lx ffffffff80e7d0a0
netbsd:uvm_amap_cache+0x260:    ffffffff80e7ce40
db{5}> x/Lx fffffe810af56fa0
fffffe810af56fa0:       ffffffff80e7ce40
db{5}> x/Lx fffffe810af57d20
fffffe810af57d20:       ffffffff80e7ce40
db{5}> x/Lx fffffe810af5aaa0
fffffe810af5aaa0:       ffffffff80e7ce40
db{5}> x/Lx fffffe810afa0820
fffffe810afa0820:       ffffffff80e7ce40
db{5}> x/Lx fffffe810af6a320
fffffe810af6a320:       ffffffff80e7ce40
db{5}> x/Lx fffffe810afde0a0
fffffe810afde0a0:       ffffffff80e7ce40

And if pc_cpus[5] were fffffe810afa4580 rather than fffffe810afa4581, its cc_cache would be correct:
db{5}> x/Lx fffffe810afa45a0
fffffe810afa45a0:       ffffffff80e7ce40

So it looks like, once again, a single bit was changed in the kernel's BSS.
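
The single-bit claim is easy to verify by XORing the observed pointer with
the value it presumably should have had. A small sketch; flipped_bits is a
hypothetical helper, and the expected value is the guess made above rather
than something recovered from the machine:

```python
OBSERVED = 0xfffffe810afa4581  # pc_cpus[5] as found in the dump
EXPECTED = 0xfffffe810afa4580  # presumed original value


def flipped_bits(observed: int, expected: int) -> list[int]:
    """Return the positions (0 = least significant) of bits that differ."""
    diff = observed ^ expected
    return [bit for bit in range(64) if (diff >> bit) & 1]


bits = flipped_bits(OBSERVED, EXPECTED)
print(f"{len(bits)} bit(s) differ, at position(s) {bits}")
```

A diff with exactly one set bit is consistent with a bit flip rather than
with a stray store of a whole byte or word.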

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference