Current-Users archive


memory corruption [Re: try KMGUARD]



Here's another panic analysis. This one is quite interesting because,
if I got it right, the corruption occurs in the kernel's bss and not
in memory allocated by kmem.

The panic was:
fatal protection fault in supervisor mode
trap type 4 code 0 rip ffffffff8088b610 cs 8 rflags 10202 cr2  ffff80006cbb0000 cpl 0 rsp fffffe810cd96698
kernel: protection fault trap, code=0
Stopped in pid 19880.1 (vnconfig) at    netbsd:strcmp+0x70:     movb    0(%rdi),%al
db{0}> tr
strcmp() at netbsd:strcmp+0x70
vndioctl() at netbsd:vndioctl+0xd48
bdev_ioctl() at netbsd:bdev_ioctl+0x77
VOP_IOCTL() at netbsd:VOP_IOCTL+0x3b
vn_ioctl() at netbsd:vn_ioctl+0x76
sys_ioctl() at netbsd:sys_ioctl+0x13c
syscall() at netbsd:syscall+0xac

I suspect the strcmp() is in fact called from pool_init(). "show all pools"
panics in the same way, which confirms this:
db{0}> show all pools
[...]
POOL buf4k: size 4096, align 8, ioff 0, roflags 0x00000000
        alloc 0xffffffff80e203a0
        minitems 1, minpages 1, maxpages 1, npages 25
        itemsperpage 1, nitems 1, nout 24, hardlimit 4294967295
        nget 32, nfail 0, nput 8
        npagealloc 33, npagefree 8, hiwat 26, nidle 1
POOLfatal protection fault in supervisor mode
trap type 4 code 0 rip ffffffff80719514 cs 8 rflags 10246 cr2  ffff80006cbb0000 cpl 8 rsp fffffe810cd96060
kernel: protection fault trap, code=0
Faulted in DDB; continuing...

In the list, after buf4k we should have buf512b. Fortunately these
struct pools all come from bmempools[], so we can easily get the
struct pool addresses for buf4k and buf512b from there:
db{0}> sh pool 0xffffffff80e670a0
POOL buf4k: size 4096, align 8, ioff 0, roflags 0x00000000
        alloc 0xffffffff80e203a0
        minitems 1, minpages 1, maxpages 1, npages 25
        itemsperpage 1, nitems 1, nout 24, hardlimit 4294967295
        nget 32, nfail 0, nput 8
        npagealloc 33, npagefree 8, hiwat 26, nidle 1
db{0}> sh pool 0xffffffff80e66bc0
POOL buf512b: size 512, align 8, ioff 0, roflags 0x00000000
        alloc 0xffffffff80e203a0
        minitems 1, minpages 1, maxpages 1, npages 1
        itemsperpage 8, nitems 5, nout 3, hardlimit 4294967295
        nget 4, nfail 0, nput 1
        npagealloc 1, npagefree 0, hiwat 1, nidle 0

In struct pool, the list entry's next pointer (tqe_next) is at offset 0
and its previous pointer (tqe_prev) at offset 8:
db{0}> x/Lx 0xffffffff80e670a0
netbsd:bmempools+0x4e0: ffffffff80e66bc1
this doesn't look good: the next pointer should be buf512b's address,
ffffffff80e66bc0, but it ends in ...bc1. buf4k's struct pool is corrupted.
db{0}> x/Lx 0xffffffff80e670a8
netbsd:bmempools+0x4e8: ffffffff80e67580
db{0}> sh pool ffffffff80e67580
POOL buf32k: size 32768, align 8, ioff 0, roflags 0x00000000
        alloc 0xffffffff80e27320
        minitems 1, minpages 1, maxpages 1, npages 1
        itemsperpage 2, nitems 2, nout 0, hardlimit 4294967295
        nget 0, nfail 0, nput 0
        npagealloc 1, npagefree 0, hiwat 1, nidle 1

but this is correct; before buf4k we have buf32k in the list (which
is sorted alphabetically).

db{0}> x/Lx 0xffffffff80e66bc8
netbsd:bmempools+0x8:   ffffffff80e670a0
db{0}> sh pool ffffffff80e670a0
POOL buf4k: size 4096, align 8, ioff 0, roflags 0x00000000
        alloc 0xffffffff80e203a0
        minitems 1, minpages 1, maxpages 1, npages 25
        itemsperpage 1, nitems 1, nout 24, hardlimit 4294967295
        nget 32, nfail 0, nput 8
        npagealloc 33, npagefree 8, hiwat 26, nidle 1

buf512b's previous pointer is also correct.
db{0}> x/Lx 0xffffffff80e66bc0
netbsd:bmempools:       ffffffff80e67720
db{0}> sh pool ffffffff80e67720
POOL buf64k: size 65536, align 8, ioff 0, roflags 0x00000000
        alloc 0xffffffff80e27320
        minitems 1, minpages 1, maxpages 1, npages 1
        itemsperpage 1, nitems 1, nout 0, hardlimit 4294967295
        nget 0, nfail 0, nput 0
        npagealloc 1, npagefree 0, hiwat 1, nidle 1

and buf512b's next pointer is also correct.
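
The link checks above can be replayed offline. The following is only a
sketch, not anything run on the crashed machine: it copies the addresses and
the 8-byte words from the DDB session into two tables and assumes the usual
TAILQ layout, where the next pointer holds the address of the next struct
pool and the previous pointer holds the address of the previous element's
next-pointer field (offset 0, so the previous struct pool's own address).
The names POOLS, MEM and check_pair are made up for this illustration.

```python
NEXT_OFF, PREV_OFF = 0, 8  # assumed offsets of the two link words in struct pool

# struct pool addresses, as passed to "sh pool" in the session
POOLS = {
    "buf32k": 0xffffffff80e67580,
    "buf4k": 0xffffffff80e670a0,
    "buf512b": 0xffffffff80e66bc0,
    "buf64k": 0xffffffff80e67720,
}

# 8-byte words read with x/Lx in the session (address -> value)
MEM = {
    0xffffffff80e670a0: 0xffffffff80e66bc1,  # buf4k, offset 0 (next)
    0xffffffff80e670a8: 0xffffffff80e67580,  # buf4k, offset 8 (previous)
    0xffffffff80e66bc0: 0xffffffff80e67720,  # buf512b, offset 0 (next)
    0xffffffff80e66bc8: 0xffffffff80e670a0,  # buf512b, offset 8 (previous)
}


def check_pair(a, b):
    """Check the links between list neighbours a -> b.

    Returns (forward_ok, backward_ok). An entry is None when the
    corresponding word was not dumped in the session.
    """
    fwd = MEM.get(POOLS[a] + NEXT_OFF)
    bwd = MEM.get(POOLS[b] + PREV_OFF)
    forward_ok = None if fwd is None else fwd == POOLS[b]
    backward_ok = None if bwd is None else bwd == POOLS[a] + NEXT_OFF
    return forward_ok, backward_ok


# The pool list is sorted by name, so sorted() gives the list order.
names = sorted(POOLS)
for a, b in zip(names, names[1:]):
    print(a, "->", b, check_pair(a, b))
```

With these tables, the only link that fails is the forward link from buf4k
to buf512b; every other dumped link is consistent.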

So, most likely it's not a bad argument passed to a TAILQ_* macro,
but something really did write to the next pointer in buf4k's struct pool.
What comes after the next pointer looks correct; let's look at what comes
before:
db{0}> sh pool 0xffffffff80e66f00
POOL buf2k: size 2048, align 8, ioff 0, roflags 0x00000000
        alloc 0xffffffff80e203a0
        minitems 1, minpages 1, maxpages 1, npages 203
        itemsperpage 2, nitems 2, nout 404, hardlimit 4294967295
        nget 412, nfail 0, nput 8
        npagealloc 211, npagefree 8, hiwat 203, nidle 1
db{0}> x/Lx 0xffffffff80e67050
netbsd:bmempools+0x490: 0               #pr_log
db{0}> x/Lx 0xffffffff80e67058
netbsd:bmempools+0x498: 0               #pr_curlogentry
db{0}> x/Lx 0xffffffff80e6705c
netbsd:bmempools+0x49c: 0               #pr_logsize
db{0}> x/Lx 0xffffffff80e67060
netbsd:bmempools+0x4a0: 0               pr_entered_file
db{0}> x/Lx 0xffffffff80e67068
netbsd:bmempools+0x4a8: 0               pr_entered_line
db{0}> x/Lx 0xffffffff80e67070
netbsd:bmempools+0x4b0: 0               pr_reclaimerentry
db{0}> x/Lx 0xffffffff80e67078
netbsd:bmempools+0x4b8: 0
db{0}> x/Lx 0xffffffff80e67080
netbsd:bmempools+0x4c0: 0               
db{0}> x/Lx 0xffffffff80e67088
netbsd:bmempools+0x4c8: 0
db{0}>  x/Lx 0xffffffff80e67090
netbsd:bmempools+0x4d0: 0               pr_freecheck
db{0}>  x/Lx 0xffffffff80e67098 
netbsd:bmempools+0x4d8: 0               pr_qcache

they're all 0 as expected. So it looks like a single byte (or maybe just
a single bit) has been changed: ffffffff80e66bc0 has been changed to
ffffffff80e66bc1. And this is in the kernel's bss, not memory managed by
kmem or other kernel memory allocators. So, to me, it looks more like an
uninitialized pointer being used somewhere than a use after free.
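
To see how small the change is, here is a short sketch of the arithmetic.
It assumes the expected value of buf4k's next pointer is the address of
buf512b's struct pool as accepted by "sh pool" (ffffffff80e66bc0); the
helper names are invented for this example.

```python
EXPECTED = 0xffffffff80e66bc0  # address of buf512b's struct pool (assumed expected value)
OBSERVED = 0xffffffff80e66bc1  # word actually read at buf4k's offset 0


def changed_bits(expected, observed):
    """Bit positions (0 = least significant) that differ between the two words."""
    diff = expected ^ observed
    return [i for i in range(64) if (diff >> i) & 1]


def changed_bytes(expected, observed):
    """Byte positions (0 = lowest address on little-endian amd64) that differ."""
    diff = expected ^ observed
    return [i for i in range(8) if (diff >> (8 * i)) & 0xFF]


print(changed_bits(EXPECTED, OBSERVED))   # [0]
print(changed_bytes(EXPECTED, OBSERVED))  # [0]
print(OBSERVED % 8)                       # 1: no longer 8-byte aligned
```

Under that assumption only bit 0 of the lowest byte differs, which is
consistent with a stray one-byte store or a bit set through a wild pointer.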

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference
--

