Port-arm archive


Re: OMAP2/ARM1136 cache aliasing issue




On Jul 18, 2008, at 8:03 AM, Imre Deak wrote:

Hi all,

I have an H4/OMAP2420/ARM1136r0p2 board. The data cache is 32k/4-way
with a 32-byte cache line. I ran into a memory corruption bug where a
4-byte location in the kernel code segment gets corrupted. This can't be
the result of a stray pointer store, since pages containing code are
read-only and a store to such an address raises an exception. (I also
set a hardware memory write breakpoint at the given address, but it
never triggered.)

The bug is easily reproducible: it always happens at the same place with
the same value, and I narrowed it down to 5 instructions. I can't trigger
at the first instruction to watch the bug happen; I can only trigger at
the last one, when I detect the corruption. I read the data cache tags
and TLB contents with CP15 instructions both before and after the 5
instructions. There is (of course) no store to the corrupted code segment
location, yet I see the relevant cache line getting dirtied as a result
of these 5 instructions. Stores occur only to a different address, but
the value stored matches the value at the corrupted location. Also, the
two virtual addresses (the one being stored to and the one getting
corrupted) have the same cache index, but the physical addresses are
different. The mapping for the address being stored to is 4k; the
mapping for the corrupted address is 64k.

Following are the 5 instructions leading to the corruption. There are no
interrupts or exceptions during their execution:

r0: 82df050c
sp: 82e17e14
ip: 82e17e14

8045fdc4:       e59f302c        ldr     r3, [pc, #44]   ; 8045fdfc
8045fdc8:       e92dd800        stmdb   sp!, {fp, ip, lr, pc}
8045fdcc:       e24cb004        sub     fp, ip, #4
8045fdd0:       e24dd010        sub     sp, sp, #16
8045fdd4:       e50b0014        str     r0, [fp, #-20]  ; 82e17dfc

..
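
As a cross-check of the listing above, here is a minimal sketch that replays the address arithmetic of the five instructions from the quoted register values. It is not an ARM emulator. The function name `trace` is made up for illustration, and it assumes the standard STMDB behaviour (decrement before, lowest-numbered register at the lowest address).

```python
MASK = 0xFFFFFFFF  # 32-bit wrap-around


def trace(sp, ip):
    """Replay the address arithmetic of the five quoted instructions.

    Hypothetical helper for illustration only. Returns the list of store
    addresses, the final sp, and the final fp.
    """
    # ldr r3, [pc, #44] is a load from 8045fdfc; it stores nothing.

    # stmdb sp!, {fp, ip, lr, pc}: four words, decrement-before,
    # lowest-numbered register (fp) at the lowest address, sp written back.
    base = (sp - 4 * 4) & MASK
    stores = [(base + 4 * i) & MASK for i in range(4)]
    sp = base

    # sub fp, ip, #4
    fp = (ip - 4) & MASK

    # sub sp, sp, #16
    sp = (sp - 16) & MASK

    # str r0, [fp, #-20]
    stores.append((fp - 20) & MASK)
    return stores, sp, fp


stores, sp, fp = trace(sp=0x82E17E14, ip=0x82E17E14)
for addr in stores:
    print(f"store to {addr:08x}")
print(f"sp={sp:08x} fp={fp:08x}")
```

Under these assumptions every store lands in 82e17dfc..82e17e10, inside the 4k mapping at VA 82e17000, and the last one matches the 82e17dfc annotation in the disassembly.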


Relevant mappings:

VA:82e17000 -> PA:83cda000 4k, outer write-back, no allocate on write,
                                   supervisor read/write
VA:80450000 -> PA:80450000 64k, outer write-back, no allocate on write,
                                   non-accessible
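
The "same cache index, different physical address" observation can be checked with a little arithmetic. The sketch below rests on two assumptions. First, the set index is taken from address bits [12:5], which follows from the 32k/4-way/32-byte geometry quoted above (8k per way, 256 sets). Second, 8045fdfc, the literal address shown in the disassembly, stands in for an address inside the 64k mapping; the message does not say that this is the corrupted location. The helper names `set_index` and `translate` are made up for illustration.

```python
CACHE_SIZE = 32 * 1024   # 32k data cache (from the message)
WAYS = 4                 # 4-way set associative (from the message)
LINE_SIZE = 32           # 32-byte cache lines (from the message)

WAY_SIZE = CACHE_SIZE // WAYS      # 8 KiB per way
NUM_SETS = WAY_SIZE // LINE_SIZE   # 256 sets, i.e. address bits [12:5]


def set_index(addr):
    """Set index an address selects in a cache with this geometry."""
    return (addr // LINE_SIZE) % NUM_SETS


def translate(va, va_base, pa_base, size):
    """Apply one of the quoted mappings (hypothetical helper)."""
    assert va_base <= va < va_base + size, "address outside this mapping"
    return pa_base + (va - va_base)


store_va = 0x82E17DFC    # target of the final str
literal_va = 0x8045FDFC  # assumption: example address in the 64k mapping

store_pa = translate(store_va, 0x82E17000, 0x83CDA000, 4 * 1024)
literal_pa = translate(literal_va, 0x80450000, 0x80450000, 64 * 1024)

for name, addr in [("store VA", store_va), ("literal VA", literal_va),
                   ("store PA", store_pa), ("literal PA", literal_pa)]:
    print(f"{name:10s} {addr:08x} index {set_index(addr):#04x}")
```

With these inputs both virtual addresses select set 0xef while the physical addresses differ (83cdadfc and 8045fdfc). Note also that bit 12 lies above the 4k page offset, so for the 4k page the index computed from the VA (0xef) differs from the one computed from the PA (0x6f).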

Can I suggest changing pmap_map_chunk to not use large pages and seeing
whether the corruption still happens?  Just #if 0 the if at ~5578 in pmap.c.

Of course, that corruption shouldn't have happened, and it may very well be a silicon bug.
