tech-kern archive


something really screwed up with mmap+ffs on 5.0_STABLE



I've been looking at some quite weird behaviour with mmapped files on ffs.
I want to concentrate on something else for a while, so here's a brain
dump of what I've been struggling with recently, in case it rings a bell
for someone or they even know the solution.


        Background:

The shmif rump driver provides a networking backend using the old
mmap-a-file-to-get-a-handle trick.


        Observations:

Most of the time the problem is that the first 16k of the bus file gets
corrupted.  The underlying fs blocksize is 32k.  I have verified that:

a) it does not get written to by the involved processes per ktrace -i
b) processes do not overwrite random memory by having a
   PROT_NONE red zone in front

This problem does not happen on tmpfs.  I don't believe there is a timing
issue because I've run the test tens of thousands of times with varying
background load.

Zero-filling the bus file with write() instead of creating a sparse file
with truncate doesn't make much of a difference either.  I was almost sure
it was a problem with the genfs "sawhole" code, but nope.
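
The two ways of creating the bus file mentioned above can be sketched
roughly as follows.  create_bus_file() and its zero_fill flag are
hypothetical names for illustration and are not taken from the shmif
sources.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` with the given size.
 * zero_fill == 0: sparse file via ftruncate(), no blocks written.
 * zero_fill != 0: every byte written explicitly with write().
 * Returns 0 on success, -1 on failure.
 */
static int
create_bus_file(const char *path, off_t size, int zero_fill)
{
	if (size < 0)
		return -1;

	int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (fd == -1)
		return -1;

	int rv = 0;
	if (!zero_fill) {
		/* Sparse: the file consists of holes until written to. */
		rv = ftruncate(fd, size);
	} else {
		char zeros[4096];
		memset(zeros, 0, sizeof(zeros));
		off_t done = 0;
		while (done < size) {
			off_t left = size - done;
			size_t chunk = left < (off_t)sizeof(zeros) ?
			    (size_t)left : sizeof(zeros);
			ssize_t n = write(fd, zeros, chunk);
			if (n <= 0) {
				rv = -1;
				break;
			}
			done += n;
		}
	}
	if (close(fd) == -1)
		rv = -1;
	return rv;
}
```

Both variants yield a file that reads back as all zeroes; they differ only
in whether the fs has allocated blocks before the first mmap fault.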

Usually after the bus has seen one generation (i.e. the pages have been
faulted in to all processes) there are no further problems.  However,
causing (read) faults from a 3rd party process not involved with the
test may trigger the problem.


        The really spooky stuff:

Seems like it's possible to get two "views" into the same file depending
on read/write or mmap access (whatever happened to mr. ubc???).
Can someone explain this:

> ./dumpbus-mmio -h thank-you-driver-for-getting-me-here
bus version 2, lock: 0, generation: 431, firstoff: 0x5a95a, lastoff: 0x5a8ea
> ./dumpbus-read -h thank-you-driver-for-getting-me-here
dumpbus-read: thank-you-driver-for-getting-me-here not a shmif bus

i.e. same file, but the "magic" number doesn't match when not using mmap.
hexdump uses read() (per ktrace), so I get the "garbage" version of the
file with it and can confirm it indeed has garbage in it.

The only difference between the two programs is this:
#if 1
        read(fd, buf, BUFSIZE);
        bmem = (void *)buf;
#else
        busmem = mmap(NULL, sb.st_size, PROT_READ, MAP_FILE|MAP_SHARED, fd, 0);
        if (busmem == MAP_FAILED)
                err(1, "mmap");
        bmem = busmem;
#endif

However, I can restore the old version using cp (since it uses mmio):

> ./dumpbus-read -h thank-you-driver-for-getting-me-here 
dumpbus-read: thank-you-driver-for-getting-me-here not a shmif bus
> cp thank-you-driver-for-getting-me-here backup
> ./dumpbus-read -h backup
bus version 2, lock: 0, generation: 431, firstoff: 0x5a95a, lastoff: 0x5a8ea


        How-to-repeat:

Get tests/net/icmp from -current and run "./t_ping floodping" in a loop
from ffs.  You should see the problem within a few thousand iterations.
Most likely the shmif code will encounter an invariant failure, such as:
panic: kernel diagnostic assertion "busmem->shm_magic == SHMIF_MAGIC" failed: 
file "if_shmem.c", line 391


I plan to update to latest -STABLE soon and see if the problem is still
present there.  Guess I'll reboot now...

