NetBSD-Bugs archive


Re: kern/55177: mremap(MAP_REMAPDUP) fails after fork()



The following reply was made to PR lib/55177; it has been noted by GNATS.

From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: Re: kern/55177: mremap(MAP_REMAPDUP) fails after fork()
Date: Fri, 3 Mar 2023 12:48:22 +0000

 What's happening is:
 
 1. mmap creates an entry in the process's VM map at p for a new
    anonymous object, call it A.
 
 2. mremap(MAP_REMAPDUP) creates an entry in the process's VM map at q
    for the same anonymous object A.
 
 3. On the next write to p, the CPU will fault, and NetBSD will
    allocate a backing page for A, call it G, and map p to G in the
    process's page table so the write can complete.
 
 4. On the next read from q, the CPU will fault, and NetBSD will find G
    from A in the VM map entry for q, and map q to G in the process's
    page table so the read can complete and return what was written
    through p.
 
 5. fork marks all the VM map entries copy-on-write, and updates the
    page table to make the pages nonwritable so the CPU will fault on
    writes to them.
 
 6. On the next write to virtual address p, the CPU will fault, and
    NetBSD will see that the entry in the VM map at p is copy-on-write,
    so it will create a new anonymous object, call it A', and allocate
    a new backing page, call it G', holding a copy of G.  It will point
    the VM map entry at p to A' and the page table entry at p to G'.
    The VM map entry at q still refers to A.
 
 7. On the next read from virtual address q, the CPU will read from the
    old page G -- without faulting or consulting the kernel, because
    nothing changed the page table entry at q -- and return the stale
    value.
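 
 The sequence above can be exercised from user space.  What follows is
 a minimal sketch under stated assumptions, not a test case taken from
 the PR: the helper name remapdup_coupled_after_fork is made up for
 illustration, the child is parked in pause() only so that it is still
 alive while the parent writes, and the #ifdef guard makes the function
 return -1 on systems that do not define MAP_REMAPDUP.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>

#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper.  Returns 1 if a write through p is still visible
 * through q after fork(), 0 if q returns the stale value (step 7), and
 * -1 on error or where MAP_REMAPDUP is not available.
 */
int
remapdup_coupled_after_fork(void)
{
#ifdef MAP_REMAPDUP
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char *p, *q;
    pid_t pid;
    int coupled = -1;

    /* Step 1: private anonymous mapping at p. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_ANON | MAP_PRIVATE, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Step 2: second mapping q of the same anonymous object. */
    q = mremap((void *)p, len, NULL, len, MAP_REMAPDUP);
    if (q == MAP_FAILED) {
        munmap((void *)p, len);
        return -1;
    }

    /* Steps 3-4: write through p, read the same value back through q. */
    p[0] = 1;
    if (q[0] != 1)
        goto out;

    /* Step 5: fork; the child only idles until it is killed. */
    pid = fork();
    if (pid == -1)
        goto out;
    if (pid == 0) {
        for (;;)
            pause();
    }

    /* Steps 6-7: write through p again, then read through q. */
    p[0] = 2;
    coupled = (q[0] == 2);

    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
out:
    munmap((void *)p, len);
    munmap((void *)q, len);
    return coupled;
#else
    return -1;
#endif
}
```

 If the analysis above holds, an affected kernel would be expected to
 make this return 0; a return of 1 would mean the two mappings stayed
 coupled across the fork.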
 
 Workarounds:
 
 (a) Use MAP_SHARED to create the mapping.
 
     Using MAP_SHARED means in step (6) NetBSD will not create a new
     anonymous object, so the anonymous object and backing page remain
     unchanged: both p and q will be mapped to A by the VM map, and to
     G by the page table.  But it also means the child will inherit
     them, and writes issued in the child will be reflected by reads
     issued in the parent -- which may pose security or synchronization
     issues.
 
 (b) Use minherit(MAP_INHERIT_NONE) or minherit(MAP_INHERIT_ZERO) on
     the writable mapping (as long as there is only one writable
     mapping).
 
     This way the writable mapping won't be cloned on fork, so it won't
     ever have the copy-on-write logic that breaks the coupling between
     the mappings.  Of course, this means the child _won't_ inherit the
     writable mapping, so writes by the child can't influence the
     readable mapping.
 
 (c) Don't fork; use posix_spawn.
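 
 The three workarounds can be sketched as small helpers.  This is an
 illustration only: the function names map_shared, exclude_from_fork
 and run_without_fork are hypothetical, the minherit call is guarded
 because MAP_INHERIT_NONE is not defined everywhere, and the
 _DEFAULT_SOURCE define is only there so glibc exposes MAP_ANON under
 strict -std modes.

```c
#define _DEFAULT_SOURCE 1   /* glibc only; assumed harmless elsewhere */

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>

#include <errno.h>
#include <spawn.h>
#include <stddef.h>

extern char **environ;

/*
 * (a) Shared anonymous mapping: fork does not make it copy-on-write,
 * but a forked child can write to the same pages.
 * Returns NULL on failure.
 */
void *
map_shared(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_ANON | MAP_SHARED, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/*
 * (b) Keep the single writable mapping out of forked children.
 * Returns 0 on success, -1 with errno set otherwise.
 */
int
exclude_from_fork(void *w, size_t len)
{
#ifdef MAP_INHERIT_NONE
    return minherit(w, len, MAP_INHERIT_NONE);
#else
    (void)w;
    (void)len;
    errno = ENOSYS;
    return -1;
#endif
}

/*
 * (c) Start a program with posix_spawnp instead of fork + exec.
 * Returns the program's exit status, or -1 on error or abnormal exit.
 */
int
run_without_fork(char *const argv[])
{
    pid_t pid;
    int status;

    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

 For example, a caller that currently uses fork() followed by exec
 could instead pass an argument vector such as { "sh", "-c", "exit 7",
 NULL } to run_without_fork and would get 7 back.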
 
 It would seem sensible to me for the coupled mappings to be copied
 wholesale into the child on write -- that is, the parent maintains
 coupled mappings of the same backing object, and the child maintains
 coupled mappings of a copy of the backing object.  This way, parent
 and child would get their own copy of the data without influencing the
 other, but each process still has coupled mappings like MAP_REMAPDUP
 originally created.
 
 Unfortunately, this will take a bit of work to implement correctly.
 
 As a stop-gap, we're considering making MAP_REMAPDUP implicitly apply
 minherit(MAP_INHERIT_NONE) to both mappings, so that at least calling
 fork() in a process that uses JIT compilation with the code sample in
 the mremap(2) man page won't break the parent.
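 
 Until something like that exists in the kernel, a program could
 approximate the stop-gap itself.  The sketch below is an assumption
 about how that might look, not the planned kernel change: the helper
 name remapdup_noinherit is hypothetical, and the body is compiled only
 where both MAP_REMAPDUP and MAP_INHERIT_NONE are defined.

```c
#include <sys/mman.h>

#include <stddef.h>

/*
 * Hypothetical helper: create a second mapping of the pages at w and
 * exclude both mappings from fork().  Returns the second mapping, or
 * NULL on error or where the required flags are not available.
 */
void *
remapdup_noinherit(void *w, size_t len)
{
#if defined(MAP_REMAPDUP) && defined(MAP_INHERIT_NONE)
    void *r = mremap(w, len, NULL, len, MAP_REMAPDUP);

    if (r == MAP_FAILED)
        return NULL;

    /* Neither mapping is cloned into a child, so neither goes COW. */
    if (minherit(w, len, MAP_INHERIT_NONE) == -1 ||
        minherit(r, len, MAP_INHERIT_NONE) == -1) {
        munmap(r, len);
        return NULL;
    }
    return r;
#else
    (void)w;
    (void)len;
    return NULL;
#endif
}
```

 The trade-off is the one described above: a forked child gets neither
 mapping, so any JIT state has to be rebuilt in the child if it is
 needed there.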
 

