Current-Users archive


Re: Another UVM wire page count underflow panic

> On Mar 15, 2019, at 2:48 PM, Robert Elz <kre%munnari.OZ.AU@localhost> wrote:

> Upon reflection, there is no hurry to fix this one, unlike the previous
> one which was screwing up the b5 tests - we (at least currently) have no
> tests which do anything as crazy as the code sequence to trigger this, so
> we can take our time and solve it properly.

Well, true, but I think a "fix before netbsd-9, with pull-ups to -8 and -7" is certainly a worthy goal.  After all, there is now a known sequence of calls that can cause a crash.

>  | POSIX's semantics could just as well be represented with a bit
>  | in a flags word,
> the one in the UVM map entry - yes, but that one isn't really the
> issue.   What matters is the pmap count, and even in posix that needs
> to be a count, as multiple processes can independently lock the same
> (shared) region, and neither one's unwire affects the wiring done by
> another.
> Unless my assumptions about what is what here are incorrect (which they
> easily could be) the count that matters is the one which needs to remain
> a count.

The pmap layer doesn't really have a count.  It just has a "this PTE is wired" bit.  When the vm_map_entry that covers that PTE transitions from "not-wired" to "wired", the PTE gets the wired bit; when the vm_map_entry transitions from "wired" to "not-wired", the PTE loses the bit.  It's really as simple as that.  The pmap layer doesn't assume a count, it just depends on the upper layers keeping track of the state transitions, and updating the bottom layer accordingly.
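To make the edge-triggered rule concrete, here is a minimal sketch in C.  It is a toy model, not the real UVM or pmap code: the struct names, field names, and the toy_entry_wire() / toy_entry_unwire() helpers are all made up for illustration, and it assumes a map entry that covers exactly one PTE.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins (hypothetical); not the real pmap or UVM structures. */
struct toy_pte {
	bool wired;		/* the only wiring state the pmap layer keeps */
};

struct toy_map_entry {
	int wired_count;	/* bookkeeping owned by the upper layer */
	struct toy_pte *pte;	/* the single PTE this toy entry covers */
};

/* Wire the entry; the PTE is touched only on the 0 -> 1 rising edge. */
static void
toy_entry_wire(struct toy_map_entry *e)
{
	if (e->wired_count++ == 0)
		e->pte->wired = true;
}

/* Unwire the entry; the PTE is touched only on the 1 -> 0 falling edge. */
static void
toy_entry_unwire(struct toy_map_entry *e)
{
	assert(e->wired_count > 0);	/* an unmatched unwire would underflow */
	if (--e->wired_count == 0)
		e->pte->wired = false;
}
```

Wiring the toy entry twice and unwiring it once leaves the PTE bit set; only the second unwire clears it.  The lower layer never sees the count, only the two transitions.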

The same goes for the backing pages -- you've probably noticed that the pages are either wired or unwired only at those rising and falling edges of vm_map_entry "wired-ness", but the pages, of course, have a count in them because there can exist multiple mappings for a page.
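A similar toy sketch shows why the page needs a real count even though each mapping only reports edges.  Again, this is an illustration under assumptions, not the UVM implementation: toy_page, toy_mapping, and the two helpers are hypothetical names, and each mapping is assumed to refer to a single page.

```c
#include <assert.h>

/* Hypothetical toy types; not the real vm_page or vm_map_entry. */
struct toy_page {
	unsigned wire_count;	/* number of mappings currently holding it wired */
};

struct toy_mapping {
	int wired_count;	/* per-mapping bookkeeping */
	struct toy_page *pg;	/* the page this mapping refers to */
};

/* The page count changes only on the mapping's rising edge... */
static void
toy_mapping_wire(struct toy_mapping *m)
{
	if (m->wired_count++ == 0)
		m->pg->wire_count++;
}

/* ...and on its falling edge. */
static void
toy_mapping_unwire(struct toy_mapping *m)
{
	assert(m->wired_count > 0);
	if (--m->wired_count == 0) {
		assert(m->pg->wire_count > 0);
		m->pg->wire_count--;
	}
}
```

With two mappings of the same page, each contributes at most one to the page's count, so one mapping unwiring does not disturb the wiring held through the other.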

UVM history lesson time!  In some ways it's slightly silly to even have a wired_count in the vm_map_entry, because vm_map_entry's are not really shared ... they exist only in a single vm_map, and they correspond to one or more PTEs in the pmap's tables (one pmap per vm_map)... but the count is in some ways an artifact of how uvm_vslock() / uvm_vsunlock() used to work ... they *used* to call uvm_map_pageable() (because the old Mach VM implementation used to call vm_map_pageable()) for doing physio and other things that necessitated wiring down user buffers so the kernel / devices could safely access them.  But that changed some 2 decades ago (again, I think this may have been my fault :-) for a couple of reasons:

	(1) munlock(2) and its semantics; you don't want it to unwire the buffers that a device is going to DMA into!

	(2) uvm_map_pageable() can fragment the map because of the entry clipping.

The transient wirings used by uvm_vslock() and uvm_vsunlock() were changed to use uvm_fault_wire() and uvm_fault_unwire() directly, to specifically fiddle with the wired-ness of the underlying pages, while leaving the vm_map_entry's unchanged.
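Reason (1) can be illustrated with one more toy sketch.  Everything here is hypothetical: the toy_mlock() / toy_munlock() / toy_vslock() / toy_vsunlock() helpers only mimic the shape of the idea, the map entry is assumed to cover a single page, and the entry's wiring is modelled as a plain flag.  The point is that a transient wiring taken directly on the page survives a munlock-style unwire of the map entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy types; not the real UVM structures. */
struct io_page {
	unsigned wire_count;
};

struct io_entry {
	bool wired;		/* user-visible (mlock-style) wiring of the entry */
	struct io_page *pg;	/* the page behind this toy entry */
};

/* mlock-style wiring goes through the map entry's rising edge. */
static void
toy_mlock(struct io_entry *e)
{
	if (!e->wired) {
		e->wired = true;
		e->pg->wire_count++;
	}
}

/* munlock-style unwiring goes through the map entry's falling edge. */
static void
toy_munlock(struct io_entry *e)
{
	if (e->wired) {
		e->wired = false;
		e->pg->wire_count--;
	}
}

/* Transient wiring for I/O: adjusts the page only, never the entry. */
static void
toy_vslock(struct io_entry *e)
{
	e->pg->wire_count++;
}

static void
toy_vsunlock(struct io_entry *e)
{
	assert(e->pg->wire_count > 0);
	e->pg->wire_count--;
}
```

In this model, toy_mlock() followed by toy_vslock() leaves the page count at 2; a toy_munlock() issued while the I/O is in flight drops it to 1, not 0, so the buffer stays wired until toy_vsunlock() runs.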

>  | I would suggest that the right way to fix this would be as follows:
> I think we ought to work out what the data structs should look like
> in the various possible cases - including mixed shm and m*() allocations,
> mappings, wiring, protection schemes - including where pages are
> mapped (either more than once in one process, or in different
> processes) in both forms (a page that is a shm in one place is mmap'd
> in another, and wired by one of them, or both, or neither).

This should be relatively straightforward... I'll see if I can put together a couple of diagrams this weekend between various kid / household duties (and also recovering from this bout of late winter flu that's kept me out of my $DayJob office for a couple of days, bleh).  The wiring propagation between the various layers is really all about rising and falling edges, and once you understand the rules, it's pretty easy to work out what the data structures at each layer should look like for any given scenario.  In fact, the current code mostly follows those rules; the bugs, it seems, are really in defining what constitutes a rising or falling edge.

> Until we know what it will look like, I don't think trying to find
> minimal code changes from what we have now will be productive.
> First we need an audit of everything that affects or uses the UVM
> mappings to see just what is required.  The shm stuff is easy that
> way, as they have a very small visible footprint - even if they are
> an ugly design.
> Tomorrow (or much later today, or whatever you want to call it!)
> kre

-- thorpej
