Current-Users archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: ATF t_mlock() babylon5 kernel panics



> On Mar 13, 2019, at 10:27 AM, Robert Elz <kre%munnari.OZ.AU@localhost> wrote:
> 
> Some progress.
> 
> #1 touching the buffer that malloc() returns (which is page aligned)
> made no difference - it is likely that the malloc llibrary would have
> done that in any case (the malloc is for just one page, so either it
> is resident (or paged) or it is a ZFoD page, and most likely not the
> latter - certainly not after this small mod.)
> 
> #2 changing the mlock_clip to avoid the sequence
> 
> 	mlock(buf, 0);
> 	munlock(buf, 0);
> 
> avoided all of the issues and problems, and the test passed.
> 
> What locking 0 bytes might achieve

Finding bugs :-)

> I have no idea, but that is
> what the test does:
> 
>        buf = malloc(page);
> /*...*/
>        for (size_t i = page; i >= 1; i = i - 1024) { 
>                err1 = mlock(buf, page - i);
>                if (err1 != 0)
>                        fprintf(stderr, "mlock_clip: page=%ld i=%zu,"
>                            " mlock(%p, %ld): %s\n", page, i, buf, page - i,
>                            strerror(errno));
>                err2 = munlock(buf, page - i);
> /*...*/
> 
> That first munlock() is where the panic (with the extra KASSERTs added)
> occurs.

Ok, well, I see some problematic code in sys_mlock() and sys_munlock(), but I don't think it's affecting this case (and it may in fact have the effect of making the test pass if a non-page-aligned buffer is passed):

        pageoff = (addr & PAGE_MASK);
        addr -= pageoff;
        size += pageoff;
        size = (vsize_t)round_page(size);

If you pass in (0x1004, 0), for example, you end up with:

	pageoff = 0x4;
	addr = 0x1000;
	size = 0x4 ROUND UP-> 0x1000;

It's not clear that's the actual intent ... seems like if you pass 0, you should get 0, regardless of alignment... but let's fix that problem LATER.

Since the logic for trunc / round is the same in both mlock and munlock, let's turn our attention to any logical differences in uvm_map_pageable().


If first looks up the entry that covers the range, and assigns the same value to "entry" and "start_entry".

The code paths for the "pageable" and "not pageable" are split, so we'll focus on "pageable" since that's the one that's panic'ing.

Assuming the region that's passed to munlock() is page-aligned (I'll use the example 0x1000 with size 0), we should see uvm_map_pageable() called with:

	start = 0x1000;
	end = 0x1000;

The first thing it does is call UVM_MAP_CLIP_START(), which ensures that "entry"'s start is the start address, splitting the entry as necessary (inserting the new split-off entry ahead of "entry").

Next it does a pass over all the entries in the range to ensure that every entry is actually wired and that there are no unmapped regions in the range.

The check it uses is:

	- If the current entry is not wired
	OR
	- If the current entry ends before the end of the range
	AND
		- This is the last entry
		OR
		- The next entry starts after the end of the current entry (hole)

	...then return EINVAL

The logic there seems correct, and you're not seeing EINVAL.

Now, AFTER that first loop, it does this:

	entry = start_entry;

...because it wants to pass over the range again.  This should be safe, because UVM_MAP_CLIP_START() is *supposed* to preserve the value of "entry".

The *second* loop uses the same termination logic as the first, and the is-entry-wired test is equivalent (look at the macro's expansion).

It's obvious from the panic that the bottom layer of this cake is freaking out because it's being asked to unwire something that's not actually wired, but I don't see how this can happen for a page-aligned zero-length region.

I would suggest instrumenting-with-printf the "new_pageable" case of uvm_map_pageable()

> The err1 = and err2 = are recent additions so that printf (and a
> similar one after the munlock) can indicate if anything goes wrong.
> 
> The mlock(buf, 0) (first time around the loop when page == i)
> call did not fail (err1 == 0).   Or at least it seemed to be OK
> (there was no output, but there might have been some coming when
> the munlock() caused the panic i I will try again (much much
> later today) with a sleep between mlock() and munlock() so any
> output has time to drain.
> 
> Anyway, this might give someone enough of a clue as to what is happening
> that that someone (who understands UVM and the pmap code) can fix things.
> 
> There's no indication in the mlock() man page, or in the posix spec,
> that len == 0 is anything but a valid call (one would expect it to be
> a no-op though, rather than scrambling the kernel data structs).
> 
> With that loop modified so i starts at page - 1024 everything is fine,
> 
> kre
> 

-- thorpej



Home | Main Index | Thread Index | Old Index