Port-xen archive


Re: Xen balloon driver rewrite



On 7 April 2011 20:05, Jean-Yves Migeon <jeanyves.migeon%free.fr@localhost> 
wrote:
> On Thu, 7 Apr 2011 19:10:29 +0530, "Cherry G. Mathew"
> <cherry.g.mathew%gmail.com@localhost> wrote:
>>>
>>> It's still at paper level. ATM, I'm still not sure if workqueue(9) is
>>> necessary: at a locking level, both will check the same "target"
>>> variable.
>>> The only difference is that workqueue(9) will spawn a thread context when
>>> necessary, compared to a thread that stays "sleepy" most of the time
>>> (possibly with a timeout, but it rapidly gets amortized).
>>
>> If I understand the design correctly, this effectively uses a queue to
>> serialise inflate/deflate requests. Is this really required? For
>> example, if a new request, say "inflate by 256M", is followed by
>> "deflate by 256M", would both the inflate and the deflate occur in
>> series (which would make the domU uvm thrash swap unnecessarily)?
>
> Nope; in my current repo, the thread (or workqueue...) will just loop, and
> recheck the target vs current reservation difference. There is only one
> target, shared between workers.
>

Ok that makes sense.
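
As a rough illustration of the loop described above, here is a userland C
sketch of workers converging on one shared target. It is not the driver
code: the struct, the function names and the page-at-a-time step are all
assumptions made for the example, and the locking around the shared
fields is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; all sizes are counted in pages. */
struct balloon_model {
	size_t target;	/* reservation requested by dom0, shared by all workers */
	size_t current;	/* pages currently reserved by the domU */
};

/*
 * Move the reservation one page toward the target.
 * Returns 1 if a page was moved, 0 if there was nothing left to do.
 */
static int
balloon_step(struct balloon_model *b)
{
	if (b->current < b->target) {
		b->current++;	/* balloon deflates: domU gets a page back */
		return 1;
	}
	if (b->current > b->target) {
		b->current--;	/* balloon inflates: domU gives a page up */
		return 1;
	}
	return 0;
}

/*
 * Worker body: re-check target vs. current on every iteration and
 * return as soon as they match. Returns the number of pages moved.
 */
static size_t
balloon_worker(struct balloon_model *b)
{
	size_t moved = 0;

	while (balloon_step(b))
		moved++;
	return moved;
}
```

In this model, if the target is lowered by 256M and raised again before
a worker gets to run, the worker finds target == current and returns
without moving a single page.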

> If you ask for inflate then a deflate of 256MiB, or the other way around,
> you will effectively spawn two workers. Both will work towards the same
> target though, and in this case, will rapidly exit.
>

I'm curious about the effect of lock contention when many outstanding
threads read and write the same shared variable, and about the load on
the scheduler / thread manager when multiple requests are outstanding.

Can't think of any other objections.

>> I'm not sure that a single kernel thread context is a lot of overhead.
>> Part of my design motivation was to be gentle with the VM system,
>> i.e. to minimise the rate of "spikes" in mem alloc/de-alloc.
>> I'm a little concerned about whether the workqueue will overflow when
>> hit with rapid-fire balloon change requests. Have you checked for
>> this case?
>
> (on paper) the workqueue is serialized. It is just used as a "create a
> thread, as the target value is now different from the current reservation"
> mechanism; the thread will only exit on error, or when the current
> reservation reaches the target. The only difference from the current code is
> that instead of having a thread going back to idle, it simply returns.
>
> With the "idle thread design", I have to handle two different situations:
> - one where current reservation reaches target: trivial
> - one where current reservation failed to reach target, due to memory
> exhaustion. Current code uses a feedback function, and updates ``target'' to
> reflect this. workqueue(9) makes it easier to handle, as I can log the error,
> return from the thread, and leave the target value alone without having to
> feed back a change in ``target''.
>

In the feedback case, the userland script/tool within the domU and
the dom0 (via xenstore) can see what's going on. Wouldn't this be
better than leaving both uninformed and under the impression that the
balloon succeeded?
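
To make the two behaviours concrete, here is a small userland C model of
an inflate that runs out of memory part-way. Everything in it is
hypothetical (the names, the `avail` budget standing in for memory
exhaustion, the `feedback` flag); it does not use any real xenstore or
kernel interface, and it assumes an observer only ever sees `target`.

```c
#include <assert.h>
#include <stddef.h>

enum balloon_status { BALLOON_OK, BALLOON_NOMEM };

/* Hypothetical model; all sizes are counted in pages. */
struct balloon_state {
	size_t target;	/* what dom0 / the domU tool can observe */
	size_t current;	/* pages currently reserved by the domU */
};

/*
 * Give pages up until `current` reaches `target`, or until the `avail`
 * pages the domU can spare are used up (a stand-in for memory
 * exhaustion). With `feedback` set, a failure rewrites `target` to the
 * reservation actually reached; without it, `target` is left untouched
 * and only the return value says that the request failed.
 */
static enum balloon_status
balloon_inflate(struct balloon_state *b, size_t avail, int feedback)
{
	while (b->current > b->target) {
		if (avail == 0) {
			if (feedback)
				b->target = b->current;
			return BALLOON_NOMEM;
		}
		avail--;
		b->current--;
	}
	return BALLOON_OK;
}
```

With feedback, target and current agree after a failure, so whoever
reads the target sees how far the balloon really got. Without it, the
target still holds the value that was asked for, and the failure is
only visible in the domU's own log.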

>> The most obvious option that comes to mind is to use a pool. The alloc
>> is quite small, so it shouldn't be that much of an overhead.
>> OTOH, if an allocation that small is failing, memory pressure is
>> pretty severe, so I think KM_NOSLEEP would be the more apt design, and
>> the driver should refuse to inflate.
>
> It's hard to find the right solution for balloon inflation. I inadvertently
> triggered oom_killer with Linux domUs multiple times, and it's a real pain
> when it starts killing the wrong processes. That's the purpose of the
> balloon.mem-min value, so domU can refuse to balloon below a certain
> threshold.
>

Yes, but balloon.mem-min is a pretty arbitrary/rule-of-thumb value,
right? Whereas the scenario above would be a real-world situation
where we got feedback that a high mem-pressure situation has been
reached.
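
For reference, the static floor amounts to something like the following
userland C sketch. The function name is hypothetical, and treating
"refuse" as "clamp the request to the floor" is an assumption; the
driver could just as well reject the request outright.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: never accept a reservation target below the
 * configured floor (the balloon.mem-min value). Sizes are in pages.
 */
static size_t
balloon_clamp_target(size_t requested, size_t mem_min)
{
	return requested < mem_min ? mem_min : requested;
}
```

The floor here is a fixed number chosen in advance, whereas a failed
small allocation is a signal measured at the moment the pressure
actually occurs.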

Cheers,

-- 
~Cherry

