Port-xen archive


Re: Xen balloon driver rewrite

On Thu, 7 Apr 2011 19:10:29 +0530, "Cherry G. Mathew" <cherry.g.mathew%gmail.com@localhost> wrote:
It's still at paper level. ATM, I'm still not sure whether workqueue(9) is necessary: at a locking level, both designs check the same "target" variable. The only difference is that workqueue(9) spawns a thread context when necessary, whereas a dedicated thread stays "sleepy" most of the time (possibly with a timeout, but its cost is rapidly amortized).

If I understand the design correctly, this effectively uses a queue to
serialise inflate/deflate requests. Is this really required? E.g., if
a new alloc request, say "inflate by 256M", is followed by "deflate
by 256M", would this mean that both the inflate and the deflate would
occur in series (which would make the domU uvm thrash swap
unnecessarily)?

Nope; in my current repo, the thread (or workqueue...) will just loop, and recheck the target vs current reservation difference. There is only one target, shared between workers.

If you ask for an inflate and then a deflate of 256MiB, or the other way around, you will effectively spawn two workers. Both will work towards the same target though, and in this case both will rapidly exit.
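To illustrate the "one shared target" idea, here is a minimal user-space sketch in C. It is not the driver code: struct balloon_state, balloon_set_target() and balloon_worker() are hypothetical names, sizes are plain page counts, and the hypercalls are replaced by simple arithmetic. It only shows that two requests which cancel each other out leave both workers with nothing to do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state: page counts, not the driver's real structures. */
struct balloon_state {
	size_t current;	/* pages currently reserved by the domain */
	size_t target;	/* pages the domain should end up with */
};

/* A new request only overwrites the single shared target. */
static void
balloon_set_target(struct balloon_state *bs, size_t target)
{
	bs->target = target;
}

/*
 * One worker run: move current towards target in bounded chunks,
 * rereading the target on every pass. Returns the number of passes.
 */
static unsigned
balloon_worker(struct balloon_state *bs, size_t chunk)
{
	unsigned passes = 0;

	assert(chunk > 0);
	while (bs->current != bs->target) {
		size_t diff;

		if (bs->current < bs->target) {
			diff = bs->target - bs->current;
			bs->current += diff < chunk ? diff : chunk;
		} else {
			diff = bs->current - bs->target;
			bs->current -= diff < chunk ? diff : chunk;
		}
		passes++;
	}
	return passes;
}
```

If the target is lowered and then restored before either worker runs, each call to balloon_worker() sees current == target and returns after zero passes, so nothing is done twice.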

I'm not sure that a single kernel thread context is a lot of overhead.
Part of my design motivations were based around being gentle with the
VM system; i.e., minimise the rate of "spikes" in mem alloc/de-alloc.
I'm a little concerned about whether the workqueue will respond to
rapid-fire balloon change requests with a workqueue overflow. Have you
checked for this case?

On paper, the workqueue is serialized. It is just used as a "create a thread, since the target value now differs from the current reservation" mechanism; the worker only exits on error, or when the current reservation reaches the target. The only difference from the current code is that instead of a thread going back to idle, the worker simply returns.

With the "idle thread design", I have to handle two different situations:
- one where the current reservation reaches the target: trivial
- one where the current reservation fails to reach the target, due to memory exhaustion. The current code uses a feedback function, and updates ``target'' to reflect this. workqueue(9) makes this easier to handle, as I can log the error, return from the thread, and leave the target value alone, without having to feed a change back into ``target''.
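The second case can be sketched roughly as follows, again as a user-space simulation and not the actual driver. page_take() is a made-up stand-in for the allocation that can fail, free_pages is an invented counter, and only the direction that lowers the reservation is modelled. The point is that on failure the worker logs, returns an error, and never writes to target.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical state, in pages. */
struct balloon_state {
	size_t current;		/* current reservation */
	size_t target;		/* requested reservation */
	size_t free_pages;	/* pages the fake allocator can still give */
};

/* Stand-in allocator: hands out one page, fails once exhausted. */
static int
page_take(struct balloon_state *bs)
{
	if (bs->free_pages == 0)
		return ENOMEM;
	bs->free_pages--;
	return 0;
}

/*
 * Worker body: lower the reservation one page at a time.
 * Returns 0 when the target is reached, or an errno value when the
 * allocator gives up. The target is left untouched in both cases.
 */
static int
balloon_work(struct balloon_state *bs)
{
	while (bs->current > bs->target) {
		int error = page_take(bs);

		if (error != 0) {
			fprintf(stderr,
			    "balloon: stopped at %zu pages (target %zu)\n",
			    bs->current, bs->target);
			return error;
		}
		bs->current--;
	}
	return 0;
}
```

Because target is unchanged after a failure, a later run of the same function simply resumes from wherever the reservation stopped.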

The most obvious option that comes to mind is to use a pool. The alloc
is quite small sized, so it shouldn't be that much of an overhead.
OTOH, if that small a size of allocation is failing, memory pressure
is pretty huge, so I think KM_NOSLEEP would be a more apt design, and
the driver should refuse to inflate.

It's hard to find the right solution for balloon inflation. I inadvertently triggered oom_killer with Linux domUs multiple times, and it's a real pain when it starts killing the wrong processes. That's the purpose of the balloon.mem-min value, so domU can refuse to balloon below a certain threshold.
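A threshold of this kind could look like the sketch below. This is an assumption about the policy, not a description of the driver: balloon_request_target() is a hypothetical helper, mem_min stands in for the balloon.mem-min value expressed in pages, and a request below the floor is rejected outright (clamping it to the floor would be another possible choice).

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical policy check: accept a requested target only if it
 * keeps the domain at or above mem_min pages. On refusal, *target is
 * left untouched.
 */
static bool
balloon_request_target(size_t *target, size_t requested, size_t mem_min)
{
	if (requested < mem_min)
		return false;	/* refuse to balloon below the floor */
	*target = requested;
	return true;
}
```

With a floor of 1024 pages, a request for 512 pages is refused and the previous target stays in effect, while a request for 2048 pages is accepted.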

Jean-Yves Migeon
