tech-kern archive


Re: [ANN] Lunatik -- NetBSD kernel scripting with Lua (GSoC project



On Mon Nov 08 2010 at 20:32:14 -0200, Lourival Vieira Neto wrote:
> >> > Right... so how do you restore the kernel to a valid state?
> >>
> >> Why wouldn't it be a valid state after a script crash? I didn't get
> >> that. Can you give an example?
> >
> > I *guess* what David means is that to perform decisions you need a
> > certain level of atomicity.  For example, just drawing something out of
> > a hat, if you want to decide which thread to schedule next, you need to
> > make sure the selected thread object exists from fetching the candidate
> > list through the actual scheduling.  For this you use a lock or a reference
> > counter or whatever.  So if your lua script crashes between fetching the
> > candidates and doing the actual scheduling, you need some way of releasing
> > the lock or decrementing the refcounter.  While you can of course push an
> > "error branch stack" into lua or write the interfaces to follow a strict
> > model where you commit state changes only at the last possible moment,
> > it is additional work and probably quite error-prone.
> >
> > Although, on the non-academic side of things, if your thread scheduler
> > crashes, you're kinda screwed anyway.
> >
> 
> Hi Antti,
> 
> Sorry for the delay. I agree: we need a certain level of atomicity. I
> think that level should be provided by the libraries that expose
> kernel internals to Lua (binding libraries), the kernel code that
> calls Lua and the Lunatik state's mutex. The functions of the binding
> libraries should not finish their execution with locks (or other
> resources) held. If it is really necessary, the binding libraries
> could provide functions to validate the state, after the Lua
> execution. However, I don't think it is a good idea to allow scripts
> to call functions that take a lock without releasing it. Moreover, we
> can use the Lunatik state's mutex to perform the synchronization
> (between the kernel and the script code).
> 
> In your scheduling example, we can use a refcounter (as you said)
> stored in the Lua state and protected by the Lunatik state's mutex.
> Thus, if our Lua script crashes between fetching and scheduling, the
> caller can trace and treat that appropriately (e.g., restoring the
> refcounter, deleting that script function and calling a predefined
> function to perform the thread scheduling). Although, I think a better
> approach for that problem would be to provide a scheduling function
> that checks if the selected thread exists and fails if not (returning
> an error code to the script).

How would it check if a thread has exited?  You either need to keep some
log of object lifecycle (and when do you free that information, i.e. how
is it fundamentally different from anything else listed above?), give every
object some UUID to make sure the identifiers were not recycled so that
you're sure you get the same object when you look it up again, or register
some sort of callback from thread exit to the lua code.
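
To make the "identifiers were not recycled" option concrete, here is a
minimal userland sketch of a generation-counted handle table.  Everything
in it (struct handle, struct slot, handle_alloc, handle_release,
handle_lookup, NSLOTS) is a hypothetical name made up for illustration;
none of it is Lunatik or NetBSD kernel API, and an int stands in for the
thread object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSLOTS 4	/* hypothetical table size */

/* Hypothetical handle given to the script instead of a raw pointer. */
struct handle {
	uint32_t slot;	/* index into the table */
	uint32_t gen;	/* generation of the slot when the handle was issued */
};

struct slot {
	uint32_t gen;	/* bumped every time the slot is released */
	bool	 live;	/* true while an object occupies the slot */
	int	 obj;	/* stand-in for the thread object */
};

static struct slot table[NSLOTS];

/* Occupy a free slot; returns false if the table is full. */
bool
handle_alloc(int obj, struct handle *h)
{
	for (uint32_t i = 0; i < NSLOTS; i++) {
		if (!table[i].live) {
			table[i].live = true;
			table[i].obj = obj;
			h->slot = i;
			h->gen = table[i].gen;
			return true;
		}
	}
	return false;
}

/* Called when the object goes away (e.g. on thread exit). */
void
handle_release(struct handle h)
{
	assert(h.slot < NSLOTS);
	if (table[h.slot].live && table[h.slot].gen == h.gen) {
		table[h.slot].live = false;
		table[h.slot].gen++;	/* invalidates outstanding handles */
	}
}

/* Returns the object, or NULL if the handle is stale or out of range. */
int *
handle_lookup(struct handle h)
{
	if (h.slot >= NSLOTS)
		return NULL;
	if (!table[h.slot].live || table[h.slot].gen != h.gen)
		return NULL;
	return &table[h.slot].obj;
}
```

A handle issued before a release fails the lookup even after the slot has
been reused for a different object.  Note that this only detects recycling:
the lookup and the subsequent use still have to happen under whatever lock
protects the table, and a 32-bit generation can in principle wrap, so it
does not make the atomicity question above go away.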

For a scheduler, an oops every now and then from scheduling the wrong
thread might not be a big deal, but if you mess up credentials, for example,
it's a bigger oops.

> In short, I think the functions provided for the scripts should be
> self-contained and all the locks should be managed by the kernel code.
> If functions of the binding libraries need to share and synchronize
> their execution state (e.g., a refcounter), they need to do so by
> storing the desired state in Lua.

Having some working code would be more convincing ;)

