On 11/11/10 18:07, Mindaugas Rasiukevicius wrote:
Hello, Similar questions were raised a few times.. let's go through this again.
Sorry if I am rehashing old stuff then...
Johnny Billquist<bqt%softjar.se@localhost> wrote:With mutex_spin, you instead store the original spl at the first mutex_spin_enter, and later calls to mutex_spin_enter can only possibly raise the ipl further. At mutex_spin_exit, we do not lower the spl again, until the final mutex_spin_exit, which resets the spl to the value as before any mutex was held.This is correct.The cause a slightly different behaviour, as the spl will continue to possibly be very high, even though you are only holding a low ipl mutex. While it obviously don't cause a system to fail, it can introduce delays which might not be neccesary, and could in theory cause interrupts to be dropped that needn't be.In theory, it might. But see below.Is this a conscious design? Do we not expect/enforce mutexes to be released in the reverse order they were acquired?Very conscious, made about 4 years ago. There are very few (would say a tiny fraction, in all tree) where we have the following locking cases: mutex_spin_exit(a); <- e.g. where a has IPL_HIGH <some code> <- some code still runs at IPL_HIGH mutex_spin_exit(b); <- e.g. where b has IPL_VM One such case is in tty subsystem, which needs proper locking (and perhaps generally revamped.. anyone?). Generally, such cases should be fixed.
Hum? So that was introduced with the new locking code then? Because back when we used splraise/splx, the above would not have worked.
Also, AFAIK effect was was actually measured during newlock2 branch development. Supporting such micro-optimisation would mean quite a bit of complexity in critical mutex(9) interface with practically no gain.
My initial reaction was that each mutex should have just stored what ipl it came from, and just restored it back to that when the mutex was released. That would also allow us to skip the mutex count in cpu_info, as well as the global saved ipl in cpu_info, but then I realized that this solution would break if people actually wrote code like
lock(a) lock(b) release(a) release(b)or if code actually relied on the ipl staying at a high level, even though you don't have any mutex that motivates this anymore. Otherwise, I would think that each mutex keeping the old ipl level for restoring at release is both prettier and shorter, and is safer. And since a mutex can't be held by multiple users at the same time, there is no risk that the old ipl would become clobbered either. (Although right now I'm trying to figure out why I seem to lock a spin mutex twice when the system is coming up, and it might be in the MI bits of code...)
Also, there are a few places in the code in the kernel, where atomic_cas is called explicitly. Wouldn't it be better if such code were abstracted out, so we didn't depend on specific cpu instructions in the MI code?atomic_cas_*() routines are mandatory part of atomic_ops(3/9) interface. I believe almost any modern operating system should have it these days. We model algorithms using them. I believe MI code, design-wise, should look to the future, rather than the past. :)
Hmm. Hard to argue about this. While I think it's nice if we try to keep the kernel agnostic, the user api is not something I'm arguing about changing. But it would be nice if we could leave it possible to do things in different ways when it's not really the effect of a CAS that is needed, but we'd still like to keep it in a way that allowed the compiled code to be as nice and efficient as possible.
Johnny