tech-kern archive


Re: kernel condvars: how to use?

	Hello.  I don't consider myself an expert with this stuff, but I've
spent quite a bit of time converting kernel code from using spl()/splx() in
5.2 to using mutexes and condvars, with some success.  Here are some notes
that may be helpful in your work:

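1.  You must initialize mutexes with mutex_init(), as Taylor noted, and pick
the kind of mutex you want and the IPL at which you want it to run.  Adaptive
mutexes (the kind that may sleep) can't be taken in hard interrupt context;
a handler running there needs a spin mutex initialized at that interrupt's
IPL.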
If you're sharing your IPL with other drivers that may want to hold mutexes
at the same level, particularly ones that might be doing some operation
like copying data into or out of user space, it's good to structure your
code in such a way that you won't try to run when they do.

2.  Initialize your condvar with cv_init().

3.  Often, you can replace s = spl(ipl) with mutex_enter(my_mutex) and
splx(s) with mutex_exit(my_mutex).  It's generally considered to be a very
bad idea to sleep while holding a mutex, much as it is while executing code
inside an spl'd stanza.  The exception is cv_wait() and its variants, which
release the mutex while asleep and retake it before returning.

4.  If you run into lock contention when debugging your code, pay careful
attention to who holds the lock at the time of the panic.  I found times
when I was locking against myself in non-obvious ways.

5.  Read the manual pages for mutex(9), condvar(9) and rwlock(9), and, when
done, read them again.  Then, look at working examples in the code.
Eventually, it will click in your head and begin to make sense.
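
To make notes 1 through 3 a bit more concrete, here is a rough sketch of the
usual wait/wake shape between a read() routine and a callout handler.  It is
not taken from any real driver.  So that it compiles and runs outside the
kernel, the kernel calls (mutex_init(), mutex_enter(), mutex_exit(),
cv_init(), cv_wait_sig(), cv_broadcast()) are replaced by thin pthread
stand-ins; in a real driver you would include <sys/mutex.h> and
<sys/condvar.h> and drop that part.  The mydev_* names, the softc layout,
run_demo(), and the choice of IPL_SOFTCLOCK (assuming the waker is a callout
running at softclock level) are all my own assumptions for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins (assumption: NOT the kernel implementation). */
typedef pthread_mutex_t kmutex_t;
typedef pthread_cond_t  kcondvar_t;
#define MUTEX_DEFAULT  0   /* placeholder values; the kernel defines these */
#define IPL_SOFTCLOCK  0
static void mutex_init(kmutex_t *m, int type, int ipl)
{ (void)type; (void)ipl; pthread_mutex_init(m, NULL); }
static void mutex_enter(kmutex_t *m) { pthread_mutex_lock(m); }
static void mutex_exit(kmutex_t *m)  { pthread_mutex_unlock(m); }
static void cv_init(kcondvar_t *cv, const char *wmesg)
{ (void)wmesg; pthread_cond_init(cv, NULL); }
static int cv_wait_sig(kcondvar_t *cv, kmutex_t *m)
{ return pthread_cond_wait(cv, m); }   /* no signal handling in this shim */
static void cv_broadcast(kcondvar_t *cv) { pthread_cond_broadcast(cv); }

/* Hypothetical per-device state, protected by sc_lock. */
struct mydev_softc {
	kmutex_t   sc_lock;
	kcondvar_t sc_cv;
	bool       sc_ready;   /* the condition the reader waits for */
	int        sc_data;
};

/* Notes 1 and 2: initialize the mutex and the condvar once, at attach. */
static void mydev_attach(struct mydev_softc *sc)
{
	mutex_init(&sc->sc_lock, MUTEX_DEFAULT, IPL_SOFTCLOCK);
	cv_init(&sc->sc_cv, "mydevrd");
	sc->sc_ready = false;
	sc->sc_data = 0;
}

/* Waker side (the callout handler): change state under the lock, wake. */
static void mydev_tick(void *arg)
{
	struct mydev_softc *sc = arg;

	mutex_enter(&sc->sc_lock);
	sc->sc_data = 42;
	sc->sc_ready = true;
	cv_broadcast(&sc->sc_cv);
	mutex_exit(&sc->sc_lock);
}

/* Sleeper side (read): re-check the condition in a loop around the wait. */
static int mydev_read(struct mydev_softc *sc, int *out)
{
	int error = 0;

	mutex_enter(&sc->sc_lock);
	while (!sc->sc_ready) {
		/* Drops sc_lock while asleep, retakes it before returning. */
		error = cv_wait_sig(&sc->sc_cv, &sc->sc_lock);
		if (error)
			break;
	}
	if (error == 0) {
		assert(sc->sc_ready);
		*out = sc->sc_data;
		sc->sc_ready = false;
	}
	mutex_exit(&sc->sc_lock);
	return error;
}

/* Demo only: a thread plays the part of the callout firing. */
static void *tick_thread(void *arg) { mydev_tick(arg); return NULL; }

int run_demo(void)
{
	struct mydev_softc sc;
	pthread_t t;
	int value = -1, error;

	mydev_attach(&sc);
	if (pthread_create(&t, NULL, tick_thread, &sc) != 0)
		return -1;
	error = mydev_read(&sc, &value);
	pthread_join(t, NULL);
	return error == 0 ? value : -1;
}
```

The point of the loop around cv_wait_sig() is that the sleeper never assumes
the wakeup means the data is there; it rechecks sc_ready with the lock held,
so it works whether the waker runs before or after the reader goes to sleep.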

Hope these imprecise notes are somewhat helpful.


On Dec 7,  6:24pm, Mouse wrote:
} Subject: kernel condvars: how to use?
} I'm trying to write some kernel code, interlocking between an interrupt
} (in my case, a callout()-called function) and a driver read() function.
} I'm using 5.2, so if this is because of bugs that have been fixed since
} then, that's useful information.  (And anyone who isn't interested
} because I'm on such an old version need read no further.)
} I noted that the interfaces I have historically used for this - spl*(),
} sleep(), and wakeup() - are documented as deprecated in favour of
} condvar(9), mutex(9), and rwlock(9).  So I wrote some code using a
} condvar and a mutex, and the system promptly deadlocked.  I got into
} ddb, which told me it was inside intr_biglock_wrapper():
} db{0}> tr
} breakpoint() at netbsd:breakpoint+0x5
} comintr() at netbsd:comintr+0x53a
} Xintr_ioapic_edge7() at netbsd:Xintr_ioapic_edge7+0xeb
} --- interrupt ---
} x86_pause() at netbsd:x86_pause
} intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x16
} Xintr_ioapic_level5() at netbsd:Xintr_ioapic_level5+0xf3
} --- interrupt ---
} x86_pause() at netbsd:x86_pause+0x2
} cdev_poll() at netbsd:cdev_poll+0x6d
} VOP_POLL() at netbsd:VOP_POLL+0x5e
} pollcommon() at netbsd:pollcommon+0x265
} sys_poll() at netbsd:sys_poll+0x5b
} syscall() at netbsd:syscall+0xb9
} db{0}> 
} On reflection, I think I know why.  Userland's syscall handler took the
} mutex in preparation for cv_wait_sig(), the interrupt happens, my code
} is called (verified with a printf), and it tries to take the same mutex
} so it can cv_broadcast().  Of course, the mutex is held and, because
} it's held by code which can't run until the interrupt handler exits,
} will never be released.  Then, when a hardware interrupt hit it found
} the biglock held....
} Clearly, I'm doing something wrong.  But I can't see what.  I can't see
} how to use the condvar/mutex primitives without provoking the above
} failure mode.  And they appear to still be the current recommended way,
} based on what I could find, so I'm presumably just missing something.
} Any hints what?
} I can of course provide more information if it would help, but I'm not
} sure what would be useful to mention here.
} /~\ The ASCII				  Mouse
} \ / Ribbon Campaign
}  X  Against HTML
} / \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
>-- End of excerpt from Mouse
