tech-kern archive


Re: Serious WAPL performance problems



Hello Brian,

Brian Buhrow <buhrow%nfbcal.org@localhost> wrote:
>       Hello.  I think you two are talking past each other.  While it's
> true that having a lock name isn't necessarily enough information to
> diagnose a problem, it's a lot better than having nothing.  I've worked
> on systems where all you could get was an address of a lock, which was
> different on every system, and as a result, it was nearly impossible to
> diagnose issues in the field at all.  With lock names, you can search
> through the source code and find where  those locks are taken, and,
> potentially, where they're released.  Recently I found a problem with the
> ahc(4) driver where it issues a command to a controller and goes to sleep
> waiting for a response.  If the controller goes out to lunch and never
> answers the call, the driver gets stuck forever in that spot.  With the
> lock name, I was quickly able to determine what was wrong and,
> potentially, fix the issue. If I'd only had an address, I'd still be
> scratching my head about the issue.  I agree with rmind that names aren't
> always useful, but I'd sure rather have them than not.  And, the more
> descriptive and unique they are in the source tree, the more useful they
> are. -Brian

Well, the alternative is not nothing.  If the case is simple or obvious
enough that you can figure out the problem just from the lock name, then
you can also figure it out from the backtrace (which is more informative).
Now that we have crash(8), getting one should not be harder than invoking
ps(1).

Otherwise, we need more information than the names.  The vast majority
of the "tstile" problem reports we get are related to vnodes (merely
because our VFS subsystem still has very unfortunate locking and thus
various issues, although many of them were fixed during the netbsd-5
cycle).  Sticking the label "some-vnode" on the lock will not help to
solve these problems, as they are more sophisticated.

Hence my point that the addition of lock names is more cosmetic than
practical.  It could be useful to add naming for rwlock(9), which may
often be used for heavy-weight serialisation (held for a long time).  For
mutexes it would bring little benefit at the cost of extra space and lots
of code churn (plus small things like some structures containing a
kmutex_t being aligned and padded to fill whole cache lines for
efficiency).  A much more practical approach would be to improve our
tools, like crash(8) and the LOCKDEBUG facility.
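
To make the padding point concrete, here is a minimal userland sketch.  It
does not use the real kmutex_t: struct plain_lock, struct named_lock, the
hot_* structures and the 64-byte CACHE_LINE value are all hypothetical
stand-ins chosen for illustration.  It only shows how a structure that was
hand-packed to fill one cache line can spill into a second one once the
lock grows by a name pointer.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE 64	/* assumed cache-line size, not queried from the CPU */

/* Hypothetical stand-in for a mutex: a single owner word. */
struct plain_lock {
	volatile uintptr_t owner;
};

/* The same lock with a name pointer added for diagnostics. */
struct named_lock {
	volatile uintptr_t owner;
	const char *name;
};

/* Number of payload words that fill one cache line next to a plain lock. */
#define NWORDS ((CACHE_LINE - sizeof(struct plain_lock)) / sizeof(uintptr_t))

/* Hypothetical hot structure, packed to occupy exactly one cache line. */
struct hot_plain {
	alignas(CACHE_LINE) struct plain_lock lock;
	uintptr_t words[NWORDS];
};

/* Same payload, but the lock now carries a name. */
struct hot_named {
	alignas(CACHE_LINE) struct named_lock lock;
	uintptr_t words[NWORDS];
};

/* Print both sizes and check the layout described above. */
void
check_layout(void)
{
	printf("plain: %zu bytes, named: %zu bytes\n",
	    sizeof(struct hot_plain), sizeof(struct hot_named));
	assert(sizeof(struct hot_plain) == CACHE_LINE);
	assert(sizeof(struct hot_named) > CACHE_LINE);
}
```

Calling check_layout() from a small main() prints the two sizes.  Under
these assumptions the named variant no longer fits in one line, so every
such structure would need its payload re-tuned by hand, which is part of
the churn mentioned above.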

-- 
Mindaugas

