Subject: Re: PostgreSQL
To: Michael Richardson <mcr@sandelman.ottawa.on.ca>
From: Garrett D'Amore <garrett_damore@tadpole.com>
List: tech-net
Date: 02/03/2006 07:49:01
Michael Richardson wrote:
>
> I think it comes down to:    THREADING IS HARD
>
> WAY too many people try to write threaded code and fail.
>
> As an example, look at gnokii. It used to run fine on NetBSD, then it
> grew threading, and stopped working.
>
> Why? So that morons could write:
>      while(1) {
>           if(dtr is true) break;
>      }
>
> (I haven't tried to run it on NetBSD in three years. I suspect it has
> gotten better)
>
> Writing good multithreaded user-space code is really *HARDER* than
> writing kernel code.  If you think about this, you realize why
> "professional" application code is often so crappy, huge, bloated,
> buggy.
I disagree with the statement that it is "harder".  This may be true in
the NetBSD kernel, but that is only because we have the giant lock and
no true kernel threads.   Once NetBSD drops the giant lock and gains
true kernel threads, the statement will be patently false.
Work in the Solaris kernel for a while, and you discover things like
priority inversion, interrupt handlers, reentrancy, etc.  I encourage
anyone with your opinion to take a close look at the Solaris STREAMS
framework, and then tell me if you still agree.
Kernel work is also more challenging in some regards because of
interrupts and interrupt priorities, and because you have
high-priority contexts where you cannot sleep (so you cannot, e.g.,
just alloc a chunk of memory).
And this is without even exploring the interesting problems of virtual
memory, caching, DMA, polling devices, etc.
The problem with userland code is that there are a bunch of people who
just don't really understand concurrent programming, or good design, and
many of those people are hacking away at userland programs.  For
whatever reason, the user/kernel boundary seems to act as a natural
filter, so the class of developers working below that boundary seems a
bit more clueful.  There are exceptions, but I think the generalization
holds.
>
> Learning to write co-routines and select/poll interfaces takes a bit
> more brain power at first than writing the above busy-loop, but is SO
> MUCH easier to debug that the effort pays off.
select/poll has upsides and downsides.  The upside is typically better
single-CPU performance, and *sometimes* easier debugging.  The downside
is no SMP scalability.
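To make the select/poll style concrete, here is a minimal sketch of a
single-threaded readiness loop.  It is an illustration only, not code
from gnokii or PostgreSQL: the function name run_event_loop and the use
of local socket pairs as stand-ins for real connections are my
assumptions.  It uses Python's standard selectors module, which wraps
select/poll/kqueue depending on the platform.

```python
import selectors
import socket


def run_event_loop(messages):
    """Deliver each message over its own socket pair and return what a
    single-threaded selector loop read back, in the original order.

    Hypothetical helper: socket pairs stand in for client connections.
    """
    sel = selectors.DefaultSelector()
    received = [b"" for _ in messages]
    open_readers = 0

    for index, message in enumerate(messages):
        writer, reader = socket.socketpair()
        reader.setblocking(False)
        # data=index lets the loop know which slot a ready socket feeds.
        sel.register(reader, selectors.EVENT_READ, data=index)
        open_readers += 1
        writer.sendall(message)  # small messages fit in the socket buffer
        writer.close()           # reader sees the data, then end-of-file

    # One thread services every descriptor: no locks, but also no use
    # of a second CPU.
    while open_readers:
        for key, _events in sel.select(timeout=1.0):
            chunk = key.fileobj.recv(4096)
            if chunk:
                received[key.data] += chunk
            else:
                sel.unregister(key.fileobj)
                key.fileobj.close()
                open_readers -= 1

    sel.close()
    return received
```

All state lives in one thread, which is where the easier debugging
comes from; the same property is why the loop cannot spread across
CPUs without extra work.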
It's a *lot* easier to write good, correct programs that work well
single-threaded, and then expand that to MT code (particularly when you
have little or no shared state) than trying to get it all right with
select/poll.
The problems with MT programs really have a lot more to do with shared
state, I think.  Folks who don't understand locking considerations (how
to avoid deadlock, priority inversion, etc.) struggle with it.
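One common locking discipline for avoiding deadlock is to always
acquire locks in a single global order.  The sketch below shows that
idea and nothing more; the Account class, its rank field, and the
transfer/demo functions are hypothetical names invented for the
example, and it assumes the two accounts passed to transfer are
distinct.

```python
import threading


class Account:
    def __init__(self, rank, balance):
        self.rank = rank        # hypothetical global lock-ordering key
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Move amount from src to dst (assumed to be different accounts)."""
    # Always take the lower-ranked lock first, whichever way the money
    # flows, so two opposite transfers can never each hold one lock
    # while waiting for the other.
    first, second = sorted((src, dst), key=lambda acct: acct.rank)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount


def demo(rounds=1000):
    """Run opposing transfers concurrently and return final balances."""
    a = Account(rank=1, balance=100)
    b = Account(rank=2, balance=100)

    def worker(src, dst):
        for _ in range(rounds):
            transfer(src, dst, 1)

    threads = [
        threading.Thread(target=worker, args=(a, b)),
        threading.Thread(target=worker, args=(b, a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.balance, b.balance
```

If transfer instead locked src first and dst second, the two workers in
demo could each grab one lock and wait forever for the other.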
>
> I think that the postgresql people know this.
The postgres site has a good explanation of this in their mail
archives.  Their considerations:
1) improved robustness of multi-process concurrency vs. MT (one thread
doesn't take out entire app)
2) heap fragmentation considerations with big MT programs
3) portability and robustness of threads on all of their target platforms
4) not all threading platforms are equally efficient -- compare e.g. MS
Windows and Solaris
5) ultimately, threading for them does very little for DB performance --
most of the benefit is in connection setup, and the belief is that this
should not be considered in the hot-path of a DB application --
connection caching and pooling are seen as a better way to optimize this
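As a rough illustration of the pooling idea in item 5, the sketch below
pays the connection setup cost a fixed number of times and then reuses
those connections.  It does not use any real PostgreSQL driver: the
ConnectionPool class and the connect factory passed to it are
assumptions made up for the example.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool; connect is any zero-argument
    factory that returns a connection-like object."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # setup cost is paid here, once

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free (raises queue.Empty if a
        # timeout is given and expires).
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back for the next caller
```

A caller would wrap each unit of work in "with pool.connection() as
conn:", so connection setup stays out of the hot path no matter how
many requests are served.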
All that said, there is some thought that limited use of MT could
improve e.g. parallel sorts, etc.  The idea here would be to still use
process per connection, but each process could use multiple threads to
improve SMP scalability with big queries.  Nobody has stepped up to the
plate to do the work though -- it's "non-trivial".
    -- Garrett
>
>
> -- 
> Garrett D'Amore, Principal Software Engineer
> Tadpole Computer / Computing Technologies Division,
> General Dynamics C4 Systems
> http://www.tadpolecomputer.com/
> Phone: 951 325-2134  Fax: 951 325-2191
>