Subject: Re: pkgsrc SpamAssassin Performance miserable under load
To: None <tech-pkg@NetBSD.org>
From: Klaus Heinz <k.heinz.jan.vier@onlinehome.de>
List: tech-pkg
Date: 01/17/2004 22:22:00
Chuck Yerkes wrote:

> It handles around 500,000 inbound messages/day.  70% on one
> machine.

There are people on the spamassassin-talk mailing list managing large
sites. Maybe they can provide valuable input.

> Runs ok for a while, then the CPUs peg.  LA of 35 (at which point
> sendmail refuses mail).

Did you try the -m option for spamd? It limits the number of children
spamd runs in parallel and will be turned on by default in SA 2.70
with a value of 5.
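
As a rough sketch, assuming spamd is started from an rc.d script that
honours a spamd_flags variable in /etc/rc.conf (the variable names are
an assumption here, check your script), the limit could be set like
this:

```shell
# /etc/rc.conf fragment (sketch; variable names are assumed, not verified
# against the installed rc.d script)
spamd=YES
# -m caps the number of spamd children running in parallel.
# 5 is the value mentioned for SA 2.70; tune it for your CPUs and memory.
spamd_flags="-m 5"
```

When starting spamd by hand, the same flag goes on the command line,
for example 'spamd -d -m 5'.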
 
You mention using a milter. Do you use a global Bayes database, and how
big is it?
I think I remember reading about performance problems once the database
grows too big. A periodic 'sa-learn --force-expire' may help you limit
the growth of your Bayes DB. See also the EXPIRATION section of the
sa-learn man page for the options related to expiration.
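
A minimal sketch of such a periodic job, meant to be run from cron as
the user that owns the global Bayes database (that ownership, and the
SA_LEARN override variable, are assumptions for illustration):

```shell
#!/bin/sh
# Periodic Bayes expiry (sketch). Run from cron as the user that owns
# the global Bayes database; adjust to your setup.
SA_LEARN=${SA_LEARN:-sa-learn}    # hypothetical override if not on PATH

if command -v "$SA_LEARN" >/dev/null 2>&1; then
    # Force an expiry run regardless of whether one is considered due.
    "$SA_LEARN" --force-expire
else
    echo "sa-learn not found, skipping expiry" >&2
fi
```

Running it at a quiet time of day keeps the expiry from competing with
mail scanning for CPU.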

SA 2.61 reduced the memory consumption of the Bayes token expiration
(bug 2805), so it may be useful to use 2.61 (i.e. the current version in
pkgsrc) instead of 2.60.

If spamd is running on the same machine as your spamc processes, you
can reduce overhead by using UNIX domain sockets instead of TCP (see
the man pages for spamc and spamd).
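
Roughly, and assuming a socket path that both sides can reach (the
path and the variable names below are only placeholders), the two ends
would be configured along these lines:

```shell
# Sketch: point spamd and spamc at the same UNIX domain socket.
SOCKET=/var/run/spamd.sock               # hypothetical path

# Daemon side: listen on the socket instead of a TCP port.
spamd_flags="--socketpath=${SOCKET}"

# Client side: connect through the socket, e.g.
#   spamc $spamc_args < message
spamc_args="-U ${SOCKET}"
```

The socket file has to be readable and writable by whatever user runs
spamc (your milter, in this case).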

> I'm looking into the details of how spamassassin is built (I've
> told it not to use ssl for connections already) and how perl is
> built.

SSL is built into the spamc binary but will be used only if it is
turned on with appropriate options for spamc and spamd. I doubt the
optional support for SSL is causing problems.


HTH,
     Klaus Heinz