Subject: Re: db->seq never gets to end
To: None <netbsd-help@NetBSD.org>
From: Jeremy C. Reed <reed@reedmedia.net>
List: netbsd-help
Date: 08/29/2007 18:20:14

On Tue, 14 Aug 2007, Alistair Crooks wrote:

> On Thu, Aug 09, 2007 at 10:57:39PM -0500, Jeremy C. Reed wrote:
> > Hopefully this list is okay ...
> > 
> > I am using:
> > 
> >         for (r = db->seq(db, &dbk, &dbd, R_FIRST); !r;
> >             r = db->seq(db, &dbk, &dbd, R_NEXT)) {
> > 
> > But sometimes it never ends.
> > 
> > I added a counter. And it would get to hundreds of thousands and data 
> > would repeat. I only have less than 1500 keys. I'd also get dbd.size that 
> > would be hundreds of thousands of bytes (but should only be 20 bytes).
> > 
> > Is there any way to ask the hash(3) how many elements it has?
> 
> Maybe it's just me, but I didn't even realise that you could
> do a sequential scan of a hashed database. dbopen(3) says:
> 
>               R_LAST  and  R_PREV  are  available  only  for  the
>               DB_BTREE and DB_RECNO access methods  because  they
>               each  imply  that  the  keys have an inherent order
>               which does not change.
> 
> Given that statement, I don't see how R_FIRST and R_NEXT are
> any different, but I'm getting old and confused.
> 
> Was this software meant to use a different version of Berkeley
> db?
> 
> Regards,
> Alistair


I never saw any reply to your first thought.

The "spamd" software was written to use Berkeley DB as provided in the 
base install of OpenBSD. I assume that it is nearly the same as ours.

My spamd databases continue to get corrupted and I have many log entries 
like:

Aug 29 12:26:46 ca spamd[11901]: can't delete 74.220.163.20 
mail.pwhosting.com <> <Anne@lists.reedmedia.net> from spamd db (No buffer 
space available)

Aug 29 12:25:43 ca spamd[11901]: queueing deletion of et.paqueta.com.br <> <Kief
fer@lists.reR!UF.PUF2uUF^Q

(notice the missing closing >)

Aug 29 17:04:31 c-0500 spamd[28330]: can't delete Rhein-Neckar.DE> 
<reed@reedmedia.net>5^TUF^DSUF^UhUF^C from spamd db (No buffer space 
available)


Should a Berkeley DB key and/or data be sanitized to replace special 
characters before using?

I have been using spamd for over a year. Previously spamd used BTREE, and 
since January it has been using HASH. Their CVS log says: "Using DB_BTREE 
for spamd is wrong, order is never required and the rebalancing really 
slags big databases."

Note that I see these problems on two different 3.1 servers (one i386 and 
the other XEN3_DOMU on i386).

The side effects are: it often fails with "bogus entry in spamd 
database", so my pf spamd-white table is not updated; new GREY entries are 
created when a WHITE entry for the IP already exists; the pf table is 
sometimes replaced with only a small number of the WHITE entries; and 
sometimes spamd loops through database entries, thinking it has hundreds 
of thousands of entries and using up all memory (I stopped that by forcing 
a maximum limit on the number of entries it can loop through).
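
A minimal sketch of that kind of guard follows. It is not the actual 
spamd change, only an illustration under stated assumptions. The names 
rec_view, scan_guard, scan_guard_step, record_looks_sane and MAX_KEY_SIZE 
are hypothetical, and rec_view merely stands in for the data/size pair of 
a DBT so the fragment compiles without <db.h>. The 20 byte data size comes 
from the earlier report; the rule that keys hold only printable or 
whitespace bytes is an assumption.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the data/size pair of a DBT from <db.h>,
 * so that this sketch compiles without the db(3) headers.
 */
struct rec_view {
	const void *data;
	size_t size;
};

#define EXPECTED_DATA_SIZE 20	/* from the report: data should be 20 bytes */
#define MAX_KEY_SIZE 1024	/* assumption: generous upper bound for a key */

/* Hypothetical guard that caps how many records one scan may visit. */
struct scan_guard {
	size_t seen;
	size_t max;
};

/* Count one more record; return 1 while within the cap, 0 once over it. */
static int
scan_guard_step(struct scan_guard *g)
{
	return ++g->seen <= g->max;
}

/* Return 1 if the record looks plausible, 0 if it should be skipped. */
static int
record_looks_sane(const struct rec_view *key, const struct rec_view *val)
{
	const unsigned char *p = key->data;
	size_t i;

	if (val->data == NULL || val->size != EXPECTED_DATA_SIZE)
		return 0;
	if (p == NULL || key->size == 0 || key->size > MAX_KEY_SIZE)
		return 0;
	for (i = 0; i < key->size; i++) {
		/* assumption: keys hold only printable or whitespace bytes */
		if (!isprint(p[i]) && !isspace(p[i]))
			return 0;
	}
	return 1;
}
```

Inside the db->seq() loop this would mean calling scan_guard_step() once 
per iteration and breaking out when it returns 0, and skipping any record 
for which record_looks_sane() returns 0 (with dbk and dbd mapped onto 
rec_view). It only limits the damage; it does not explain why the hash 
scan repeats in the first place.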

  Jeremy C. Reed