pkgsrc-Changes-HG archive


[pkgsrc/trunk]: pkgsrc Update spamprobe to 1.0a, patch sent via IRC by the ma...



details:   https://anonhg.NetBSD.org/pkgsrc/rev/2bee52af1b6e
branches:  trunk
changeset: 483746:2bee52af1b6e
user:      hubertf <hubertf%pkgsrc.org@localhost>
date:      Thu Nov 18 12:46:53 2004 +0000

description:
Update spamprobe to 1.0a, patch sent via IRC by the maintainer.

Changes:
        * MimeLineReader.cc: 1.0 branch - fixed MBX record header regex
        * spamprobe.cc (main): Added exec and exec-shared commands.
          (import_words): modified import command to allow negative values
          to be specified in the import file.
        * Applied patches for configure.in and aclocal.m4 contributed by
          Siggy Brentrup for Debian compatibility.
        * FrequencyDBImpl_pbl.cc: Invokes new WordData methods to allow
          storing data in big endian format.
        * WordData.h: Added optional support for storing counts/flags
          in big endian order for data portability.
        * MimeLineReader.cc (readMBXFileHeader): UW IMAP MBX file format
          is now auto detected from the first line of the mailbox file.
        * spamprobe.cc (process_extended_options): Removed -o imap-mbx
          option.
        * spamprobe.cc (process_extended_options): Added -o imap-mbx
          option to process files as UW IMAP MBX files rather than mbox
          files.
        * MimeLineReader.cc (readLine): Added support for UW IMAP MBX file
          format.
        * spamprobe.cc (process_stream): Added -o tokenized option
          to allow people to use an external tokenizer with spamprobe.
        * SpamFilter.cc (scoreToken): Reduced sorting overhead by
          pre-computing an integer sort value with sorting priorities
          reflected in the value.  This eliminates several calculations
          inside of the sort routine.
        * SpamFilter.cc (computeRatio): Capped ratios in calculations to
          within MIN_PROB and MAX_PROB.  Widened that range.  This avoids
          problems with div/0 and makes it easier to sort terms.
        * spamprobe.cc (dump_words): dump command can now optionally
          accept a regular expression as an argument and will only dump
          terms matching the regular expression.
          (purge_terms): Added purge-terms command to purge from the
          database all terms matching a regular expression.
        * spamprobe.cc (main): Fixed bug in command line processing.
          Thanks to Jem for bug report.
        * spamprobe.cc (train_on_message): Code simplified.  Eliminated
          redundant recalculation of scores.
          (train_on_message): Timestamps are no longer updated by
          train-spam and train-good commands.  They are still updated by
          train command.
          (main): Fixed assertion if -P option is specified in a read only
          operation.
        * spamprobe.cc (main): Added -C command line option to allow users
          to specify their own min word count.
        * SpamFilter.cc (SpamFilter): Set default minimum word count back
          to 5 (was 3).
        * spamprobe.cc (process_extended_options): Removed "alt-score"
          from -o options list because it distributes scores poorly.  New
          formula achieves the same end with better accuracy.  Added
          "orig-score" option to allow people to continue using the old
          formula.  Added "honor-xstatus-header" option for people whose
          mail server uses X-Status: rather than Status: for the deleted
          flag.
          (main): Added -l command line option to allow people to set
          their own spam threshold if they don't like the default value.
        * SpamFilter.cc (scoreMessage): Added a new scoring formula based
          on Paul's but taking the nth root of spam and good probabilities
          to produce more evenly distributed scores.  Lowered the spam
          threshold to 0.6 to keep accuracy about the same as the original
          formula.  Highest score seen for a ham so far in tests is 0.44
          so 0.6 seems safe.  Made the new formula the default instead of
          Paul's.
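
Illustration (not part of the upstream changelog): the scoreMessage and
computeRatio entries above can be sketched roughly as follows. This is
not SpamProbe's actual code. It assumes "Paul's" formula is the usual
product-of-probabilities combination, and that "taking the nth root"
means using the geometric mean of the spam and good probabilities. The
MIN_PROB/MAX_PROB values, the function names, and the use of ">=" for
the threshold test are hypothetical; only the 0.6 threshold comes from
the text above.

```python
import math

# Hypothetical clamp bounds; the changelog names MIN_PROB and MAX_PROB
# but does not give their values.
MIN_PROB = 1e-6
MAX_PROB = 1.0 - MIN_PROB

# Spam threshold stated in the changelog for the new formula.
SPAM_THRESHOLD = 0.6


def clamp_prob(p):
    """Keep a per-term probability strictly inside (0, 1)."""
    return max(MIN_PROB, min(MAX_PROB, p))


def product_score(probs):
    """Assumed original formula: prod(p) / (prod(p) + prod(1 - p))."""
    spam = 1.0
    good = 1.0
    for p in (clamp_prob(q) for q in probs):
        spam *= p
        good *= 1.0 - p
    return spam / (spam + good)


def nth_root_score(probs):
    """Assumed new formula: same ratio, but on the nth roots of the products."""
    if not probs:
        return 0.5  # no evidence either way (assumption)
    clamped = [clamp_prob(q) for q in probs]
    n = len(clamped)
    # Sum logs instead of multiplying to avoid underflow on long messages.
    spam = math.exp(sum(math.log(p) for p in clamped) / n)
    good = math.exp(sum(math.log(1.0 - p) for p in clamped) / n)
    return spam / (spam + good)


def is_spam(probs):
    """Threshold test; whether the real comparison is inclusive is an assumption."""
    return nth_root_score(probs) >= SPAM_THRESHOLD
```

With term probabilities such as [0.9, 0.9, 0.9, 0.2], the product form
lands very close to 1, while the nth-root form stays nearer the middle
of the range. That is consistent with the "more evenly distributed
scores" described above and with the lower 0.6 threshold. Clamping
keeps both products non-zero, so the final division cannot hit zero.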

diffstat:

 doc/CHANGES             |  3 ++-
 mail/spamprobe/Makefile |  5 ++---
 mail/spamprobe/distinfo |  6 +++---
 3 files changed, 7 insertions(+), 7 deletions(-)

diffs (38 lines):

diff -r 967cfcb22501 -r 2bee52af1b6e doc/CHANGES
--- a/doc/CHANGES       Thu Nov 18 12:33:01 2004 +0000
+++ b/doc/CHANGES       Thu Nov 18 12:46:53 2004 +0000
@@ -1,4 +1,4 @@
-$NetBSD: CHANGES,v 1.7895 2004/11/18 12:33:01 markd Exp $
+$NetBSD: CHANGES,v 1.7896 2004/11/18 12:46:53 hubertf Exp $
 
 Changes to the packages collection and infrastructure in 2004:
 
@@ -5323,3 +5323,4 @@
        Updated wv2 to 0.2.2 [shannonjr 2004-11-18]
        Updated guile to 1.6.5 [wiz 2004-11-18]
        Updated R to 2.0.1 [markd 2004-11-18]
+       Updated spamprobe to 1.0a [hubertf 2004-11-18]
diff -r 967cfcb22501 -r 2bee52af1b6e mail/spamprobe/Makefile
--- a/mail/spamprobe/Makefile   Thu Nov 18 12:33:01 2004 +0000
+++ b/mail/spamprobe/Makefile   Thu Nov 18 12:46:53 2004 +0000
@@ -1,7 +1,6 @@
-# $NetBSD: Makefile,v 1.10 2004/10/03 00:12:54 tv Exp $
+# $NetBSD: Makefile,v 1.11 2004/11/18 12:46:53 hubertf Exp $
 
-DISTNAME=      spamprobe-0.9h
-PKGREVISION=   1
+DISTNAME=      spamprobe-1.0a
 CATEGORIES=    mail
 MASTER_SITES=  ${MASTER_SITE_SOURCEFORGE:=spamprobe/}
 
diff -r 967cfcb22501 -r 2bee52af1b6e mail/spamprobe/distinfo
--- a/mail/spamprobe/distinfo   Thu Nov 18 12:33:01 2004 +0000
+++ b/mail/spamprobe/distinfo   Thu Nov 18 12:46:53 2004 +0000
@@ -1,4 +1,4 @@
-$NetBSD: distinfo,v 1.5 2004/02/03 20:49:34 hubertf Exp $
+$NetBSD: distinfo,v 1.6 2004/11/18 12:46:53 hubertf Exp $
 
-SHA1 (spamprobe-0.9h.tar.gz) = 34a4d5dc622570cc109a92f1a4b2222d4d3b08ff
-Size (spamprobe-0.9h.tar.gz) = 161164 bytes
+SHA1 (spamprobe-1.0a.tar.gz) = 4077b4b5280b29fa08b31b3131ee5cf005faefd7
+Size (spamprobe-1.0a.tar.gz) = 165747 bytes


