Subject: Re: spam detection algorithms
To: None <tech-userlevel@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-userlevel
Date: 11/17/2002 21:31:43
> One of my recent emails got bounced by the recipient MTA with the
> message:

> ... while talking to sparkle-4.rodents.montreal.qc.ca.:
> >>> HELO snowdrop.l8s.co.uk
> <<< 501 HELO argument must be a valid domain name.

Yah, sparkle is my house mailserver, and that's the response it gives
when it gets a definite "this domain does not exist" from the DNS.  I
also find the conversation in my log.

> In this case I'm actually wondering about the validity of the check
> being used.

For what value of "validity"?  The SOA record for l8s.co.uk says
ns1.ukdnsservers.co.uk is the master nameserver for the domain; when I
query it directly asking for type ANY for snowdrop.l8s.co.uk, I get a
"this domain does not exist" error back.  It appears the test did what
it was designed to do.
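
For concreteness, here is a minimal sketch of the logic described above,
in Python.  It is not the actual code running on sparkle.  The names
DnsResult, helo_reply and fake_lookup are hypothetical, the lookup is a
hard-coded stand-in for a real resolver, and the "250 ok" reply for
everything short of a definite "does not exist" answer is an assumption.

```python
from enum import Enum
from typing import Callable


class DnsResult(Enum):
    EXISTS = "exists"
    NXDOMAIN = "nxdomain"  # definite "this domain does not exist"
    UNKNOWN = "unknown"    # timeout, server failure, anything indefinite


def helo_reply(helo_arg: str, lookup: Callable[[str], DnsResult]) -> str:
    """Return the SMTP reply line for a given HELO argument."""
    if lookup(helo_arg) is DnsResult.NXDOMAIN:
        # Only a definite negative answer from the DNS triggers the refusal.
        return "501 HELO argument must be a valid domain name."
    # Assumption: indefinite answers are not treated as a refusal.
    return "250 ok"


def fake_lookup(name: str) -> DnsResult:
    """Hypothetical stand-in for a real DNS query; answers are hard-coded."""
    table = {
        "mail.example.org": DnsResult.EXISTS,
        "snowdrop.l8s.co.uk": DnsResult.NXDOMAIN,
    }
    return table.get(name, DnsResult.UNKNOWN)
```

Passing the lookup in as a function keeps the decision separate from the
resolver, so the refusal rule can be exercised without touching the
network.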

> I could also (probably) persuade the HELO= line to have the hostname
> that DNS would return if asked to do a reverse lookup of the IP
> address of the interface the mail is being sent from.  However this
> would be the rather uninformative
> host62-6-97-249.in-addr.btopenworld.com (or some similar address).

host213-122-63-173.in-addr.btopenworld.com, in the case I found in my
logs.  Yes, I believe sendmail can be told what its name is for mail
purposes regardless of what the machine's hostname is set to.  It's
been a while since I cared about it, so I don't recall exactly how;
fuzzy memory says it's a Dj line in the .cf....

> In fact that string could be set to anything at all - so assuming
> that an address that cannot be looked up implies that the email is
> spam is only really requesting that the spammers modify their MTA.

Yes, it would be - if that were how I considered it.  I find it has
greater value at stopping spam sent through open relays and spam sent
through smarthosts of poorly run ISPs.

I do it partly in a pragmatic sense - it did help my spam load when I
turned it on, and based on other signs, I think a lot of the stuff it's
stopping is still spam even today - and partly in an idealistic sense,
in that I believe that HELOing with a nonexistent domain is an invalid
configuration and that refusing to talk to invalidly configured hosts
is reasonable and mostly good.  (There's argument on both points.)

> Of course the mail would have been bounced by the 'your system is
> probably an open relay' check anyway.

Actually, no; my mailserver makes no such check.  Apparently most of
the open relays left run into one of the other checks I do - rDNS and
HELO arguments are the biggest ones.

Anyway, this doesn't have much to do with tech-userlevel stuff.  I'll
try sending off-list and see if I can suss out a working communication
channel with David other than tech-userlevel.

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B