python simplicity
OK, I needed to pull only the digits out of a string like ‘h00009492a802.ne.client2.attbi.com’. Pretty simple to do in Python:
def onlyNum(s):
    num = [n for n in s if n.isdigit()]
    return ''.join(num)
Voila. And the result?
>>> s = 'h00009492a802.ne.client2.attbi.com'
>>> onlyNum(s)
'000094928022'
The idea was to count the digits in the sending machine's hostname, as found in a mail header, to help determine if the message is SPAM. Since most spam originates from addresses like the one above, to me it would seem reasonable to say:
nums_from_header = onlyNum(header_hostname)
if len(nums_from_header) > 4:
    SPAM = True
else:
    SPAM = False
It’s not perfect (and I wound up not using it in favor of a DNS blacklist), but Python makes it pretty easy to conceptualize.
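For anyone who wants to play with it, here is a rough sketch of the same check wrapped up as a function. The names `only_num` and `looks_like_spam_host` and the `max_digits` parameter are made up for this example, and the threshold of 4 is just the guess from above, not a tuned value.

```python
def only_num(s):
    """Return just the digit characters of s, in their original order."""
    return ''.join(ch for ch in s if ch.isdigit())


def looks_like_spam_host(hostname, max_digits=4):
    """Heuristic only: True when hostname contains more than max_digits digits.

    max_digits=4 mirrors the threshold used in the post; it is an assumption.
    """
    return len(only_num(hostname)) > max_digits


# The attbi.com hostname has 12 digits, so it trips the check.
print(looks_like_spam_host('h00009492a802.ne.client2.attbi.com'))
# A hostname with no digits does not.
print(looks_like_spam_host('mail.example.com'))
```

A plain digit count will also flag legitimate senders whose hostnames happen to contain many digits, which is part of why it is only a hint and not a verdict.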
(I’m glad that I can evangelize Python to all of you who come here to read stuff about Curtis. I can just see you all shaking your heads at this sort of thing. Makes me smile.)
cw
May 6th, 2004 at 12:56 pm
Curtis, Python, Indians. Whatever.
But while we’re here:
“The idea was to check the amount of numerical digits in a sending machines domain name in a mail header to help determine if it is SPAM”
How does that help? Is it just an extension of the idea that a spam server is named in such a manner?
Also, what is a “DNS blacklist”?
May 7th, 2004 at 10:36 pm
“Is it just an extension of the idea that a spam server is named in such a manner?”
You’ve got it. Most legit email does not originate from hosts with lots of digits in the hostname. Conversely, most DSL/cable/dialup hosts have a lot of digits in their hostnames. My thought was that filtering out email from hosts with lots of digits in their hostnames would curb some spam.
A DNS blacklist is a list that contains the host names of known spammers, mail servers that have been compromised and used as spam relays, and (on some lists) dynamically assigned addresses. You use a list like this, maintained by a third party, and configure your mail server to reject mail from any host on the list.
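To make that a little more concrete, here is a minimal sketch of how a lookup against such a list is commonly done. It assumes an IP-based list that follows the usual convention: reverse the IPv4 octets, append the list's zone, and see whether the name resolves. The zone `dnsbl.example.org` is a placeholder and not a real list, and the function names are made up for this example. In practice the mail server does this for you once it is configured.

```python
import socket


def dnsbl_query_name(ip, zone):
    """Build the lookup name: reversed IPv4 octets followed by the list's zone."""
    octets = ip.split('.')
    if len(octets) != 4 or not all(o.isdigit() and 0 <= int(o) <= 255 for o in octets):
        raise ValueError('expected a dotted-quad IPv4 address: %r' % ip)
    return '.'.join(reversed(octets)) + '.' + zone


def is_listed(ip, zone='dnsbl.example.org'):
    """Return True if the zone has an entry for ip. Needs network access.

    The default zone is a placeholder; substitute the list you actually use.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # The name did not resolve: not listed, or the lookup itself failed.
        return False


# Only builds the query name, so this line runs without touching the network.
print(dnsbl_query_name('192.0.2.1', 'dnsbl.example.org'))
```

Note that this sketch cannot tell "not listed" apart from a failed DNS lookup, so a real setup would want to handle lookup errors more carefully.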
May 10th, 2004 at 4:22 pm
Nerd! Nerd! Nerd! Christian, this might be the all time best blog I’ve ever read. The only thing more nerdish would be if I started raving about the brown thrasher I saw today.