I’m Teaching SpamAssassin About the Spam in My Maildirs

Posted 1 year, 6 months ago at 9:07 pm by Eric. 0 comments

SpamAssassin (SA) was letting through more spam than I liked, so I began looking for ways to improve its classification.  SA includes Bayesian filtering, a great technique for classifying spam once you train it.  SA’s Bayesian filter needs to scan about 200 spam and 200 ham messages before it becomes effective.

I wanted a way to train SA’s Bayesian filter without a lot of work on my part.  SA provides a utility called sa-learn that trains the Bayesian filter when fed spam or ham.  Python includes libraries for dealing with various mailbox formats, including Maildir, which my mail server uses.  I could see all the pieces; I just had to work out the details.  So here’s what I did:

I created a folder named “Learn as Spam” and one named “Learn as Ham”.  I already had a folder named “Trash” and another named “Junk”.  The “Junk” folder is where SA currently puts mail it thinks is spam.

Next I wrote a script that runs SA’s sa-learn program to scan the messages in various Maildir folders on my server.  After sa-learn does its thing, the script moves messages from the Learn as Ham folder into the Trash folder, and messages from the Learn as Spam folder into the Junk folder.
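A minimal sketch of that train-then-refile loop, not the actual script, could look like this in Python.  It assumes the folders live under the Maildir root as Maildir++ subdirectories (for example “.Learn as Spam”), which is the layout Python’s mailbox module expects, and that sa-learn is on the PATH and accepts a Maildir folder path as its target.  The names ROUTES, run_sa_learn, move_all and train_and_file are made up for illustration.

```python
import mailbox
import subprocess
from pathlib import Path

# Assumed routing, following the post: (source folder, sa-learn flag, destination).
ROUTES = [
    ("Learn as Spam", "--spam", "Junk"),
    ("Learn as Ham", "--ham", "Trash"),
]


def run_sa_learn(flag, folder_path):
    """Train the Bayesian filter on every message in one Maildir folder."""
    # Assumes sa-learn is installed; check=True raises if it exits non-zero.
    subprocess.run(["sa-learn", flag, str(folder_path)], check=True)


def move_all(source, destination):
    """Move every message from one Maildir folder to another; return the count."""
    moved = 0
    for key in list(source.keys()):
        # Copying the MaildirMessage keeps its flags (seen, replied, ...).
        destination.add(source[key])
        source.discard(key)
        moved += 1
    return moved


def train_and_file(maildir_root, learn=run_sa_learn):
    """Train on each Learn folder, then refile its messages; return counts."""
    root = mailbox.Maildir(str(maildir_root), create=False)
    counts = {}
    for source_name, flag, destination_name in ROUTES:
        source = root.get_folder(source_name)
        destination = root.get_folder(destination_name)
        if len(source) == 0:
            counts[source_name] = 0
            continue  # nothing to learn, so skip the sa-learn call
        # Maildir++ convention assumed: folder "X" is the directory ".X".
        learn(flag, Path(maildir_root) / ("." + source_name))
        counts[source_name] = move_all(source, destination)
    return counts
```

Calling train_and_file with the path to the Maildir root from a cron job would be one way to run it periodically.  One caveat of this simple version: a message that lands in a Learn folder between the sa-learn call and the move gets refiled without being learned.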

Now, to train SA on a spam email, I just move it to the Learn as Spam folder, and you can guess what I do with ham.  To make things one step easier, I set up Apple Mail to use the Learn as Ham folder as its Trash folder.  This means that when I delete an email, it’s automatically moved into the Learn as Ham folder.  Nice and easy.

As an additional bonus, the messages SA is trained on don’t get deleted right away.  They go into either my Junk or Trash folder, each of which retains messages for 30 days.  So if I accidentally file an email in the wrong place, I have 30 days to detect and fix the error.  I simply drag the message from wherever it is into the appropriate Learn folder, and SA re-learns the email.

Sound useful?  Here’s the script.

WordPress

Posted 1 year, 9 months ago at 10:21 pm by Jennifer. 0 comments

I’d been avoiding WordPress because I tried it out a while back and found it rather unpolished compared to Movable Type. I’ve been using MT for years, and though I’m not the most prolific blogger, I have managed to run the full gamut of emotion over MT. I used to think it was totally the cat’s pajamas, but the last few times I tried to use it I ended up tearing my hair out over bugs and idiosyncrasies. I couldn’t get even simple unordered lists to work in the Markdown editor, and I couldn’t get any of the supposedly “universal” themes from The Style Contest to work… it was just getting ridiculous. I feel like Six Apart has completely abandoned MT. I mean, they have only a smattering of employees, yet they own MT, TypePad, Vox, LiveJournal… it’s ridiculous. Anyway, I am fairly pleased at how much nicer WordPress seems this time around. Even though I don’t always agree with Matt Mullenweg and I hear the WP codebase is kind of a pile, I am officially giving up on MT. For the time being.