SpamAssassin Maintenance and Plugin Development
One of the areas I work in most frequently is spam filtering. A lot of my professional life has been devoted not just to setting up and maintaining filtering systems, but also to increasing their accuracy by developing SpamAssassin plugins and feedback mechanisms.
As well as writing custom rules, I've authored several custom plugins for clients (sadly none were made public - either similar plugins existed, or they were too client-specific to be suitable for widespread use). These include:
- An OCR (Optical Character Recognition, for detecting spam text in images) plugin, which took into account image type and dimensions, and had a more flexible approach to scoring than the existing plugins.
- Detection of the sender's operating system using p0f (passive fingerprinting). This allowed us to penalise mail from Windows Vista/XP/9x hosts, since the majority of mail servers use either some flavour of UNIX/Linux or a server-orientated version of Windows.
- Scanning of MS Word documents, PDFs, etc. for spammy content.
- Detection and filtering of so-called backscatter (a spammer sends mail with your address as the sender, and you receive all the bounce messages).
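To make the backscatter case concrete, here is a minimal Python sketch of one common heuristic. It is not the client plugin described above (which was a SpamAssassin plugin and was never published); it simply assumes that your own servers stamp outgoing mail with a Message-ID ending in a known domain, so a bounce that quotes a foreign Message-ID cannot be for mail you really sent. The domain `mail.example.com` and all function names are hypothetical.

```python
from email import message_from_string
from email.message import Message
from typing import Optional

# Hypothetical: the domain our own servers put into outgoing Message-IDs.
LOCAL_MESSAGE_ID_DOMAIN = "mail.example.com"


def is_bounce(msg: Message) -> bool:
    """Return True if the message looks like a delivery status notification."""
    if msg.get_content_type() == "multipart/report":
        return True
    return_path = (msg.get("Return-Path") or "").strip()
    sender = (msg.get("From") or "").lower()
    return return_path == "<>" or "mailer-daemon" in sender


def original_message_id(msg: Message) -> Optional[str]:
    """Find the Message-ID of the bounced original, if the bounce includes it."""
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype == "message/rfc822":
            payload = part.get_payload()
            # An attached message is parsed into a one-element list.
            if isinstance(payload, list) and payload:
                return payload[0].get("Message-ID")
        elif ctype == "text/rfc822-headers":
            # Only the headers of the original were returned; parse them.
            return message_from_string(part.get_payload()).get("Message-ID")
    return None


def is_backscatter(msg: Message) -> bool:
    """A bounce whose quoted original did not come from our servers."""
    if not is_bounce(msg):
        return False
    mid = original_message_id(msg)
    if mid is None:
        return False  # cannot tell, so be conservative
    return not mid.strip().endswith("@" + LOCAL_MESSAGE_ID_DOMAIN + ">")
```

Treating a bounce with no quoted Message-ID as "unknown" rather than as backscatter is a deliberately cautious choice here; a real deployment might score that case separately.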
Mail servers come in all shapes and sizes, and I have experience with the following setups:
- Postfix + SpamAssassin
- Exim + SpamAssassin
- Qmail + SpamAssassin (via Qmail-Scanner and .qmail) + Dspam
- Sendmail + MIMEDefang + SpamAssassin
The last of these was a heavily modified setup, with a web front-end which allowed users to control most aspects of their filtering (whether to use Bayes, which DNSBLs to use, required score, whitelisting, etc).
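As a rough illustration of what such a front-end has to produce, the Python sketch below turns one user's choices into SpamAssassin preference lines. How the original setup stored and applied preferences is not described here, so treat the `FilterPrefs` class and `render_user_prefs` function as hypothetical. The directives themselves (`required_score`, `use_bayes`, `whitelist_from`, and `score RULE 0` to switch a rule off) are standard SpamAssassin configuration, and `RCVD_IN_XBL` is used only as an example DNSBL rule name.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FilterPrefs:
    """One user's filtering choices, as a web form might collect them."""
    use_bayes: bool = True
    required_score: float = 5.0
    # DNSBL rules the user has switched off, e.g. "RCVD_IN_XBL".
    disabled_dnsbl_rules: List[str] = field(default_factory=list)
    whitelist: List[str] = field(default_factory=list)


def render_user_prefs(prefs: FilterPrefs) -> str:
    """Render the choices as SpamAssassin configuration lines."""
    lines = [
        f"required_score {prefs.required_score:.1f}",
        f"use_bayes {1 if prefs.use_bayes else 0}",
    ]
    # Scoring a rule at 0 disables it for this user.
    lines += [f"score {rule} 0" for rule in prefs.disabled_dnsbl_rules]
    lines += [f"whitelist_from {addr}" for addr in prefs.whitelist]
    return "\n".join(lines) + "\n"
```

Whether each directive is honoured per user depends on how SpamAssassin is invoked and on the site-wide configuration, so this is a starting point rather than a drop-in component.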
Of particular interest was a series of modifications I applied to Plesk to allow 1) mail scoring over a certain threshold to be moved to a separate mail directory, and 2) webmail users to feed mail to the Bayes database via 'report as spam' style buttons.
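The threshold idea can be sketched in a few lines of Python. This is not the Plesk modification itself, just an illustration that assumes SpamAssassin has already tagged the message with its usual `X-Spam-Status` header (`score=` in current versions, `hits=` in older ones). The folder names and the default threshold of 10 are hypothetical.

```python
import re
from email.message import Message
from typing import Optional

# Matches "score=12.3" (or the older "hits=12.3") in X-Spam-Status.
SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def spam_score(msg: Message) -> Optional[float]:
    """Extract the SpamAssassin score, or None if the message is untagged."""
    status = msg.get("X-Spam-Status")
    if status is None:
        return None
    match = SCORE_RE.search(status)
    return float(match.group(1)) if match else None


def choose_folder(msg: Message, threshold: float = 10.0) -> str:
    """Pick a delivery folder: high-scoring mail goes to a separate directory."""
    score = spam_score(msg)
    if score is not None and score >= threshold:
        return "Junk"   # hypothetical name for the separate mail directory
    return "INBOX"
```

Untagged mail is delivered normally, so a scanner outage fails open rather than hiding legitimate messages.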
If you have a mail server that you'd like me to take a look at, feel free to get in touch; I've yet to come across a server on which I was unable to make a significant impact on the amount of spam being received by end users.
pete@linuxbox.co.uk
Linuxbox.co.uk