08/12/05

More anti-spam coding

Permalink 07:42:19 pm, Categories: Programming, 379 words  

Since the original "Junk Removed From Logs" post, I've still had spam referers getting through, and I've had some comment spam as well.

Although it's always going to be an uphill battle, I'm willing to keep blocking different keywords if it means less junk in the logs and comments.

Some of the words I've blacklisted so far include (but are not limited to):

  • realty
  • prescription
  • mortgage
  • medications
  • pharmacy
  • texas-holdem and texas-hold-em
  • brand names I know, and things that are probably brand names that I've never heard of, e.g. cialis, viagra, amoxil, meridia, prilosec, propecia, prozac and phentermine

This list might block some valid referers, but I'd prefer that over referer spam in my logs. The blacklist has also been applied to the URL field of the comment box, so anyone who enters a blacklisted URL there will get a "sorry, the URL is potential spam" message when they try to submit the comment. That's the safer of the two filters: although someone might blog "how to stop texas-holdem spam" and have it in a referer address, they're less likely to have it in the real URL that they enter when making a comment.
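As a rough illustration of the idea (not the blog's actual code), here is a minimal Python sketch of one keyword blacklist shared by the referer log and the comment URL field. The function names and the way the keywords are combined into a single case-insensitive regex are assumptions made for this example; the keywords and the error message are the ones mentioned in this post.

```python
import re

# Keywords named in this post; the real list is longer ("not limited to").
BLACKLIST = [
    "realty", "prescription", "mortgage", "medications", "pharmacy",
    r"texas-hold-?em",  # covers both texas-holdem and texas-hold-em
    "cialis", "viagra", "amoxil", "meridia", "prilosec",
    "propecia", "prozac", "phentermine",
]

# One case-insensitive pattern that matches any keyword anywhere in a string.
SPAM_PATTERN = re.compile("|".join(BLACKLIST), re.IGNORECASE)


def is_spam_url(url):
    """Return True if the URL contains any blacklisted keyword."""
    return bool(url) and SPAM_PATTERN.search(url) is not None


def should_log_referer(referer):
    """Hypothetical hook: referers that match the blacklist are not logged."""
    return not is_spam_url(referer)


def check_comment_url(url):
    """Hypothetical hook: return an error message for a spammy URL, else None."""
    if is_spam_url(url):
        return "sorry, the URL is potential spam"
    return None
```

Because this is a plain substring match, a legitimate referer such as a post about "how to stop texas-holdem spam" would be dropped from the logs too, which is the trade-off described above.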

Now to leave it a while and see how the extra keywords go, adding more when I find what gets through :)

Update (the next day): Last night's bombardment of spam (including some comment spam) appears to have been fought off without a single casualty! Not a single comment or referer got through that shouldn't have :) Hopefully the comparative lack of successful spammage will continue.

Update (11th December): Two nights and only two bits of spam got through :) One was from a completely impossible-to-detect domain with no giveaway page (hoodia.op-clan.com) and the other had what appears to be another brand name, which is now getting blocked. Approximately a dozen referer spams have been blocked, though, possibly a few more.

Update (12th December): One more comment got through last night, although it had no URL in it anywhere, just lots of variations on "buy phentramine" or something. One more referer got through as well, but again it used a drug brand name, so it won't get through a second time. Soon enough I'll know all of the drug brand names and have them all blacklisted!

Comments, Trackbacks:

Trackback from: IBBoard [Visitor]
Antispam working well
Since the original Anti-spam coding post about my new regex filters, it has been working very well. I've added a few more brand names to the regex as they've sneaked through, and a filter on the comment text so that words such as 'incest' don't get thr...
Permalink 02/01/06 @ 11:55
