Wiki Spam, saga continues

Ever since I implemented the ~SpamFilter module for JSPWiki, the WikiSpam situation has improved dramatically. It works in two ways. First, it checks the submitted text against a list of regular expressions (typically domain names, but the list is user-editable). This is what most blacklists do. Second, it limits how many pages a user can edit in one minute: if the user submits more than X edits within that minute, their IP address is automatically blacklisted for a limited period of time.
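
The two checks described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual SpamFilter source: the class name SpamCheckSketch, the check method, and the constructor parameters are all hypothetical, and the choice to reject the edit that trips the rate limit is a guess.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch of a pattern blacklist plus a per-IP edit rate limit. */
class SpamCheckSketch {
    /** Page the rejected user is sent to (name taken from the post). */
    static final String REJECTED_PAGE = "RejectedMessage";
    private static final long ONE_MINUTE_MS = 60_000L;

    private final List<Pattern> blacklist;   // user-editable patterns, typically domain names
    private final int maxEditsPerMinute;     // the "X" from the text
    private final long banMillis;            // length of the temporary ban (assumed configurable)
    private final Map<String, Deque<Long>> recentEdits = new HashMap<>();
    private final Map<String, Long> bannedUntil = new HashMap<>();

    SpamCheckSketch(List<Pattern> blacklist, int maxEditsPerMinute, long banMillis) {
        this.blacklist = blacklist;
        this.maxEditsPerMinute = maxEditsPerMinute;
        this.banMillis = banMillis;
    }

    /**
     * Returns null if the edit is accepted, or the name of the page to
     * redirect to if it is rejected. The caller passes the current time
     * in milliseconds so the logic is easy to test.
     */
    synchronized String check(String ip, String text, long now) {
        // 1. An IP that is still banned is rejected outright.
        Long until = bannedUntil.get(ip);
        if (until != null) {
            if (now < until) {
                return REJECTED_PAGE;
            }
            bannedUntil.remove(ip); // ban has expired
        }

        // 2. Reject text that matches any blacklisted pattern.
        for (Pattern p : blacklist) {
            if (p.matcher(text).find()) {
                return REJECTED_PAGE;
            }
        }

        // 3. Count this IP's edits within the last minute.
        Deque<Long> edits = recentEdits.computeIfAbsent(ip, k -> new ArrayDeque<>());
        while (!edits.isEmpty() && now - edits.peekFirst() >= ONE_MINUTE_MS) {
            edits.removeFirst();
        }
        edits.addLast(now);
        if (edits.size() > maxEditsPerMinute) {
            // More than X edits in a minute: ban the IP for a while.
            bannedUntil.put(ip, now + banMillis);
            edits.clear();
            return REJECTED_PAGE;
        }
        return null;
    }
}
```

With a limit of two edits per minute, the third edit inside the same minute gets the IP banned, and every later attempt from that address fails until the ban runs out. Other addresses are unaffected.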

If the user is blacklisted or submits a blacklisted URL, they get redirected to a page called "RejectedMessage", which explains why the edit was rejected. Most bots (and clueless spammer slaves, working in Brazil or China or wherever, and submitting spam manually) will keep trying to edit this page, but since they are already blacklisted, they'll keep failing.

In addition, all the non-current revisions of pages have the Google rel=nofollow attribute set, so any WikiSpam that ends up in the repository has no impact on search engine rankings. The spam is relatively trivial to remove as well, since a single spammer usually manages only four or five changes to the site before getting blacklisted. They want to work fast to spam as much as possible, and this system forces them to work slowly.
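
The nofollow part might look something like this when a link is rendered. Again, this is a sketch and not JSPWiki's rendering code: NofollowLinkSketch and renderExternalLink are made-up names, and the currentRevision flag is assumed to be available at render time.

```java
/** Hypothetical sketch: add rel="nofollow" to links on non-current page revisions. */
class NofollowLinkSketch {
    /** Renders an external link; old revisions get rel="nofollow". */
    static String renderExternalLink(String url, String text, boolean currentRevision) {
        String rel = currentRevision ? "" : " rel=\"nofollow\"";
        return "<a href=\"" + escape(url) + "\"" + rel + ">" + escape(text) + "</a>";
    }

    /** Minimal HTML escaping for attribute values and text. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }
}
```

Search engines that honor rel=nofollow give such links no ranking credit, so spam that survives only in the page history earns the spammer nothing.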

Of course, all this means that RejectedMessage has become the most accessed page in the history of JSPWiki. That's fun.


"Main_blogentry_270505_2" last changed on 27-May-2005 12:42:38 EEST by JanneJalkanen.
