Spam is one of the major pitfalls of the social web. According to sites such as Postini, 10 out of 12 email messages are spam. As if that weren't already enough to make you cringe, 1 in 39 emails contains a virus. Spam is penetrating other regions of the Internet as well. The creators of the blogging software WordPress report that nearly 87% of all blog comments are also spam. As messaging and communication applications proliferate throughout the web, developers and site owners have to get creative in the fight against the thousands upon thousands of unwanted messages streaming in every day. Deciding on the best method of spam prevention for your blogs, forums, or even contact forms can be difficult. In this article we will take a look at a service called Akismet and how it can help. We will also look at why some other methods of fighting spam fail.
Methods of Fighting Spam
Disallowing multiple consecutive submissions. Spammers almost always post more than one spam comment or message at a time. A common method for fighting spam is to log each incoming message with the user's IP address and a timestamp of the post. Then, when a user attempts to post multiple comments, you can check whether the user has posted more than once within a specified window of time, for example 30 seconds, or whether the current poster was also the last poster. This is not a bulletproof method, because spammers can use proxies when they want to post multiple times, and robots have all the time in the world to spam your site.
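As a rough illustration of the time-window check described above, here is a minimal sketch. The function name, the in-memory $log array (standing in for a database table), and the example IP address are all assumptions made for this example; it uses PHP 7 type declarations.

```php
<?php
// Minimal sketch: throttle repeat submissions per IP address.
// $log maps an IP address to the Unix timestamp of its last accepted post.
// In a real project this would be a database table; the array and the
// function name are illustrative assumptions, not part of any library.

function isPostingTooFast(array $log, string $ip, int $now, int $windowSeconds = 30): bool
{
    if (!isset($log[$ip])) {
        return false; // first post seen from this address
    }
    return ($now - $log[$ip]) < $windowSeconds;
}

$log = [];
$ip  = '203.0.113.7'; // example address
$now = 1000000;       // in production: time()

if (!isPostingTooFast($log, $ip, $now)) {
    $log[$ip] = $now; // accept the post and remember when it arrived
}

// A second attempt ten seconds later falls inside the 30-second window.
$blocked = isPostingTooFast($log, $ip, $now + 10);  // true
// Once the window has passed, the same address may post again.
$allowed = !isPostingTooFast($log, $ip, $now + 45); // true
```

A spammer rotating through proxies arrives with a different IP address each time, so every lookup misses and the check never triggers, which is exactly the weakness noted above.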

Keyword Blacklist. Another method of fighting spam is to build a blacklist of common spam keywords yourself and to disallow posts that contain those words. In its simplest form, you can create an array of keywords and check whether an incoming string contains any of them. Spammers have evolved defenses against this method by posting variations of the words. They replace letters with numbers, symbols, and other such characters to create a broad selection of keyword variations.
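The simplest form mentioned above might look something like the following sketch. The word list and the function name are made up for illustration; the only built-in used is PHP's case-insensitive stripos.

```php
<?php
// Minimal sketch of a keyword blacklist check.
// The function name and the word list are illustrative assumptions.

function containsBlacklistedWord(string $message, array $blacklist): bool
{
    foreach ($blacklist as $word) {
        // stripos() is case-insensitive; skip empty entries defensively.
        if ($word !== '' && stripos($message, $word) !== false) {
            return true;
        }
    }
    return false;
}

$blacklist = ['viagra', 'casino', 'cheap pills'];

$caught = containsBlacklistedWord('Visit our online CASINO today!', $blacklist); // true
$clean  = containsBlacklistedWord('Great article, thanks!', $blacklist);         // false
// The weakness described above: a trivial character swap slips through.
$evaded = containsBlacklistedWord('Visit our online c4sino today!', $blacklist); // false
```

The last call shows why this approach struggles on its own: every variation has to be anticipated and added to the list by hand.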
CAPTCHA. CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is one of the most common spam prevention techniques on the web today. The technique is very useful, and almost any site that allows you to register for an account or post information publicly uses CAPTCHA in one way or another. CAPTCHA tests can be audio files, but are more commonly images presenting a series of letters and numbers that you have to enter into a form. The technique is a useful tool for blocking robots that attempt to visit your site to post spam messages or create fake accounts with fake information.
CAPTCHA works well for its intended use, but there are minor drawbacks. A CAPTCHA requires (yet another) field for users to fill in after entering usernames, passwords, and security questions. There is understandably an annoyance factor accompanying their use. In addition, disabled users may not be able to use the CAPTCHA field. Finally, human spammers can also still spam your site because a CAPTCHA only blocks out robot spammers.

So What's Left?
Having reviewed some of the current methods and their weak points, you may be wondering what else we can do to protect our blogging applications. I would like to introduce a new spam-fighting tool from the creators of WordPress. The service is called Akismet and is described by its creators as a "... collaborative effort to make comment and trackback spam a non-issue and restore innocence to blogging, so you never have to worry about spam again." The tool can be implemented in any project as long as you have an API key, which is free for non-commercial use and can be purchased for commercial use for as little as $5 a month. There are several Akismet plugins for existing software, and these are identified later in this article. Alternatively, you can include the service in your own projects as we will demonstrate.

Implementing Akismet in your Own Projects
As of now the only way to receive an API key is to sign up for a free WordPress.com user account. Point your browser to http://wordpress.com/signup/ and fill out the normal required fields (username, password, and email), then read and agree to the terms of service agreement. Make sure that you register for a blog, as you cannot receive an API key without the registration. Don't worry about this detail, because the API key won't be tied to a specific blog. Once you have finished the registration process you should receive an email with your new API key.

You will now need to download and unzip PHP5Akismet.0.4.zip (24K) from Achingbrain. Upload the single PHP file to an area accessible by your scripts. The other files and documentation are just for reference.
We will assume that you are working with an existing project. This could be anything that allows user contributions such as a forum or blog. We will also assume that the logic for creating and displaying content already exists. With that in mind, our first step is to load the file into our own project.
include "path/to/file/Akismet.class.php";
Next we will need to create a new instance of the Akismet class. Using the class's constructor, we pass the URL of the site using the service and our API key, in that order. Make sure to replace the following data with your own.
$akismet = new Akismet( "http://myblog.com", "API KEY HERE");
Now the service needs the actual comment data that we want to check. In the following instance I am using some example data, but in production the comment information would come from POST data. The Akismet service will then compare the comment information to a database of more than 7,486,928,953 spam comments and report whether the submitted post has been identified as spam.
$akismet->setCommentAuthor("Justin Shreve");
$akismet->setCommentAuthorEmail("test@test.com");
$akismet->setCommentAuthorURL("http://serenelabs.com");
$akismet->setCommentType("forums");
$akismet->setCommentContent("I really agree with what you are saying! I can't believe I never thought of that before!");
The functions presented here are quite straightforward. The only function that requires some further explanation is the setCommentType function. This is used by Akismet to help the service identify the origin of the comment (was it posted on a public newsgroup, forum, or blog?), and you can pass any argument you want. For example, if you are using the function to spam-proof a wiki, then use wiki as the type. If you are protecting a blog, then use a blog type.
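In production the example values would be replaced with submitted form data. A minimal sketch of that step follows, using only the setter methods shown above. The helper name and the form field names ('author', 'email', 'url', 'comment') are assumptions; adjust them to match your own form. The syntax requires PHP 7.1 or later.

```php
<?php
// Sketch: copy submitted form fields onto an Akismet object.
// The setter names are the ones used in this article; the helper name and
// the form field names are illustrative assumptions.

function fillAkismetFromForm($akismet, array $form, string $type = 'blog'): void
{
    // Missing fields fall back to an empty string.
    $akismet->setCommentAuthor(trim($form['author'] ?? ''));
    $akismet->setCommentAuthorEmail(trim($form['email'] ?? ''));
    $akismet->setCommentAuthorURL(trim($form['url'] ?? ''));
    $akismet->setCommentType($type);
    $akismet->setCommentContent($form['comment'] ?? '');
}

// Usage in a form handler, with $akismet created as shown earlier:
// fillAkismetFromForm($akismet, $_POST, 'forums');
```

Keeping this in one helper means the same field mapping can be reused later when a stored comment needs to be sent back to the service.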
Now we will use a function called isCommentSpam. This is the function that actually contacts the service. The boolean function will return true if the comment is identified as spam and false if the comment is verified as legitimate.
if( $akismet->isCommentSpam() )
{
    // Here we can store logic to deal with spam comments.
    // Usually we can store the comment internally for later reference just in case the service makes a mistake.
}
else
{
    // This is where you would insert the content into the database.
}
Using Akismet is as simple as these few lines of code! You have now integrated a spam-fighting service into your site. The service can be used in conjunction with the other forms of spam defense mentioned earlier. Keep in mind that Akismet is a service that improves each time you use it, because the functions contribute your spam content to its database. Valid messages may sometimes be identified as spam, and vice versa. As a result, we may want to integrate a little more functionality to deal with potential misidentification.
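Since misidentification is possible, one option is to keep every comment and simply tag it with a status, so flagged comments can be reviewed rather than lost. The following is only a sketch of that idea: the function name, the status labels, and the $store array (standing in for your database table) are assumptions. The only library call used is isCommentSpam() from the example above.

```php
<?php
// Sketch: keep every comment, but tag it with a status so that flagged
// comments can be reviewed later instead of being thrown away.
// The status labels, the $store array (standing in for a database table)
// and the function name are illustrative assumptions.

function fileComment($akismet, array $comment, array &$store): string
{
    $status = $akismet->isCommentSpam() ? 'held_as_spam' : 'approved';

    // Save the comment together with the verdict it received.
    $store[] = array_merge($comment, ['status' => $status]);

    return $status;
}

// Usage, with $akismet already populated with the comment's data:
// $status = fileComment($akismet, $commentData, $commentTable);
// Only rows with status 'approved' would then be displayed publicly.
```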
If a message is wrongly identified as spam, then you can notify Akismet, and they will deal with it accordingly. Likewise, you can mark a comment as spam if it happened to fall through the Akismet filter. When implementing the following functionality, make sure that the comment data in the variables is set in the same format as above.
The function
$akismet->submitHam();
notifies the service that a comment it flagged as spam is actually legitimate.
The function
$akismet->submitSpam();
notifies the service that a comment it approved is actually spam.
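Putting the two feedback calls together, a moderation screen could use something like the sketch below. It relies only on the setter, submitHam, and submitSpam methods shown in this article. The helper name and the array keys of the stored comment ('author', 'email', 'url', 'type', 'content') are assumptions. The syntax requires PHP 7.1 or later.

```php
<?php
// Sketch: report a misclassified comment back to the service.
// Method names come from this article; the helper name and the array keys
// of $comment are illustrative assumptions.

function reportMistake($akismet, array $comment, bool $isReallySpam): void
{
    // Re-populate the same fields that were sent for the original check.
    $akismet->setCommentAuthor($comment['author']);
    $akismet->setCommentAuthorEmail($comment['email']);
    $akismet->setCommentAuthorURL($comment['url']);
    $akismet->setCommentType($comment['type']);
    $akismet->setCommentContent($comment['content']);

    if ($isReallySpam) {
        $akismet->submitSpam(); // spam that slipped through the filter
    } else {
        $akismet->submitHam();  // legitimate comment that was flagged
    }
}

// Usage from a moderation screen, with $row loaded from your own storage:
// reportMistake($akismet, $row, true);  // moderator clicked "this is spam"
// reportMistake($akismet, $row, false); // moderator clicked "not spam"
```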
Other Libraries
PHP5 isn't for everyone. Akismet libraries have also been created in a slew of other languages. Below are a few of the most popular:
All of these can be easily integrated into your projects in much the same way as described above.
Popular Implementations
Don't feel the need to roll your own software but still want to make use of Akismet? Many solutions already exist for blog, CMS, or forum software:
- Invision Power Board
- phpBB
- Drupal
- vBulletin
- ExpressionEngine
- and of course WordPress by default!
Closing
I hope that this guide will serve as an introduction to some alternative forms of spam combat. A site without spam not only appears more professional to users, but is also much easier to manage for administrators and moderators.
User Comments
insic October 6th
perfect! currently im fighting spam of my website. lolz.
That Namelezz Guy October 6th
Oh yes, the good ol’ Akismet. Really helped me way back when I still had my old blog. Great article, if and when I start another forum or blog I’ll keep this article bookmarked if I need spamfiltering.
Roshan October 6th
cool, nice post. I have also implemented Akismet plugin on my website.
Thanks for additional info.
Roshan
BroOf October 6th
Nice and useful tips. Perfect for me thanks!
BroOf
graphic design October 6th
Spam is indeed a huge problem that is costing million (if not billions) of dollars to corporate business the world over. In a survey done in 2007 it was revealed that 95% of all emails sent in 2007 were spam, I couldn’t believe it 95% seems a staggering amount.
Personally I wonder why we email servers actively help spammers. Ever sent an email to an incorrect address? I know I have and I know this because the email server response advising that this email address is incorrect. Well why that is indeed helpful for myself, it is extremely useful for spammers as well. Spammers lists are gathered for directories, paid submissions services, and guess work. And how do they keep their spam lists update. Well I don’t know for certain but I can guess that email servers are playing their part! Serious jail time should be implemented in my opinion, really it is a form of cyber attack on businesses, especially those filled virus and the like. Prehaps I am going over the deep end but its about time people in authority give this the attention it deserves!
Kevin Quillen October 6th
Akismet is really good. I never get any spam on my website anymore.
James October 6th
Aksimet is awesome but it does let a few non-spam comments slip through sometimes - I’m glad it has a good feedback system!
Jon October 6th
Thanks for the article, but you missed the whole 2nd half of the CAPTCHA acronym. It’s actually “Completely Automated Public Turing test to tell Computers and Humans Apart” as defined by Wikipedia.
Lamin Barrow October 6th
Great article. Aksimet is a very popular spam fighter and it works. I will implement the .Net library and see how it works outside of Wordpress.
Thanks for the post.
Jim Neath October 6th
http://snook.ca/archives/other/effective_blog_comment_spam_blocker/
Snook’s way of combating spam is pretty good.
Eric October 6th
Good article, but I definitely wouldn’t call Akismet “new”…
Bart October 6th
Mollom (http://mollom.com/) is also a great way to fight spam.
David Dashifen Kees October 6th
Image based CAPTCHA systems create problems for the visually impaired since solutions that would read them the characters in the image would also provide an attack vector for a spam bot. Thus, I’ve found some success with a two-fold system: an arithmetic CAPTCHA and a honeypot.
The arithmetic CAPTCHA is a simple addition or subtraction problem. I generate two random numbers between zero and nine and then randomly select between addition and subtraction. Put the arithmetic problem on-screen and ask the visitor to solve it storing the two numbers and the operation to be calculated on the server-side and compared against the visitor’s answer. This one isn’t foolproof, though, as I’ve gotten some spam through systems like this.
Thus, a honeypot field which is printed in the form and then hidden with CSS. Since the field is hidden, people won’t be able to put anything in the field. But, spam bots don’t usually come with a full CSS interpreter so they’ll still “see” the field and many will put something in it simply to fill in all fields. But, the server-side process for your form should refuse to send any message with content in the honeypot. Thus, a bot filling in the field is the _opposite_ of what a spammer would want to do in order to get their message sent.
Unfortunately, this also has some problems with respect to universal accessibility: screen readers may still read CSS hidden content to a visually impaired visitor. Thus, it’s important to include some instructions regarding how not to use the honeypot in the label for the field which will ensure that screen reader users at least know what they shouldn’t be doing. It may be an inconvenience, but hopefully they’ll respect the need to combat spam.
The benefit of these is that they don’t require plug-ins so if you’re working within a structure that doesn’t allow for such things, you can implement these two reasonably simply in the language of your choice.
stelabouras October 6th
For spam protection, I prefer using reCaptcha API. Simple and effective!
Aaron Irizarry October 6th
Askimet has been very helpful in monitoring spam on the company blog I manage. Before using it i had to go into the comment moderation section and mark each spam comment, it was long and tedious.
Great article I am sure many will find it useful.
~ Aaron I
Marc Grabanski October 6th
Ask a super simple question that bots cannot answer. “Is fire hot or cold?” is what I ask on my website. I have not received a spam comment yet after using this method.
Nate October 6th
Thanks! I get like crap comments every day. This will be a big help.
ignite October 6th
Just a note about wordpress.com. Once you create an account, you can’t delete it. No where does it say this but I’ve tried. Not until you try to do you find out that they don’t want you to. Just FYI.
James October 6th
I agree with David, the traditional CAPTCHA presents major usability issues and to be honest I will always opt for the arithmetic approach.. just seems nicer (for the user)…IMO
Aditya October 6th
I love Mr. David Kees’s solution. Definitely, it will work. I don’t see any potential pitfall in that method at all! Thank you, Mr. Kees.
Ben Griffiths October 6th
Some great stuff here, many thanks
Gyorgy Fekete October 6th
Check out my 5 security layer spam filter. It uses Akismet too and a lots of other goodies and the best of it doesn’t use CAPTCHA.
You can check it out at: http://blog.primalskill.com/?p=268
Anthea October 6th
I use David’s solution, as was recommended by one of the developers where I work.
Not sure if “don’t fill me” instructions in the label for your “honeypot” (nice name btw) would be effective?
I’ve previously called mine something like “comment” or “message” to attract the spambots.
Jordan October 6th
You may want to note that MyBB also comes by default with an Akismet plugin.
Evan October 6th
I had recent success with the honeypot form field. It seems to work well enough to protect against the random bot spam.
martin October 6th
i want to know why youre post image for this article is a fragment of what looks like Polish text…
Justin Shreve October 6th
I think I was just searching for a public domain image of some text (with an effect like the one above) to use as a post image.
Shane October 7th
Akismet’s been a real life saver for me, but the accessibility issues of CAPTCHA are well documented, and for those reasons, I avoid it.
Good list though, thanks for posting.
Stefan October 7th
How about a hidden captcha? …A normal captcha system, only it gets completed by a JavaScript. Then the captcha image and text field are hidden by the same JavaScript.
If it’s a spambot or a regular user without JavaScript they’ll get asked to complete the captcha. Otherwise they won’t be bothered.
I wrote an article about in on http://www.vileworks.com/using-captcha-without-displaying-it
crysfel October 7th
this is so usefully!! thanks
Mark October 7th
Defensio FTW! It’s way better than Akismet and Mollom.
Rahul October 8th
I’ve used Akismet together with Maths spam checker due to which i do not have to spend hours deleting spam. Before that I used to get more than 1000 spam per day. Now only few, just one or two, gets through and which falls into Akismet
So, all the hard work is done by the good plugins.
Christian October 10th
I’m using the following system:
I’v added a additional hidden field, where i wrote “You mustn’t type in here something”. Most of the Bots is writing something in every field, and since i’m using this method, i’ve no problem with Spam!
Jammin October 15th
Nice post, I’ve just installed CAPTCHA onto a clients website due to the spam becoming a bit uncontrollable and it works perfectly. Cheers!
ddcm October 16th
please check otherinbox