WebAIM Blog

Spam-free accessible forms

March 7, 2007

There has been much discussion lately about how to prevent spambots from submitting forms on web sites. Many solutions have been presented, but many of them harm the usability and accessibility of the web page. CAPTCHA is a classic case where both the user experience and accessibility are directly impacted.

Note

A Brazilian-Portuguese translation of this blog entry is available at http://www.maujor.com/tutorial/spam-em-formularios.php.

Over the last year or so I have compiled the following basic techniques for blocking spam submission in web forms. I’ve implemented just a couple of these and, through logging, have found that they reduce spambot submissions by around 99% while having little or no impact on the usability or accessibility of the forms. Nearly all of these techniques are performed server-side using PHP, and the relevant PHP code is shown below. However, the tests can be readily implemented in nearly any server-side scripting language.

Disclaimer 1: These spam prevention techniques may not work for enterprise-level applications where spammers may target forms specifically. They are intended for generic contact, comment, or registration forms where a spammer is less likely to take the time to try to bypass your specific spam prevention mechanisms.

Disclaimer 2: These techniques primarily stop bots and automated spam submission programs. They also can filter certain content. However, they likely will not prevent an actual dedicated human from posting spam to your web site.

The techniques are:

  • Detect spam-like content within submitted form elements
  • Detect content within a hidden form element
  • Validate the submitted form values
  • Search for the same content in multiple form elements
  • Generate dynamic content to ensure the form is submitted within a specific time window or by the same user
  • Create a multi-stage form or form verification page
  • Ensure the form is posted from your server

Detect spam-like content within submitted form elements

This technique is likely the most powerful spam prevention technique. Most spambots exist either to post URLs of web sites, in an effort to increase traffic or search engine ranking, or to hijack your form to send spam messages to you or others. Detecting commonly used spam content or e-mail header injections will stop nearly all spambots dead in their tracks.

The following PHP code, when placed on your form processing page (the place where the form is submitted to), will search all of the form elements for the most common header injections and other code that may trick your mail processor into sending carbon copy or blind carbon copy messages to others. It also detects any content that includes the string “[url” which is used by most forum software to specify links. If any are found, it sets the $spam variable to true.

if (preg_match( "/bcc:|cc:|multipart|\[url|Content-Type:/i", implode($_POST))) {
    $spam=true;
}

NOTE: Internet Explorer 6 has a bug that will not allow proper overflow of preformatted text. If you are still using that browser, you will need to properly reflow the PHP code lines from this page.

You can also detect links and urls within the form elements. The following will set the $spam variable if more than 3 instances of “<a” or “http:” appear anywhere within the form.

if (preg_match_all("/<a|http:/i", implode($_POST), $out) > 3) {
    $spam=true;
}

This will defeat most spambots as they primarily focus on posting links or hijacking your mail script. Beyond this, some very basic word filtering can often catch spam that finds its way through.

$spamwords = "/(list|of|naughty|spam|words|here)/i";
if (preg_match($spamwords, implode($_POST))) {
    $spam=true;
}
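The three content checks above can be folded into a single helper that also reports which rule fired, which makes logging and error messages easier. This is only a sketch: the function name `detect_spam_content`, the reason labels, and the `$maxLinks` parameter are made up for illustration. It also joins the fields with a newline rather than an empty string, so a pattern cannot match across two adjacent fields.

```php
<?php
// Hypothetical helper: returns a list of reasons the submission looks like spam.
// An empty list means none of the content rules matched.
function detect_spam_content(array $fields, array $spamWords = [], int $maxLinks = 3): array
{
    // Join only scalar values so nested arrays (e.g. checkbox groups) are skipped.
    $text = implode("\n", array_filter($fields, 'is_scalar'));
    $reasons = [];

    // Mail header injection and forum-style [url links.
    if (preg_match('/bcc:|cc:|multipart|\[url|Content-Type:/i', $text)) {
        $reasons[] = 'header-or-bbcode';
    }
    // More than $maxLinks anchors or URLs.
    if (preg_match_all('/<a|http:/i', $text) > $maxLinks) {
        $reasons[] = 'too-many-links';
    }
    // Optional word list, quoted so each word is matched literally.
    if ($spamWords) {
        $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $spamWords);
        if (preg_match('/(' . implode('|', $quoted) . ')/i', $text)) {
            $reasons[] = 'spam-word';
        }
    }
    return $reasons;
}
```

On the processing page this might be used as `$spam = count(detect_spam_content($_POST, $myWords)) > 0;`, where `$myWords` is your own word list.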

You can also use external spam detection services with up-to-date patterns of spam content. My favorite is Akismet. Akismet is commonly used for filtering spam on blog comments (it has blocked nearly 14,000 spam comments to this blog in the last 9 months!), but it can be used successfully for nearly any web form.

Detect content within a hidden form element

Most spambots will find your form, determine what the form element names are, and find the URL where the form is posted to. The software will then post those form elements with modified, spam-filled values back to the form submission URL. Typically, the bot will populate every form element with some value so as to best ensure that it will succeed in being posted. So, if you insert a standard text input element into your form, but hide it visually from the user so the user cannot enter anything into this field, it is quite likely that the spambot will still post some value for this form element. If you detect that the form element is submitted with a value, then it’s almost certainly a spambot.

For instance, your form element may be inserted as

<span style="display:none;visibility:hidden;">
<label for="email">
Ignore this text box. It is used to detect spammers.
If you enter anything into this text box, your message
will not be sent.
</label>
<input type="text" name="email" size="1" value="" />
</span>

Notice that CSS is used to hide the text input and its label from view. This code will also hide this content from modern screen readers. However, if CSS is disabled, the input will still be displayed. For this reason, an explanatory label is provided that informs the user to not enter anything into the text box. I also gave the input a nice, juicy, tempting element name of “email” - that’s almost certain to get the spambots to enter a value.

You then simply detect if the form element is empty. If it is not, then it’s either a spambot or a user that has CSS disabled and did not follow the label instructions.

if(!empty($_POST['email'])) $spam=true;

This tactic, like all of those listed here, should still present a useful, informative error message in case the user somehow triggers your spam detection flag.
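As a rough sketch of what "a useful, informative error message" could look like, the processing page can branch on the `$spam` flag before doing any real work. The function name `render_result` and the `name` field are assumptions for illustration; only the hidden `email` field comes from the example above.

```php
<?php
// Hypothetical helper: decide what the processing page should show.
// $spam is the flag set by the checks; $fields is the submitted data.
function render_result(bool $spam, array $fields): string
{
    if ($spam) {
        // Explain what happened and how to recover, rather than failing silently.
        return '<p>Your message appears to be spam and was not processed. '
             . 'Please remove links and code from your message and resubmit the form.</p>';
    }
    // Escape user input before echoing it back.
    $name = htmlspecialchars($fields['name'] ?? '', ENT_QUOTES, 'UTF-8');
    return '<p>Thank you, ' . $name . '. Your message has been sent.</p>';
}

$spam = !empty($_POST['email']);   // the hidden-field check from above
echo render_result($spam, $_POST);
```

In a real page the success branch would also send the mail or store the data; the point is simply that nothing is processed when the flag is set.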

Validate the submitted form values

This one perhaps goes without mentioning, but if you want certain form elements to be required, ensure that you are using a server-side script to detect if information has been entered into those form fields. If you require form information in a particular format (such as requiring a valid e-mail address), then validate it. Many bots will simply submit empty information for fields they do not recognize or will submit random information for certain form fields. Your standard form validation mechanisms can stop many spambots.

// If the message is empty, throw an error
if(empty($_POST['message'])) $error=true;
// If the e-mail is not formatted correctly, show an error message.
// preg_match() with the /i flag replaces eregi(), which was removed in PHP 7.
if(!preg_match('/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/i', $_POST['email'])) {
    echo "Please enter a valid email address.";
}
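Hand-written e-mail patterns tend to reject legitimate addresses (for example, ones containing a "+"). One alternative, sketched here, is PHP's built-in `filter_var()` with `FILTER_VALIDATE_EMAIL`. The function name `validate_contact_form` and the error messages are illustrative assumptions; the `message` and `email` field names follow the example above.

```php
<?php
// Hypothetical helper: collect validation errors keyed by field name.
// An empty array means the submission passed both checks.
function validate_contact_form(array $post): array
{
    $errors = [];

    // Required field: reject empty or whitespace-only messages.
    $message = trim($post['message'] ?? '');
    if ($message === '') {
        $errors['message'] = 'Please enter a message.';
    }

    // filter_var() returns false when the address is not well formed.
    $email = trim($post['email'] ?? '');
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        $errors['email'] = 'Please enter a valid email address.';
    }
    return $errors;
}
```

Returning the errors per field, rather than echoing immediately, makes it easier to show each message next to the field that needs fixing.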

Search for the same content in multiple form elements

Some spambots will post the same text into all unrecognized form fields. If you have two form fields that should never contain the same information, you can detect if their values are indeed the same and if they are, you can flag an error. On our forum registration form, I found that simply throwing an error if the first and last names were the same cut down on bot registrations by around 80%. It’s not a perfect technique and you should ensure that the fields you analyze should always be unique (I guess there is still a chance that a person could have the same first and last name, huh?).

    if($_POST['firstname'] == $_POST['lastname']) $spam=true;
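A slightly more forgiving version of this comparison, sketched below, ignores case and surrounding whitespace and skips empty fields so that blank submissions are left to ordinary validation. The function name `has_duplicate_values` is an assumption for illustration.

```php
<?php
// Hypothetical helper: true when any two of the named fields hold the same
// non-empty value, ignoring case and surrounding whitespace.
function has_duplicate_values(array $post, array $fieldNames): bool
{
    $seen = [];
    foreach ($fieldNames as $name) {
        $value = strtolower(trim($post[$name] ?? ''));
        if ($value === '') {
            continue; // empty fields are left to ordinary validation
        }
        if (isset($seen[$value])) {
            return true;
        }
        $seen[$value] = true;
    }
    return false;
}
```

For the registration form described above, the call might be `$spam = has_duplicate_values($_POST, ['firstname', 'lastname']);`.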

Generate dynamic content to ensure the form is submitted within a specific time window or by the same user

By generating unique form elements or creating session variables, you can ensure that the person that visits your form page is the same one that submits the form. For instance, when a form is accessed, you could use server scripting to write the current time to a hidden form element. When the form is submitted, you can compare the hidden form value with the current time and ensure that no more than, say, an hour has elapsed. It is very unlikely that a spambot will generate a correct value for the time field. You can also set browser cookies or use other client sessioning systems to ensure that a user session is established and maintained between the form page and the submission page.

The following will write the current time in UNIX time format to a hidden form input.

<input type="hidden" name="formtime" value="<?php echo time(); ?>" />

When the form is submitted, you can measure the difference between the current time and the value stored within the form. If the time difference is more than a specified value, you can flag it as spam. In this example, if more than an hour (3600 seconds) has elapsed between the time the form was viewed and the time it was submitted, the spam variable is set. This code will also flag the message as spam if the formtime value has been changed to some other value, such as a URL or an e-mail address.

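// Flag the submission if formtime is missing, is not a whole number,
// or is more than an hour (3600 seconds) old.
if(!isset($_POST['formtime']) || !ctype_digit($_POST['formtime']) || $_POST['formtime'] < time()-3600)  {
    $spam=true;
}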

Create a multi-stage form or form verification page

By creating a multiple stage form process, most spambots will be unable to find the actual script that processes the final form data. This can be as easy as having the user verify their input after submitting a form and then selecting a second button to actually submit the form elements for processing. This can be made even more foolproof if the original form and the verification page are processed at the same URL. If the form element data is stored server side before the final verification step (rather than in hidden form elements that can be submitted by the spambot), it becomes very difficult for an automated system to submit the form.

if($formsubmitted == true) {
    // database the form elements and display the verification page.
    // If the user verifies the form information, then process the databased data.
}
else {
    // display the empty form
}
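The outline above could be fleshed out roughly as follows. This is a sketch under several assumptions: the function name `handle_request`, the `confirm` and `message` field names, and the returned labels are all invented here, and `$store` stands in for whatever server-side storage you use (`$_SESSION` or a database table).

```php
<?php
// Hypothetical two-step handler. $store stands in for server-side storage
// (in a real page, $_SESSION or a database table).
function handle_request(array $post, array &$store): string
{
    if (isset($post['confirm'], $store['pending'])) {
        // Step 2: the user confirmed, so process the data saved in step 1.
        $data = $store['pending'];
        unset($store['pending']);
        return 'processed:' . $data['message'];
    }
    if (isset($post['message']) && trim($post['message']) !== '') {
        // Step 1: keep the data on the server and show a verification page.
        $store['pending'] = ['message' => $post['message']];
        return 'verify';
    }
    // No submission yet: display the empty form.
    return 'form';
}
```

Because the confirmation step only acts on data already held on the server, a bot that posts `confirm` directly, without going through step 1, simply gets the empty form back.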

Ensure the form is posted from your server

Because most spambots post to your form script from a remote computer, by detecting if the form information has been submitted from your own web site, you can stop many spambots from submitting the form to your processing script. Most scripting programs can check the page referrer, or the page that was used to get to the current page. It’s important to note that it is quite easy for spambots to forge the referrer information to appear as if the form is coming from your web server. Also, some browsers and firewalls will not send the referrer header at all.

The following code will check to ensure that the page referrer (incorrectly spelled ‘referer’ in the HTTP spec and in PHP) exists, and if it does, that the referring page is on the same web site as the form processing script. For browsers or spambots that send no referrer information, the message is never flagged as spam.

if((isset($_SERVER['HTTP_REFERER']) && !stristr($_SERVER['HTTP_REFERER'],$_SERVER['HTTP_HOST']))) {
    $spam=true;
}
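A substring search with `stristr()` treats any referrer that merely contains your host name, for example in a query string, as if it came from your site. A stricter variant, sketched here, compares the referrer's host component exactly using `parse_url()`. The function name `referer_is_foreign` is an assumption for illustration.

```php
<?php
// Hypothetical helper: true only when a referer is present and its host
// differs from the host serving the form.
function referer_is_foreign(?string $referer, string $host): bool
{
    if ($referer === null || $referer === '') {
        return false; // no referer sent: do not flag
    }
    $refererHost = parse_url($referer, PHP_URL_HOST);
    if (!is_string($refererHost)) {
        return true; // present but not a parseable URL with a host
    }
    // Drop any port from the Host header before comparing.
    $host = preg_replace('/:\d+$/', '', $host);
    return strcasecmp($refererHost, $host) !== 0;
}

$spam = referer_is_foreign($_SERVER['HTTP_REFERER'] ?? null, $_SERVER['HTTP_HOST'] ?? '');
```

As with the simpler check, a forged or missing referrer still gets through, so this should be one signal among several rather than the only test.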

Conclusion

Preventing spam submissions to web forms is difficult work. However, when possible we should not place the burden of preventing spam on the end user through CAPTCHA or other Turing tests. Any time it becomes the user’s responsibility to somehow manually prove that they are a human, accessibility will be decreased. These techniques offer several methods of filtering out most form spambots without placing any burden on the end user.

I’m sure these are not all of the possibilities and it’s likely that there are flaws in my techniques above. If you have comments or better techniques, please post them below.

67 Responses to “Spam-free accessible forms”

  1. Joshue O Connor Says:
    March 7th, 2007 at 3:04 pm

    Thanks Jared, that’s a useful collection of techniques and it’s good to see them compiled in one place.

    Josh

  2. Ivan Pepelnjak Says:
    March 12th, 2007 at 2:52 pm

    Just in case someone figures out that you’re timestamping the submission form (or use any other dynamic content generation technique), generate MD5 or SHA hash of (dynamic content + server-side-secret), store it into a hidden field and check it on form submission (similar to what’s done in IPSec or any other digital signature method).

    Also make sure your dynamic content prevents replay attacks (resubmitting of the same content); timestamps are usually the best solution.
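The signed-timestamp idea in this comment could be sketched as follows. Note that it swaps in HMAC-SHA256 (PHP's `hash_hmac()`) for a plain MD5 or SHA hash of content plus secret. The function names, the `timestamp:mac` token format, and the `$secret` parameter are assumptions for illustration.

```php
<?php
// Hypothetical helpers; $secret is a server-side value never sent to the browser.
function make_form_token(string $secret, int $now): string
{
    // Token format: "<timestamp>:<hex HMAC of timestamp>"
    return $now . ':' . hash_hmac('sha256', (string) $now, $secret);
}

function check_form_token(string $token, string $secret, int $now, int $maxAge = 3600): bool
{
    $parts = explode(':', $token, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[0])) {
        return false; // malformed token
    }
    list($time, $mac) = $parts;
    // hash_equals() compares in constant time.
    if (!hash_equals(hash_hmac('sha256', $time, $secret), $mac)) {
        return false; // timestamp was altered or signed with another secret
    }
    $age = $now - (int) $time;
    return $age >= 0 && $age <= $maxAge;
}
```

The token would be written into a hidden field with `make_form_token($secret, time())` and checked on submission with `check_form_token($_POST['token'], $secret, time())`. On its own this only limits replay to the `$maxAge` window; rejecting a token that has already been used would need server-side storage.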

  3. Marc Says:
    March 18th, 2007 at 1:18 am

    Fine list of tips, thanks.

    I have found, however, that testing $_SERVER['HTTP_REFERER'] makes it impossible for users with Norton firewall, or similar software that blocks communication of the referer value, to fill out the form!

  4. Jared Smith Says:
    March 18th, 2007 at 8:12 pm

    As long as the firewall either blocks the HTTP_REFERER altogether or sends the correct referer, then there shouldn’t be a problem with this approach. This code only flags it as spam if the referer is present AND is not the same as the web site. However, if the firewall sends an incorrect header (which it should never do), then there might be problems. I need to do some more looking into this.

  5. Marc Says:
    March 19th, 2007 at 3:24 am

    Well, as I understand it, that makes the HTTP_REFERER approach pretty useless - a pirate isn’t going to send that, so the result will be the same as a firewall blocking that info. It’s really a shame, because otherwise this seemed to be the most elegant approach to this headache!

  6. Fred Says:
    March 20th, 2007 at 5:47 am

    A colleague pointed me to this article, and it is pretty useful - thanks for the tips, which are in essence simple but wouldn’t always occur to web admins. For instance, checking for “http://” in the submitted data, which is obvious in retrospect but it just hadn’t occurred to me before.

  7. PHPdummy Says:
    March 21st, 2007 at 6:22 pm

    sooo … what does one then do with $spam=true; ??
    at the risk of irritating those who know it all, i respectfully request a short sample of how this fits in with an overall script. we’re not all admins; some of us are self-taught n00bs who, at best, learn by copy/paste.

  8. Jared Smith Says:
    March 21st, 2007 at 8:49 pm

    That’s a good question PHPdummy. I’m actually working on an entire form processing script that includes these code options as well as several accessibility features for error recovery and validation. I hope to have things packaged up and will make them available soon.

    In short, if $spam is true, then you want to alert the user of the problem, identify the form fields that were problematic, provide easy access to those fields to allow the user to fix the problems, then allow them to re-submit the form. So, that’s not a super easy copy/paste type of thing. :-) But my forthcoming script will have all of this.

    A very basic solution would be to use PHP echo to display the error message on the same page as the form. Something like…
    if ($spam == true) {
    echo("<p>Your message appears to be spam and was not processed. Please remove all links, code, and other spam-like content from your message and resubmit the form.</p>");
    }

  9. Graham Says:
    April 2nd, 2007 at 7:29 am

    Will you be publishing the mail script on this blog?

  11. Woody Sabran Says:
    April 18th, 2007 at 5:58 am

    This is a really great article, I’m tempted to list it on Digg… would that be ok? (Don’t want the Digg effect to affect the site adversely!)

  12. Jared Smith Says:
    April 19th, 2007 at 4:12 pm

    Woody-

    I just added the article to Digg - http://digg.com/programming/Spam_free_accessible_forms Please Digg there or at the top of the article.

  13. Carl Avery Says:
    April 21st, 2007 at 6:30 pm

    I have a quick question, on my form submission page, which for me is insert.php, I want to know where does those if statements go, do they go anywhere inside of the Or does it have to go in a certain spot of the code Like at the end or the very beginning. I am very new at this and just got my form spammed and it is very frustrating. Can I also use multiple if statements within the php code as well, like for example can I use the first two examples you gave within the same php code area?

    Let me know,
    Carl (thanks)

  14. Peter Says:
    April 23rd, 2007 at 8:51 am

    I’m attempting to ascertain why the following code won’t set the $spam flag if the following occurs:

    if (preg_match( "/bcc:|cc:|multipart|[url|Content-Type:/i", implode($_POST))) {
    $spam=true;
    }
    if (!preg_match( "/bcc:|cc:|multipart|[url|Content-Type:/i", implode($_POST))) {
    $spam=false;
    }

    As far as I can tell, even putting bcc: or cc: into the form on purpose doesn’t set the flag to true. The only time it occurs is if I reduce the string match to just something like /bcc\:/i which is not good enough in this case. Am I missing some logic here? Thanks for any help that can be provided.

  15. Jared Smith Says:
    April 23rd, 2007 at 9:02 am

    Carl-

    The PHP code can be placed anywhere within your page. In this case, it should probably go toward the top of the page and then based upon what happens, you can then display an error message to the page. Something like…

    if ($spam==true) echo("Your message appears to be spam.");

    And yes, you can use as many if statements as you would like.

  16. Peter Says:
    April 23rd, 2007 at 9:03 am

    Sorry, the last statement should say “elseif”.

  17. Carl Avery Says:
    April 24th, 2007 at 7:57 am

    Jared,
    For some reason, i am having troubles getting any of the if statements to work, I am not sure if i am putting it to the right form, so if its not to much trouble I was wondering if you might be willing to see what I am doing wrong. Here is the code I am using on my page. I have people send their information from a questionnaire page, then it gets put into the database by using insert.php page. I am assuming that was the page you mentioned to be the form processing page. Here is the code i have on that page, see if you can see what I have done wrong.

    3) {
    $spam=true;
    }

    if((isset($_SERVER['HTTP_REFERER']) && stristr($_SERVER['HTTP_REFERER'],$_SERVER['HTTP_HOST']))) {
    $spam=true;
    }

    if(!empty($_POST['email2'])) $spam=true;

    if (!mysql_query($sql,$con))
    {
    die('Error: ' . mysql_error());
    }
    echo "Thank you for your submission. Please view the Directory to see your information, and your classmates as well.";

    mysql_close($con)
    ?>

    If you could debug this in laymans terms I would appreciate it.
    Thanks Carl

  18. Carl Avery Says:
    April 24th, 2007 at 8:01 am

    Sorry the first part of my code didnt make the last post.
    Here it is

    $con = mysql_connect("p41mysql3.secureserver.net",$username,$password);
    if (!$con)
    {
    die('Could not connect: ' . mysql_error());
    }

    mysql_select_db($database, $con);

    $sql="INSERT INTO student_information (LastNameSchool, LastNameCurrent, FirstName, SpouseName, StreetAddress, City, State, Zip, PhoneArea, PhonePrefix, PhoneSuffix, EmailAddress, FamilyInfo, PersonalInfo, Retrospect, Photo)
    VALUES
    ('$_POST[lastnameschool]','$_POST[lastnamecurrent]','$_POST[firstname]','$_POST[spousename]','$_POST[streetaddress]','$_POST[city]','$_POST[state]','$_POST[zipcode]','$_POST[areacode]','$_POST[prefix]','$_POST[suffix]','$_POST[email]','$_POST[familyinfo]','$_POST[accomplishments]','$_POST[retrospect]','_$POST[photo]')";

    $spamwords = "/(list|of|naughty|words|here)/i";
    if (preg_match($spamwords, implode($_POST))) {
    $spam=true;
    }

    if (preg_match_all("/<a|http:/i", implode($_POST), $out) > 3) {
    $spam=true;
    }

    if((isset($_SERVER['HTTP_REFERER']) && stristr($_SERVER['HTTP_REFERER'],$_SERVER['HTTP_HOST']))) {
    $spam=true;
    }

    if(!empty($_POST['email2'])) $spam=true;

    if (!mysql_query($sql,$con))
    {
    die('Error: ' . mysql_error());
    }
    echo "Thank you for your submission. Please view the
    Directory to see your information, and your classmates as well.";

    mysql_close($con)
    ?>

  19. Peter Chau Says:
    April 27th, 2007 at 4:03 am

    Carl, you are correct in that this code is to be included on the PHP page that is used to process the submitted form information, but look at your code carefully.

    You have condition statements that set the $spam flag to be true if it encounters any of those conditions. On its own however, the $spam flag is not much use. You have to use it as part of another condition statement that uses the value to either process the rest of the form or to stop it in its tracks if it encounters spam related content. You will need something like this:

    if(!spam)
    {

    }
    else
    {
    echo "Sorry, but your message appears to be spam.";
    }

    So basically, you want to check to see if the $spam variable is true and if it is, not to bother running the other database processing scripts.

  20. Peter Chau Says:
    April 27th, 2007 at 4:06 am

    Man… The if section is blank? Anyway, within the if statement you see there, you put all your form/database processing scripts in there.

  21. Peter Chau Says:
    April 27th, 2007 at 4:07 am

    Corrected:

    if(!$spam)
    {
    $con = mysql_connect("p41mysql3.secureserver.net",$username,$password);
    … etc…
    }
    else
    {
    echo "Sorry, but your message appears to be spam.";
    }

  22. Ken Mayer Says:
    May 1st, 2007 at 7:13 am

    I tried using the statements to set the $spam variable, but keep getting:

    Warning: preg_match() [function.preg-match]: Compilation failed: range out of order in character class at offset 32 in /home/.demitri/golem_herald/heralds.westkingdom.org/ProcessAwardInfo.php on line 64

    Line 64 is:

    if (preg_match( "/bcc:|cc:|multipart|[url|Content-Type:/i", implode($_POST)))

    What is weird is that sometimes it seems to work in my testing of the script and web page. The error that I want displayed displays, and the email is not sent (the one the form is building). Any suggestions? (I am checking with the administrator of the server/website to see if he can point me at anything that might help as well). The warning is a bit disconcerting …

  23. Jared Smith Says:
    May 1st, 2007 at 8:46 am

    Ken-

    It appears your installation of PHP is interpreting the - in “Content-Type” to be a range identifier. You should (haven’t tested to be sure) be able to fix this by escaping the - (e.g., Content\-Type).

  24. Ken Mayer Says:
    May 1st, 2007 at 10:30 am

    Hmm. Well, I tried adding the backslash in there, and it’s still giving the warning … however, it’s now a little different:

    Warning: preg_match() [function.preg-match]: Compilation failed: missing terminating ] for character class at offset 38 in /home/.demitri/golem_herald/heralds.westkingdom.org/ProcessAwardInfo.php on line 64

    (I appreciate the assistance!)

  25. Ken Mayer Says:
    May 1st, 2007 at 10:40 am

    A bit more experimentation, and changing the statement at line 64 to:

    if (preg_match( "/bcc:|cc:|multipart|[url]|Content\-Type:/i", implode($_POST)))

    MAY have fixed this. I am not sure. I do not understand regular expressions, so cannot guarantee it. But it may be fixed now. (I added the second square bracket after “url” in the line above)

  26. Ken Mayer Says:
    May 1st, 2007 at 10:45 am

    (I don’t want to appear to be spamming your blog here …) The above posting is not quite correct, as it is now finding *all* posts that I try as spam. I am outputting the value of the $spam variable, and it’s coming out True, whether or not there’s a URL in the data. I am wondering if perhaps removing the “[url]” would work … Nope. That didn’t do it either. I am completely stumped here. Sorry …

  27. Kia Says:
    May 1st, 2007 at 11:17 am

    This worked for me…

    if (preg_match(”/bcc:|cc:|multipart|\[url|Content\-Type:/i”, implode($_POST)).

    i.e. since we want “[url” we slash the “[” the same way that “-” is slashed.

  28. Ken Mayer Says:
    May 1st, 2007 at 11:28 am

    Kia — I will try that. Hmm. Weirder and weirder. Now I get no syntax errors in the preg_match function, but I am getting:

    Parse error: syntax error, unexpected '{' in /home/.demitri/golem_herald/heralds.westkingdom.org/ProcessAwardInfo.php on line 65

    All that is on line 65 is the { which I could have sworn was necessary …

    (Thanks!)

    I’m not real familiar with all the ins and outs of PHP, but …

  29. Ken Mayer Says:
    May 3rd, 2007 at 11:45 am

    FWIW, I gave up on the preg_match() function. It may simply be that there is some problem with the PHP interpreter being used on the server for the site I’m having this problem with. I did find a workaround, in case anyone is interested — it’s more lines of code, but it seems to be working fine:

    if ( strpos( strtoupper( $_POST['SCAName'] ), "CONTENT-TYPE:" ) !== false ||
    strpos( strtoupper( $_POST['SCAName'] ), "A HREF=" ) !== false ||
    strpos( strtoupper( $_POST['SCAName'] ), "HTTP:" ) !== false )
    {
    $spam="True";
    }

    The strpos() function returns false if a string (second parameter) is not contained in the larger string (first parameter). To avoid any case issues, I am using the strtoupper() function which converts the string to upper case, so I can compare the two only once. The three strings that seem
    to be the biggest problems and appear in most of these spams are: “Content-Type:”, the “a href=” tag, and “http:”. The difficulty is that any entry area that can accept input must be checked, so this IF statement is repeated several times for my web form.

    However, testing by filling the fields as they might normally be filled in, and testing them by using values in one of the spams I was getting, shows this is working.

    I hope someone finds this useful …
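The repetition described in this comment can be avoided by looping over every submitted field. The sketch below uses `stripos()`, PHP's case-insensitive substring search, so no upper-casing is needed. The function name `contains_any` is an assumption for illustration.

```php
<?php
// Hypothetical helper: true when any submitted value contains any of the
// given substrings, compared case-insensitively.
function contains_any(array $fields, array $needles): bool
{
    foreach ($fields as $value) {
        if (!is_string($value)) {
            continue; // skip nested arrays and other non-string values
        }
        foreach ($needles as $needle) {
            // stripos() returns false when not found and 0 for a match at the
            // very start, so the comparison must be strict (!==).
            if (stripos($value, $needle) !== false) {
                return true;
            }
        }
    }
    return false;
}
```

A call such as `$spam = contains_any($_POST, ['content-type:', 'a href=', 'http:']);` then covers the whole form in one line.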

  31. Lori O. Says:
    May 7th, 2007 at 12:56 pm

    Thank you so much for posting this. We’ve been getting slammed with spam and
    I’ve been looking high and low for an alternative to CAPTCHAs.

    Thank you! Thank you! Thank you!

  32. Marky D Says:
    May 9th, 2007 at 3:38 am

    I am very pleased with this list.
    I was thinking about implementing a very simple SPAM check (together with regexp solution you suggested).

    Here is how it works:
    - Set a Session variable with a random number in it.
    - Check on the confirmation page where the form is checked and processed if this Session corresponds with the Session set earlier on.
    - If it doesn’t…

    This could work or am I missing something?

  33. Jared Smith Says:
    May 9th, 2007 at 9:13 am

    Marky-

    Session variables will work great, except for in the few cases where the end user blocks cookies. I don’t know that I’d set a session variable only for a form - it seems like a lot of overhead to me. But if you’re using session variables for anything else on the site, this approach would work great.
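The session approach discussed in these two comments might look roughly like this. The function names and the `form_token` key are assumptions for illustration, and `$session` stands in for `$_SESSION` after `session_start()` has been called.

```php
<?php
// Hypothetical helpers; $session stands in for $_SESSION after session_start().
function issue_form_token(array &$session): string
{
    // 16 random bytes, hex encoded: store server-side and echo into a hidden input.
    $session['form_token'] = bin2hex(random_bytes(16));
    return $session['form_token'];
}

function verify_form_token(array &$session, $submitted): bool
{
    $expected = $session['form_token'] ?? null;
    unset($session['form_token']); // each token is valid for one submission only
    return is_string($expected) && is_string($submitted)
        && hash_equals($expected, $submitted);
}
```

The form page would call `issue_form_token($_SESSION)` and the processing page `verify_form_token($_SESSION, $_POST['form_token'] ?? null)`. As noted above, this fails for users who block cookies, so the error message should explain that.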

  34. RadiantMatrix Says:
    May 16th, 2007 at 1:43 pm

    I agree with you on form validation, but the regular expression you suggest to validate e-mail addresses is infuriating: many savvy users (myself included) use e-mail addresses with ‘+’ in the address-part. With many mail systems, that allows a user to cause all messages to that address be sent to the folder named either before or after the ‘+’. Sites that don’t let me use this frustrate me, and I will often just avoid commenting/buying/whatever.

    The ‘=’ is another possible separator of this type.

    So, while validation is helpful, it’s important to think it through so it doesn’t alienate legitimate users.

  35. ann Says:
    May 20th, 2007 at 9:53 pm

    I cannot figure out how to handle the spam code if it is true. Do you have your article out yet that includes some code to try out? I cannot get my condition to work out right if the spam is true.
    This is what I have:
    {
    if(!empty($_POST[’email1′])) $spam=true;
    {echo “You must enter a value for email1, otherwise your entry is viewed as spam.”;
    exit;}

  36. RvnPhnx Says:
    May 21st, 2007 at 2:12 pm

    Nice to see some inspired php code once in a while. I would like to note a few things, however.
    1. ALWAYS validate form input from the real world, and always SCRUB user input before feeding it back.
    2. Your hidden form field test, while accessible in spirit, does not strictly comply with section 508 (when interpreted most strictly). Yet another reminder that strict compliance is often not the point.
    3. Good thinking about blocking on mail headers. The idea had crossed my mind, but I hadn’t come across a good reason to use it yet (non-RFC compliant mail gets trashed by our systems at work). I’ll probably throw it in just to see what I catch.
    4. The PHP developers recommend using “preg_match” over “eregi” for speed of execution reasons (and frankly it is often just easier to write PERL-style regexes anyway).
    5. Your e-mail checker is likely to block certain classes of legitimate mail addresses and may not survive internationalization of e-mail addresses (when they get done fighting about how to do it). Unfortunately a lot of us are going to have to deal with that. I looked to RFC 3696 (an informational) for the inspiration behind my also mildly flawed e-mail address checker.
    6. Asking users to take one last look at their input before “final submission” is always a good idea for things which will be difficult to change. Unfortunately it can take a while to convince non-tech people of this. A link back to the original form (to edit the pre-selected input) is often a good idea when the review page is not also an edit page.

  37. RvnPhnx Says:
    May 24th, 2007 at 7:30 am

    @ann:
    Try this:

    if(!empty($_POST['email1'])){
    echo "You must enter a value for email1, otherwise your entry is viewed as spam.";
    exit;
    }

    (and then finish the code so that it writes a valid page)

  38. Mike Says:
    June 18th, 2007 at 6:01 am

    This was very useful. Thanks for taking the time to put this stuff together!

  39. emlak Says:
    July 15th, 2007 at 2:50 pm

    Will you be publishing the mail script on this blog?

  40. Manokaran Says:
    July 24th, 2007 at 12:58 am

    This spam-free-accessible-forms tutorials gave lot of information to me.. i really
    wish to say Thanks..!

  41. Gayrimenkul Says:
    August 1st, 2007 at 10:28 am

    I am very pleased with this list.
    I was thinking about implementing a very simple SPAM check (together with regexp solution you suggested).

  42. Fred Riley Says:
    August 15th, 2007 at 7:00 am

    This is a nice script, as I’ve commented before, and thanks for Kia’s intervention on the PHP interpreter errors which helped. I’ve now got this working ok on a form on the site above. One thing I’ve noticed in recent spam, though, is that the spambots aren’t putting “http://” in any more, but “ttp://”, presumably in the hope that whatever client receives the email will interpret this as a URL anyway. An example is:

    message_subject: Garrison chocolate Providence RI Nursehentai

    So instead of detecting “http” in the script, perhaps it would be better to look for “href”? Mind you, the search for

  43. Fred Riley Says:
    August 15th, 2007 at 8:00 am

    My previous comment got a bit truncated on account of including naughty proto-HTML tags in it. The example spam didn’t show up well, and the last line should have referred to the regex search for the anchor tag. Sorry :(

  44. txn Says:
    August 31st, 2007 at 12:38 pm

    Ok, I am stuck dealing with a Frontpage-created form (I know, I know). I can put the CSS code in there that hides the email entry box — but where would I put the following line?

    if(!empty($_POST['email'])) $spam=true;

    The page is in .html and has Javascript for validating if the form fields are empty. But how would I use this one as a standalone script?

  45. Rick Hill Says:
    October 1st, 2007 at 12:06 pm

    RvnPhnx notes that the hidden field approach is not strictly 508 compliant. Why not?

  46. Jared Smith Says:
    October 1st, 2007 at 1:37 pm

    Rick-

    I don’t see any Section 508 issues with the hidden form field approach.

    There are two 508 provisions that could apply. One says the page must be readable with style sheets disabled. In this case, the form field would be displayed - but with an adequate label (“Don’t enter anything in this textbox.”), I don’t see how this would be an issue. Besides, this provision deals with readability only, not forms navigation.

    The second simply requires that forms be accessible to those using AT. This certainly is.

  47. Rick Hill Says:
    October 1st, 2007 at 3:28 pm

    Jared,

    I agree. Just checking to see if there was something I was missing.

  48. tercüme Says:
    October 8th, 2007 at 7:02 am

    This code only flags it as spam if the referer is present AND is not the same as the web site. However, if the firewall sends an incorrect header (which it should never do), then there might be problems.
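
    The rule described here (flag only when a referer is present and points somewhere else) can be sketched as follows. The function name referer_looks_foreign() and the host www.example.com are placeholders, and this is one possible reading of the check rather than the article’s exact code:

```php
<?php
// Hypothetical helper: true when a Referer header is present but names
// a host other than the one the form lives on.
function referer_looks_foreign(?string $referer, string $expectedHost): bool
{
    if ($referer === null || $referer === '') {
        return false; // header absent (e.g. stripped by a firewall): do not flag
    }
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host)) {
        return true; // header present but contains no usable host
    }
    return strcasecmp($host, $expectedHost) !== 0;
}

// Usage in the form handler; 'www.example.com' is a placeholder host.
$spam = referer_looks_foreign($_SERVER['HTTP_REFERER'] ?? null, 'www.example.com');
```

    Because a missing header is never flagged, this only catches bots that send a wrong referer, not those that send none.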

  49. übersetzungen Says:
    October 17th, 2007 at 5:07 am

    A colleague pointed me to this article, and it is pretty useful - thanks for the tips, which are in essence simple but wouldn’t always occur to web admins.

  50. Ranjith Says:
    October 22nd, 2007 at 8:06 am

    Great article - simple, yet effective steps to tackle Spam.

    Yet another simple technique I use: Send an email to the person who filled the form asking him to click on a link to confirm if he indeed submitted it. Serves to validate the email as well.
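
    One way the confirmation link could work is with a random token that is stored alongside the pending submission and echoed back in the link. This is only a sketch; the function names are hypothetical, and storing the token and sending the email are left out:

```php
<?php
// Hypothetical helper: create an unguessable token to embed in the
// confirmation link (32 hexadecimal characters).
function new_confirmation_token(): string
{
    return bin2hex(random_bytes(16));
}

// Hypothetical helper: compare the stored token with the one from the
// clicked link, using a timing-safe comparison.
function token_matches(string $stored, string $supplied): bool
{
    return hash_equals($stored, $supplied);
}
```

    The submission would only be processed once token_matches() succeeds for the token in the clicked link.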

  51. Trish Says:
    October 24th, 2007 at 7:18 am

    I cannot get any of these to work properly. I either get undefined variable
    errors with the preg_match example or parse errors with the strpos example.
    I am just so new at PHP that I really need to see a full example in context
    to make things work, even if it’s a simple page. Does anyone have that?

  52. Polin Armsley Says:
    October 29th, 2007 at 1:27 pm

    Well, as I understand it, that makes the HTTP_REFERER approach pretty useless - a pirate isn’t going to send that, so the result will be the same as a firewall blocking that info. It’s really a shame, because otherwise this seemed to be the most elegant approach to this headache!

  53. matt Says:
    December 20th, 2007 at 11:21 am

    If you are using frontpage forms, simply make one of the fields required. This will at least prevent blank forms.

  54. Fred Says:
    January 25th, 2008 at 8:26 pm

    Ken (and others),

    I use this for catching the errant spam character.

    $spam = 0; // counter of suspicious findings
    foreach($_POST as $key => $val) {
        // Line breaks are only legitimate in the message body.
        if ($key != 'message') {
            if (stristr($val,"\r")) $spam++;
            if (stristr($val,"\n")) $spam++;
            if (stristr($val,'%0A')) $spam++;
            if (stristr($val,'%0D')) $spam++;
        }
        // Anchor tags and mail-header fragments are suspicious in any field.
        if (stristr($val,'<a')) $spam++;
        if (stristr($val,'content-type')) $spam++;
        if (stristr($val,'mime-version')) $spam++;
        if (stristr($val,'cc:')) $spam++;
    }

    …or some variant thereof in most of my contact forms. I’ve also started to employ the use of the occasional hidden-by-CSS field (that is not always in the same place, in conjunction with random field names), and something like sha1(md5("".gmtime().mt_rand(9999,99999))). Call me paranoid if you wish, but I ain’t gettin’ no spam (of course, there is probably the occasional legit message not getting through)!

  55. Fred Says:
    January 25th, 2008 at 8:27 pm

    looky there. should have escaped the newline chars. sorry folks. those were \r and \n

  56. Wynajem Autokarów Says:
    February 12th, 2008 at 4:40 am

    Great script, simple and easy to integrate. Thanks a lot!

  57. Richard Says:
    March 25th, 2008 at 7:57 pm

    Quite good. I use some similar ideas. (If == yes then is spammy for the following examples.)

    TIME CHECK

    On form page:
    $start_time = $_SESSION['start_time'] = time();

    In processor script:
    $timeSubmit = time();
    if ($timeSubmit - $_SESSION['start_time'] < 5)

    IP CHECK

    On form page:
    $ip1 = $_SESSION['ip1'] = $_SERVER['REMOTE_ADDR'];

    On processor page:
    $ip2=$_SERVER['REMOTE_ADDR'];
    if ($_SESSION['ip1'] !== $ip2)

    HTTP CHECK
    On processor page:
    if ( (preg_match("/http/i", $name)) || (preg_match("/http/i", $subject)) || (preg_match("/http/i", $message)) )

    Using techniques like these along with the usual CSS-driven bogus captcha fields, ordinary validation, etc., it’s still possible to build an accessible form that is almost bulletproof.
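
    The time check above can be wrapped in a small function so it is easier to test. This is a sketch under a few assumptions: the name submitted_too_fast() is made up, the 5-second threshold is taken from the snippet above, and (unlike that snippet) a missing start time is treated as spam:

```php
<?php
// Hypothetical helper: true when the form came back faster than a person
// could plausibly fill it in, or when no start time was recorded at all.
function submitted_too_fast(?int $startTime, int $now, int $minSeconds = 5): bool
{
    if ($startTime === null) {
        return true; // the form page was never loaded in this session
    }
    return ($now - $startTime) < $minSeconds;
}

// Usage in the processor script, after session_start():
// $spam = submitted_too_fast($_SESSION['start_time'] ?? null, time());
```

    Treating a missing start time as spam is a design choice; it also rejects visitors whose browsers refuse session cookies, so it may be too strict for some sites.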

    There are many other techniques, as well. None are particularly difficult. They just require thinking about the goals of a spammer and the differences between human and robot behavior, and condensing that information into code.

    One thing I heartily recommend is to land a spambot on the exact same success page that a successful submission would land on, to avoid triggering human review of the failure. Just send the bot to the success landing page, but kill the script before the mail is sent. For example:

    if (is spammy)
    {
    print "";
    die;
    }

    Best,

    Richard
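
    A minimal sketch of the “same success page” idea, assuming the mail-sending step is passed in as a callable. The function name handle_submission() and the message text are illustrative only:

```php
<?php
// Hypothetical helper: send mail only for genuine submissions, but return
// the same page text either way, so a bot sees nothing that distinguishes
// a rejection from a success.
function handle_submission(bool $isSpam, callable $sendMail): string
{
    if (!$isSpam) {
        $sendMail(); // only genuine submissions reach the mailer
    }
    return "Thank you. Your message has been received.";
}
```

    The trade-off is that a real person who is wrongly flagged gets no hint that the message was dropped, which runs against the article’s advice to show an informative error message.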

  58. Vic Says:
    April 4th, 2008 at 9:49 pm

    Copy and pasted from above.

    For instance, your form element may be inserted as

    Ignore this text box. It is used to detect spammers.
    If you enter anything into this text box, your message
    will not be sent.

    1. Can the above be put in a file?

    ———————————–

    Copy and pasted from above.

    You then simply detect if the form element is empty. If it is not, then it’s either a spambot or a user that has CSS disabled and did not follow the label instructions.

    if(!empty($_POST['email'])) $spam=true;

    This tactic, like all of those listed here, should still present a useful, informative error message in case the user somehow triggers your spam detection flag.

    2. Should this [if(!empty($_POST['email'])) $spam=true;] be put in a JavaScript, a PHP script, or a Perl script?

    If it is true it is spam, then send it to a different action url such as <form name="Spam" action="http://www.url.com/cgi-bin/spam.cgi

    or a redirect script

  59. Andres Says:
    April 14th, 2008 at 8:23 am

    Hello all! Yes, the ‘spam’ topic is so annoying for me also - these letters coming in churns and all the things - it was horrible for my work. I used Barracuda (too expensive), then SpamAssassin (not really that effective), then Postini but the same result and now I’m trying to use Gafana.com - having a trial period but it sounds like a good service - no spam at all, AT ALL!!!! No false positives either. Ok with the price. So, I’m inclined to have a long-term relationship with it. If anyone of you has any suggestions, please write them, would be really interesting!

  60. TheJoe Says:
    April 23rd, 2008 at 7:19 pm

    I’m a spammer.. ahah!! And I’m posting my spam message here! XD

    Seriously, really interesting post. I think I’m gonna use these lines of code on my site.

  61. Creating Spam Free Contact Pages for Your Website « JungleGeorge’s Weblog Says:
    May 2nd, 2008 at 11:31 am

    […] Spam-free accessible forms by Jared Smith- techniques for creating a spam-free contact form. […]

  62. FredB Says:
    May 5th, 2008 at 4:06 pm

    First of all, thanks for helping stop the spam problems out there today. I am not a PHP programmer but do use phpformmail for my clients’ forms. So I pasted the following line (the first one you offer at the top of this page):
    ——
    if (preg_match("/bcc:|cc:|multipart|\[url|Content-Type:/i", implode($_POST))) {
    $spam=true;
    }
    ——
    Here’s my dumb question: This code flags the message as spam but what else do I have to do? How does formmail know what to do when $spam=true; is detected? Is there more code necessary to tell formmail what to do when $spam=true; Something like “if this is spam do this.”
    OR…is this covered in the “implode ($POST)))” statement?

    Hope this makes sense and thanks again.

    Fred

  63. Jared Smith Says:
    May 5th, 2008 at 6:02 pm

    FredB-

    This assumes that you will have some other logic later in the file that checks for the spam variable and displays an error message if it is true and sends the e-mail (or whatever) if it is not true.

    Something like:
    if ($spam==true) {
    echo("I'm sorry, but this message appears to be spam.");
    }
    else {
    // send the e-mail message and show a success message.
    }

  64. FredB Says:
    May 5th, 2008 at 10:47 pm

    Thanks Jared, I’ll add that statement to my php file.

  65. Dennis Belmont Says:
    May 16th, 2008 at 3:03 pm

    I just used the “hidden field” method to great success. Instead of a blog post form, this one actually sent an email to my client AND a harvested email address. After implementation, spam immediately ceased (after having 90,000 in a week, my client was very happy to stop receiving them).

    I included a message for disabled CSS users stating not to fill in the field, or enter the word “human”.

    (Since the site is for the Disabled American Veterans, it was extremely important for the fix to be accessible.)

  66. Malliobiana Says:
    June 15th, 2008 at 1:54 pm

    A combination of Captcha and email activation is most effective, as it is too troublesome for the mass offenders, and even the minor offenders. Serious commenters will give real email addresses and will not mind, proud that their important contribution is recognized.

  67. tigra Says:
    July 4th, 2008 at 9:14 pm

    Hi, I was wondering if anyone can help me. I have tried putting the following on our form processing page to stop spammers leaving links in our posts:

    if (preg_match("/bcc:|cc:|multipart|\[url|Content\-Type:/i", implode($_POST))) {
    $spam=true;
    }
    if ($spam == true) {
    echo("Your message appears to be spam and was not processed. Please remove all links, code, and other spam-like content from your message and resubmit the form.");
    }

    but we are still receiving tons of spam which just links to other websites. Can anyone tell me how to stop people posting website links on our site? I am also using reCAPTCHA but that doesn’t seem to be stopping them either.

    I am fairly new to PHP so maybe I have entered something wrong? The spam links are driving me insane. Can someone please help?


WebAIM is an initiative of:
Center for Persons with Disabilities (CPD) Utah State University