Scramble a JavaScript string in PHP

Web pages often need to show email addresses, and those addresses need protecting against email miners that add them to spam databases. Email miners are automated programs (bots) that scan websites and harvest various things; email addresses are a popular target. There are simple tricks that mislead the majority of email miners. One is to blend HTML with a little JavaScript so that the address in the HTML is wrong and the JavaScript corrects it. The idea is that email miners scan the HTML of a page but usually don't run its scripts, so they read whatever is in the HTML, not what JavaScript assembles at run time.

<p>If you require more information please <a href="mailto:spam@a.domain.that.doesnt.exist.and.will.never.exist" id="a">email me</a> your question.</p>

<script type="text/javascript">
document.getElementById("a").setAttribute("href", "mai" + "lto:real.ema" + "il@exa" + "mple.c" + "om");
</script>

This script does the job. The HTML contains an email address, but a wrong one, and the JavaScript that follows replaces it with the real one. All this happens within a few milliseconds, so it is invisible to human users; in other words, the user experience is unaffected.

Notice the string in the JavaScript. It reads "mailto:real.email@example.com", but it is split into pieces so that email miners have a hard time harvesting it.
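To see why the split helps, here is a minimal sketch of a naive harvester, assuming it is nothing more than a regular-expression scan over the raw page source. The names `harvest` and `EMAIL_PATTERN` are hypothetical and invented for this illustration; real miners vary a lot.

```javascript
// Hypothetical naive harvester: a regular-expression scan over raw page source.
// The pattern is a deliberately simple assumption, not a full address grammar.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function harvest(source) {
  // match() with the g flag returns every match, or null when there is none.
  return source.match(EMAIL_PATTERN) || [];
}

// The page from the first example, as a bot would download it.
const pageSource = [
  '<a href="mailto:spam@a.domain.that.doesnt.exist.and.will.never.exist" id="a">email me</a>',
  '<script type="text/javascript">',
  'document.getElementById("a").setAttribute("href", "mai" + "lto:real.ema" + "il@exa" + "mple.c" + "om");',
  '</script>',
].join('\n');

const found = harvest(pageSource);
console.log(found);
```

Run under Node.js, the scan picks up only the decoy address from the HTML. The pieces of the real address never sit next to each other in the source, so the pattern has nothing to match.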

Email miners come in many varieties: some are completely stupid, some more intelligent. This article shows a way to fight email miners up to a medium level of intelligence, by scrambling the composition of "mailto:real.email@example.com" even more than in the previous example. This little PHP function does the magic:

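// Turns a string into a randomly split JavaScript string expression.
// Assumes single-byte (ASCII) input, which covers ordinary email addresses.
function js_string_scramble($input) {
    $result = '';

    while ($input != '') {
        // Join the pieces with the JavaScript concatenation operator.
        if ($result != '') {
            $result .= ' + ';
        }

        switch (mt_rand(0, 2)) {
            case 1:
                // One character, written as its character code.
                $result .= 'String.fromCharCode(' . ord($input[0]) . ')';
                $input = substr($input, 1);
                break;

            case 2:
                // One to three characters, one of them as a \uXXXX escape.
                $len = mt_rand(1, 3);
                $str = substr($input, 0, $len);
                $input = substr($input, $len);

                $i = mt_rand(0, strlen($str) - 1);

                // Quotes and backslashes in the literal parts are escaped
                // so they cannot break the JavaScript string.
                $result .= "'" . addcslashes(substr($str, 0, $i), "'\\") . sprintf('\\u%04x', ord($str[$i])) . addcslashes(substr($str, $i + 1), "'\\") . "'";
                break;

            default:
                // One to three characters as a plain string literal.
                $len = mt_rand(1, 3);
                $str = substr($input, 0, $len);
                $input = substr($input, $len);

                $result .= "'" . addcslashes($str, "'\\") . "'";
        }
    }

    return $result;
}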

Here is an example of how to run the function:

echo js_string_scramble('mailto:real.email@example.com');
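If you want to experiment with the three encodings without a PHP setup, the following is a rough JavaScript port, offered as a sketch rather than as part of the original technique. The name `jsStringScramble` and the injectable `rand` parameter are assumptions made for this illustration. The sketch escapes quotes and backslashes inside literals and assumes the input contains no line breaks.

```javascript
// Hypothetical JavaScript port of js_string_scramble(), for illustration only.
// `rand` is an injectable random source returning a number in [0, 1).
function jsStringScramble(input, rand = Math.random) {
  // Random integer in [min, max], mirroring PHP's mt_rand(min, max).
  const randInt = (min, max) => min + Math.floor(rand() * (max - min + 1));
  // Escape backslashes and single quotes so a piece stays a valid literal.
  const escapeLiteral = (s) => s.replace(/[\\']/g, '\\$&');
  // \uXXXX escape for one UTF-16 code unit.
  const unicodeEscape = (ch) =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0');

  const pieces = [];
  let rest = input;
  while (rest !== '') {
    const choice = randInt(0, 2);
    if (choice === 1) {
      // One character, written as its character code.
      pieces.push('String.fromCharCode(' + rest.charCodeAt(0) + ')');
      rest = rest.slice(1);
      continue;
    }
    // One to three characters taken from the front of the remaining input.
    const chunk = rest.slice(0, randInt(1, 3));
    rest = rest.slice(chunk.length);
    if (choice === 2) {
      // Replace one randomly chosen character with a \uXXXX escape.
      const i = randInt(0, chunk.length - 1);
      pieces.push("'" + escapeLiteral(chunk.slice(0, i)) +
        unicodeEscape(chunk[i]) + escapeLiteral(chunk.slice(i + 1)) + "'");
    } else {
      // Plain string literal.
      pieces.push("'" + escapeLiteral(chunk) + "'");
    }
  }
  return pieces.join(' + ');
}

// Round trip: evaluating the generated expression should give the input back.
const address = 'mailto:real.email@example.com';
const expression = jsStringScramble(address);
const roundTrip = new Function('return ' + expression + ';')();
console.log(expression);
console.log(roundTrip === address);
```

The round trip at the end is the useful part: whatever random split is chosen, the expression should evaluate to the original string.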

And here is how you can change our first example to use the scrambled string:

<p>If you require more information please <a href="mailto:spam@a.domain.that.doesnt.exist.and.will.never.exist" id="a">email me</a> your question.</p>

<script type="text/javascript">
document.getElementById("a").setAttribute("href", <?php echo js_string_scramble('mailto:real.email@example.com'); ?>);
</script>

In case you wonder what the function's output looks like, here are a few examples. Notice that the output is random, so every run gives a different result.

// output 1
'mai' + '\u006c' + 'to' + ':' + '\u0072' + 'e' + 'al' + String.fromCharCode(46) + 'em' + String.fromCharCode(97) + 'i\u006c' + String.fromCharCode(64) + '\u0065' + '\u0078' + 'a\u006d' + '\u0070' + 'le\u002e' + 'com'

// output 2
String.fromCharCode(109) + 'ai' + 'l' + 'to:' + '\u0072' + 'ea' + String.fromCharCode(108) + String.fromCharCode(46) + String.fromCharCode(101) + '\u006d' + String.fromCharCode(97) + 'il' + '\u0040e' + 'x\u0061m' + 'pl' + 'e.c' + 'om'

// output 3
'ma' + String.fromCharCode(105) + String.fromCharCode(108) + 'to:' + 're\u0061' + 'l\u002e' + 'e\u006da' + '\u0069' + String.fromCharCode(108) + '@e' + 'xam' + 'p\u006c' + 'e' + '\u002e' + 'co' + '\u006d'
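A quick way to convince yourself that such an expression still spells the right address is to evaluate it. The sketch below pastes output 3 into a constant; the variable name `scrambled` is only for illustration.

```javascript
// Output 3 from above, pasted in as an ordinary JavaScript expression.
const scrambled = 'ma' + String.fromCharCode(105) + String.fromCharCode(108) +
  'to:' + 're\u0061' + 'l\u002e' + 'e\u006da' + '\u0069' +
  String.fromCharCode(108) + '@e' + 'xam' + 'p\u006c' + 'e' + '\u002e' +
  'co' + '\u006d';

console.log(scrambled); // prints "mailto:real.email@example.com"
```

The browser does exactly this when it runs the script, which is why human visitors get a working link while the page source never contains the address in one piece.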