Subject: Improved email obfuscation
Newsgroups: lugnet.admin.suggestions
Date: Mon, 16 Jul 2007 01:07:45 GMT
  
I have a suggestion regarding the spam-prevention obfuscation that is applied to email addresses in http://news.lugnet.com message headers. I think the dummy phrases should not appear at the beginning or end of the address, only interposed between characters of the address.
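To illustrate what I mean by interposing, here is a minimal sketch. It is not how Lugnet actually does it; the helper name obfuscate_address and the default dummy text SPAMLESS are my own assumptions, and placing the dummy text right after the last dot is just one possible choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not Lugnet's code): put the dummy text inside the
# address, right after the last dot, instead of before or after the address.
sub obfuscate_address {
    my ($address, $dummy) = @_;
    $dummy = 'SPAMLESS' unless defined $dummy;   # assumed default dummy text
    # Copy the address, then insert the dummy text after its final dot.
    # An address with no dot is returned unchanged.
    (my $obfuscated = $address) =~ s/\.([^.]*)\z/.$dummy$1/;
    return $obfuscated;
}

print obfuscate_address('john@doe.com'), "\n";   # prints john@doe.SPAMLESScom
```

A person reading the result can still see where the dummy text is and remove it, but the address is no longer sitting intact in the page.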

I make this suggestion because when I search for my email address with Google, the only results whatsoever are on Lugnet news posts. When the dummy text appears at the end of an address, the address itself appears intact and perfectly valid. Furthermore, the dummy text is often demarcated by punctuation characters that rarely appear in email addresses, even if perhaps technically legal.

As a result, no special effort would be needed for an email harvesting crawler to recognize these cases. Indeed, my actual email address can be extracted from those pages Google turns up with the sample regular expression described at http://www.regular-expressions.info/email.html, a popular reference. For example:

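#!/usr/bin/perl
use strict;
use warnings;

# Sample header line with the dummy text appended after the address.
my $text = 'From: John Doe <john@doe.com#IHateSpam#> blah blah blah';
# Generic address pattern from regular-expressions.info, case-insensitive.
if ($text =~ m/\b([A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4})\b/i) {
    print "Found address: $1\n";
}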

Running this script, which uses the generic regular expression, reports the following:

Found address: john@doe.com

In other words, that particular obfuscation technique does not fool even the simplest search strategy. Consider the following cases:
  1. <^SayNoToSpam^john@doe.com>
  2. <john@doe.com#IHateSpam#>
  3. <john@doe.SPAMLESScom>
Only the third case keeps the example script from extracting John Doe’s correct email address. I suggest retaining that last obfuscation method and abandoning the “prefix” and “suffix” methods. I have no evidence that this would be a real improvement, other than my conclusion that the “obvious” prefix and suffix methods do not seem likely to be very effective.
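The three cases can be checked with a small variation of the earlier script. This is only a sketch: the helper name extract_address is mine, and it tests nothing beyond the one generic pattern used above, so it says nothing about what a more determined harvester could do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same generic address pattern as in the earlier example script.
my $pattern = qr/\b([A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4})\b/i;

# Hypothetical helper: return the first address the pattern extracts,
# or undef if the pattern does not match at all.
sub extract_address {
    my ($text) = @_;
    return $text =~ $pattern ? $1 : undef;
}

# The three obfuscation styles listed above.
my @cases = (
    '<^SayNoToSpam^john@doe.com>',
    '<john@doe.com#IHateSpam#>',
    '<john@doe.SPAMLESScom>',
);

for my $case (@cases) {
    my $found = extract_address($case);
    printf "%-30s => %s\n", $case, defined $found ? $found : '(no match)';
}
```

For the first two cases the script reports john@doe.com, and for the third it reports no match, because the letters following the last dot are not a short run ending at a word boundary.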

Thank you for your consideration, and thank you for Lugnet.

Jim


