Defeat Spam Blogs With IP Based Content Delivery
Posted October 17, 2009 – 18:20 in: Nullamatix, syndicated
The majority of bloggers are forced to deal with spam blogs (splogs, aka scraper blogs), and even though a variety of countermeasures exist, they just don’t seem to do the trick. Most of the time, splogs will scrape only an excerpt from the post, making the permalink at the bottom of the post useless. Some harvesting software is even smart enough to strip out these attempts at foiling the scrapers, so what’s a blogger to do? Today, I introduce a way to deliver entirely different content to these spammers via IP based content delivery.
First Things First
In order for this to work, we’ll need a list of IP addresses that known offenders use. For your convenience, I’ve compiled this massive list of 5,780 offending IPs (though I highly recommend you build your own list from your own server logs). Copy those IPs and save them to your server’s root directory with a filename of your choice. Remember the filename; you’ll need it in just a second. Now that you’ve got your enemy plotted, let’s get to the code.
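If you want to build your own list from server logs, a rough sketch like the one below can help spot the busiest addresses. It assumes each log line starts with the client's IPv4 address (as in the common Apache log formats); the function name tally_ips() and the sample lines are made up for illustration, and deciding which addresses are actually scrapers is still up to you.

```php
<?php
// Hypothetical helper: count requests per client IP in access-log lines.
// Assumes each line begins with the client's IPv4 address.
function tally_ips(array $logLines): array {
    $counts = [];
    foreach ($logLines as $line) {
        // Grab a leading dotted-quad address; skip lines that do not start with one.
        if (preg_match('/^(\d{1,3}(?:\.\d{1,3}){3})\s/', $line, $m)) {
            $counts[$m[1]] = ($counts[$m[1]] ?? 0) + 1;
        }
    }
    arsort($counts); // busiest addresses first, keys (IPs) preserved
    return $counts;
}

// Made-up sample lines, for illustration only.
$sample = [
    '203.0.113.7 - - [17/Oct/2009:18:20:00 +0000] "GET /feed/ HTTP/1.1" 200 5120',
    '198.51.100.23 - - [17/Oct/2009:18:20:05 +0000] "GET / HTTP/1.1" 200 2048',
    '203.0.113.7 - - [17/Oct/2009:18:21:00 +0000] "GET /feed/ HTTP/1.1" 200 5120',
];

foreach (tally_ips($sample) as $ip => $hits) {
    echo $ip, "\t", $hits, "\n";
}
```

With a real log you would pass the result of file() on your access log instead of $sample; the log's location depends on your host.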
Backup, Modify, Test
Back up your theme’s single.php to single.post.original.txt or a name of your choice. Now open single.php in a text editor and insert the code below at the very top of the file.
Please note: I take absolutely no credit for this code. The original source is located here.
<?php
function chkiplist($ip) {
    $lines = file("THE-FILENAME-OF-IP-LIST.txt");
    $found = false;
    // Encode the visitor IP as "1" followed by four zero-padded octets.
    $split_it = explode(".", $ip);
    $ip = "1" . sprintf("%03d", $split_it[0]) .
        sprintf("%03d", $split_it[1]) . sprintf("%03d", $split_it[2]) .
        sprintf("%03d", $split_it[3]);
    foreach ($lines as $line) {
        $line = chop($line);
        $line = str_replace("x", "*", $line);
        // The rest of this pattern was lost in the source; stripping letters is an assumption.
        $line = preg_replace("|[A-Za-z]|", "", $line);
        $max = $line;
        $min = $line;
        // "*" stands for a whole octet.
        if (strpos($line, "*") !== false) {
            $max = str_replace("*", "999", $line);
            $min = str_replace("*", "000", $line);
        }
        // "?" stands for a single digit.
        if (strpos($line, "?") !== false) {
            $max = str_replace("?", "9", $max);
            $min = str_replace("?", "0", $min);
        }
        if ($max == "") { continue; }
        // Ranges written as "low - high": use the right-hand side when it contains a dotted address.
        if (strpos($max, " - ") !== false) {
            $split_it = explode(" - ", $max);
            if (!preg_match("|\d{1,3}\.|", $split_it[1])) {
                $max = $split_it[0];
            } else {
                $max = $split_it[1];
            }
        }
        if (strpos($min, " - ") !== false) {
            $split_it = explode(" - ", $min);
            $min = $split_it[0];
        }
        // Per-octet ranges such as "1-50": the high end goes into $max.
        $split_it = explode(".", $max);
        for ($i = 0; $i < 4; $i++) {
            if ($i == 0) { $max = 1; }
            if (strpos($split_it[$i], "-") !== false) {
                $another_split = explode("-", $split_it[$i]);
                $split_it[$i] = $another_split[1];
            }
            $max .= sprintf("%03d", $split_it[$i]);
        }
        // Same again for $min, taking the low end of each per-octet range.
        $split_it = explode(".", $min);
        for ($i = 0; $i < 4; $i++) {
            if ($i == 0) { $min = 1; }
            if (strpos($split_it[$i], "-") !== false) {
                $another_split = explode("-", $split_it[$i]);
                $split_it[$i] = $another_split[0];
            }
            $min .= sprintf("%03d", $split_it[$i]);
        }
        if (($ip <= $max) && ($ip >= $min)) {
            $found = true;
            break;
        }
    }
    return $found;
}
$status = chkiplist($_SERVER['REMOTE_ADDR']);
?>
Ok, now what? Change the third line so that it specifies the filename you saved your IPs to, then look for:
<?php the_content(); ?>
Immediately before that line, add something similar to the following:
<?php if ($status == 1): ?>
Hey, thanks for scraping my post:
<a href="<?php the_permalink(); ?>" title="<?php the_title(); ?>">
<?php the_title(); ?><br />
Click here to see the site this content was stolen from!
<?php the_excerpt(); ?>
<?php the_title(); ?></a>
Original Source: <?php the_permalink(); ?>
<?php else: ?>
The spam bots will see this:
Hey, thanks for scraping my post:
Defeat Spam Blogs With IP Based Content Delivery
Defeat Spam Blogs With IP Based Content Delivery
Original Source:
www.nullamatix.com/defeat-spam-blogs-with-ip-based-content-delivery/
We’re not done yet. To prevent any PHP errors, you’ll need to add this:
<?php endif; ?>
immediately after:
<?php the_content(); ?>
The whole thing, minus the chkiplist() function defined above, should look something like this:
<div class="entry-content">
<?php if ($status == 1): ?>
Hey, thanks for scraping my post:
<a href="<?php the_permalink(); ?>" title="<?php the_title(); ?>">
<?php the_title(); ?><br />
Click here to see the site this content was stolen from!
<?php the_excerpt(); ?>
<?php the_title(); ?></a>
Original Source: <?php the_permalink(); ?>
<?php else: ?>
<?php the_content(); ?>
<?php endif; ?>
</div>
To test everything out and make sure your blog is up and running properly, just visit a post like you normally would. If the content displays as usual, you’re good to go. To test the scraper’s view, just add your own IP to the list of known spammers, then remove it again once you’ve seen the alternate content. The script above also supports wildcards, among other variations. Check out the original source mentioned above for more details.
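To see why a wildcard entry works, here is a small sketch of the comparison trick chkiplist() relies on: every address is turned into a "1" followed by four zero-padded octets, and a "*" entry is expanded into a lowest and highest key. The helper names encode_ip() and wildcard_bounds() are my own, made up for illustration; they are not part of the original script.

```php
<?php
// Encode an IPv4 address the way chkiplist() does:
// a leading "1" followed by four zero-padded octets.
function encode_ip(string $ip): string {
    $key = "1";
    foreach (explode(".", $ip) as $octet) {
        $key .= sprintf("%03d", $octet);
    }
    return $key;
}

// Hypothetical helper: expand a wildcard entry such as "192.168.1.*"
// into its lowest and highest encoded keys.
function wildcard_bounds(string $entry): array {
    $min = encode_ip(str_replace("*", "000", $entry));
    $max = encode_ip(str_replace("*", "999", $entry));
    return [$min, $max];
}

[$min, $max] = wildcard_bounds("192.168.1.*");
$visitor = encode_ip("192.168.1.42");

echo $min, "\n";     // 1192168001000
echo $max, "\n";     // 1192168001999
echo $visitor, "\n"; // 1192168001042

// Numeric strings compare as numbers, so this is a plain range check.
var_dump($visitor >= $min && $visitor <= $max); // bool(true)
```

Because every key has the same number of digits, a single pair of comparisons is enough to decide whether the visitor falls inside the entry.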
To Conclude…
This won’t immediately work on every new splog that comes out, but if you actively check your server logs, you can stop most of ‘em by adding the offending IP(s). Now for the real question: what other purposes might this nifty little script serve? Just use your imagination – there is a hidden agenda behind this entire post.
Post Originally Published at: Nullamatix.com – Technology Made Simple