Something has come to my attention, brought sharply into focus yesterday by a couple of revelations regarding this website and spam traffic. It seems that my website, your website, and potentially any website that is accessible through RSS (and, more specifically, Google RSS syndication and reader) is potential bait for the newest and, in my opinion, lowest form of internet thievery.

The topic of people stealing someone else’s content and using it as their own has been around for a while, and it has been quite the hot-button topic as of late. All sorts of web pundits have weighed in on what it is, how annoying it is, and how we need better anti-spam features to combat it. Some even offer suggestions on what to do about it. The most common answer is to contact their host and their ad revenue source.

I say all this because over the past week or so, I’ve been a victim of this exact crime, carried out not by a person by hand, but by a web scraping script.

The sad thing is that I’ve tracked down the exact person responsible and their host refuses to do anything. It might actually take legal action to resolve this issue.

I’ve decided to lay it all out for you here, in hopes that someone else might be able to prevent this from happening to them.

At the start, I received a few trackback spams, nothing out of the ordinary. I simply deleted them out of my moderation queue, just like any normal person would do. Then I got some more, and I started to realize they were all coming from the same link. Being the curious sort, I followed the link to a website where, so far, three of my posts are being used, in excerpt form, to generate ad revenue for someone else. Immediately angered, I traced their site to their host, LiquidWeb. I contacted LiquidWeb yesterday about the stolen content and link spamming. I made it very clear that stealing content is against copyright laws and falls under the DMCA. The “abuse” department at LiquidWeb told me that they saw nothing wrong with the posts and that it wasn’t their problem. I’ve since sent even more pointed emails back to them but have yet to hear anything.

Still wanting to get to the bottom of this, I started poking around their website further. I was being ripped off by a sub-domain, so I went a level up. It’s there that I read the following:

“Go away! This WPMU installation is private. You can’t sign up, and there’s nothing to read here. This is my experimentation blog. A place where I can test things out without any outside interference. Okay. Bye now.”

And, on a second page…

“I have to laugh a little, as I never envisioned someone bothering to read my “Go Away” post. Lo and behold, Dan of Dan Q’s Blog did and was even kind enough to link to me. So, I figured that I could return the favor.

He mentions the weird names of authors in the posts. I get a lot of comments about that. To be honest, the script I’m using for the auto-posting was not written by me, so I’m not sure where it gets those names.”

This tells me two things. First, that I’m not alone in being ripped off and second, that it’s a script written specifically for auto-posting. The link to Dan’s website, which I kept in the quote, leads to a post that says basically what I’m saying now: that they’ve had posts duplicated and been spammed by trackbacks. The website in question on Dan’s blog is in fact the exact same one I’m dealing with. I won’t give them the satisfaction of linking to it.

Digging deeper, there was an update to Dan’s post:

Apparently the mastermind behind the whole scam (handle “SEO_Mike”) explains it here.

Well, now that’s just the jackpot in terms of information. According to SEO_Mike, there’s quite a bit of money to be made by “RSS scraping and auto blogging” as he calls it.

They even decided to run a little contest to see who can make the most money. There are quite a few interesting tidbits you can glean from that forum:

“I’m going to participate in this as well using a WordPress mU setup”

“The blogs I just set up are off the top of my head or ideas taken from various sources. SearchEngineWatch has a good list of sites to get ideas from.”

“I’m going to be using Adsense primarily for these blogs, and some targetted CPA ads (mostly from Copeac, of course). The site is setup on a LiquidWeb VPS account, so I should have plenty of room to grow.”

“The goal of the sites is to make money. Period. So don’t get too attached to one site / one idea. Diversify. This is about numbers and doing something.”

I’m so angry, I actually feel physically ill. I haven’t felt this mad since someone threatened a member of my family when I was a teenager. I’m going to keep hounding LiquidWeb, as they are directly responsible for this website being operational. I’m also going to pound on AdSense and Copeac to get their ad revenues pulled.

I can’t actually express in words my anger at the moment. This bullshit has got to stop. It’s time for the internet citizens to get their pitchforks and torches.