Matt Mullenweg, founder of the popular open source weblog software WordPress and a CNET employee, has admitted to gaming the web's search engines by hosting tens of thousands of "articles" that contain hidden, paid-for keywords.
Mullenweg hosted at least 160,000 pieces of "content" on his site wordpress.org that use a cloaking technique to hide keywords such as "asbestos", "debt consolidation" and "mortgages". Mullenweg was paid a flat fee by Hot Nacho Inc., which creates software for search engine gamers to use. The practice has been dubbed "Adsense bait" - Adsense is Google's keyword-based classified advertising service.
On its site, Hot Nacho boasts that its software generates a higher placement in the search engines.
"Content written in ArticleWriter easily and naturally score 4-500% better in search-engine results. For the first time ever, high-quality articles written by real field professionals will outscore low-quality content generated by web developers and automated scripts. And that is the way things should be."
We wonder if that's a battle Tim Berners-Lee could possibly have imagined when he helped invent the world wide web. But it's the stark reality of the web today, where spam pays and a vast industry of Google gamers has sprung up overnight, seeking to place their clients high up in the search results.
In a neat bit of vertical integration, Hot Nacho also pays writers Bangalore-style word rates for content to fill out the garbage web pages: as little as $3 for a 300-word article. Another piece of Hot Nacho software collates the garbage into a coherent spam campaign:
"ArticlePublisher receives XML article documents submitted by Editors and transforms them into fully optimized web pages utilizing all the latest SEO knowledge. Articles are then routed out to participating websites. A simple website-specific template allows tremendous site-based formatting flexibility. Features such as RSS-feed generation and keyword interlinking are automatically integrated."
So it's nice to know that they're fully buzzword-compliant.
"Democracy on the web works," tootles Google on a page entitled 'Our Philosophy' - but this depends on how many times you can vote, and it's a race in which gamers have stayed one step ahead of the search engines.
Google has said the scheme is a clear violation of its rules, and that it will delete all the pages in the articles subdirectory.
A new dimension to blog spam
Reaction was negative.
"It’s destructive, and should be criminal," writes Peter da Silva. "Getting blocked by google should be considered getting off lightly."
"This stinks like last week's fish," writes Jason Kottke. "Contributing to spam noise on the web is annoying." It's a little more grave than an annoyance - there are economic implications that affect many webloggers and the major search engines, as we'll see.
"Blog software is already accused of being responsible for dilluting search results and stuff like this throws oil into the fire," notes Manuel Almedia, who sees consequences for the good name of GPL software.
Mullenweg is on holiday from his job at CNET this week, but in an earlier exchange with Waxy.org he appeared unaware of the ethical dimension.
He said the scam "isn't something I want to do long term ... but if it can help bootstrap something nice for the community, I'm willing to let it run for a little while." His partner in the spam caper was in denial today, and pleaded exhaustion.
Meanwhile, we'll be checking to see whether Mullenweg has been using the technique in his day job.
The future of Google and Yahoo! is closely tied to contextual advertising, which forms a significant part of their revenue. Both owe their astronomical growth to the service, and a number of small web sites have found Adsense revenue a valuable supplement. Last year, a Google executive admitted that the company viewed combatting abuse of its Adsense program as its number one concern. If it fails to thwart the spammers, then the much-vaunted "Long Tail" could turn into the "Long Trail of Smoke". ®
Bootnote: Mullenweg employed "negative positioning", which uses a CSS directive to place the text offscreen, out of sight of the user, but where search engines can still read it.
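For the curious, here is a rough sketch of how such hidden text might be spotted in a page's markup. It is not how Google or anyone else actually does it: the function names, the -500px threshold and the list of CSS properties checked are all our own assumptions, and it only looks at inline style attributes, not external stylesheets.

```python
import re
from html.parser import HTMLParser

# Assumed threshold: offsets at or below this many pixels count as
# "pushed off the visible page". The value is illustrative only.
OFFSCREEN_PX = -500

# Matches declarations such as "left: -9999px" or "text-indent:-5000px".
# The property list is an assumption, not an exhaustive catalogue.
_OFFSET_RE = re.compile(
    r"(?:^|;)\s*(left|top|text-indent|margin-left)\s*:\s*(-?\d+(?:\.\d+)?)px",
    re.IGNORECASE,
)


def is_offscreen(style: str) -> bool:
    """Return True if an inline style positions content far offscreen."""
    for _prop, value in _OFFSET_RE.findall(style):
        if float(value) <= OFFSCREEN_PX:
            return True
    return False


class HiddenTextFinder(HTMLParser):
    """Collects text found inside elements styled to sit offscreen.

    Limitation: unclosed void tags such as <br> inside a hidden element
    will throw off the depth count; this is a sketch, not a full parser.
    """

    def __init__(self):
        super().__init__()
        self._depth = 0   # greater than zero while inside a hidden element
        self.hidden = []  # text fragments found offscreen

    def handle_starttag(self, tag, attrs):
        style = dict(attrs).get("style") or ""
        if self._depth or is_offscreen(style):
            self._depth += 1

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth and data.strip():
            self.hidden.append(data.strip())


def find_hidden_text(html: str) -> list:
    """Return the text fragments that sit inside offscreen elements."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden
```

Feed `find_hidden_text` a page's HTML and it returns whatever text sits inside elements shoved far off to the left or top, which is where the keywords in this case were parked.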