Full-up Google choking on web spam?

Buddy, can you spare a server?


Webmasters have been seething at Google since it introduced its 'Big Daddy' update in January, the biggest revision to the way its search engine operates for years.

Alarm usually accompanies changes to Google's algorithms, as the new rankings can cause websites to be demoted, or disappear entirely. But four months on from the introduction of "Big Daddy," it's clear that the problem is more serious than any previous revision - and it's getting worse.

Webmasters now report sites not being crawled for weeks, with Google SERPS (search engine results pages) returning old pages, and failing to return results for phrases that used to bear fruitful results.

"Some sites have lost 99 per cent of their indexed pages," reports one member of the Webmaster World forum. "Many cache dates go back to 2004 January." Others report long-extinct pages showing up as "Supplemental Results."

This thread is typical of the problems.

With creating junk web pages is so cheap and easy to do, Google is engaged in an arms race with search engine optimizers. Each innovation designed to bring clarity to the web, such as tagging, is rapidly exploited by spammers or site owners wishing to harvest some classified advertising revenue.

Recently, we featured a software tool that can create 100 Blogger weblogs in 24 minutes, called Blog Mass Installer. A subterranean industry of sites providing "private label articles," or PLAs exists to flesh out "content" for these freshly minted sites. And as a result, legitimate sites are often caught in the cross fire.

But the new algorithms may not be solely to blame. Google's chief executive Eric Schmidt has hinted at another reason for the recent chaos. In Google's earnings conference call last month, Schmidt was frank about the extent of the problem.

"Those machines are full," he said. "We have a huge machine crisis."

And there's at least some anecdotal evidence to support the theory that hardware limitations are to blame.

"The issue I have now is Googlebot is SLAMMING my sites since last week, but none of it makes it into the index. If it's old pages being re-indexed or new pages for the first page, they don't show up," writes one webmaster.

The confusion has several consequences which we've rarely seen discussed outside web circles.

Giving Google the benefit of the doubt, and assuming the changes are intentional, one webmaster writes: "In which case Google's index, and hence effectively 'the Web as most people know it' is set to become a whole lot smaller in the coming weeks."

It's barely more than a year since Yahoo! and Google were engaged in a willy-waving exercise to claim who had the largest index. (See My spam-filled search index is bigger than yours!)

Now size, it seems, doesn't matter.

There's also the intriguing question raised by search engines that are unable to distinguished between nefarious sites and legitimate SEO (search engine optimization) techniques? The search engines can't, we now know, blacklist a range of well-establish techniques without causing chaos. In future, will the search engines need to code for backward bug compatibility?

And lingering in the background is the question of whether the explosion of junk content - estimates put robot-generated spam consists of anywhere between one-fifth and one-third of the Google index - can be tamed?

"At this rate," writes one poster on the Google Sitemaps Usenet group, in a year the SERPS will be nothing but Amazon affiliates, Ebay auctions, and Wiki clones.  Those sites don't seem to be affected one bit by supplemental hell, 301s, and now deindexing."

With $8 billion in the bank, Google is better resourced and more focussed than anyone - but it's still struggling. Financial analysts noted that its R&D expenditure now matches that of a wireline telco.

Only a cynic would suggest that poor SERPs drive desperate businesses to the search engines own classified ad departments - so if you want to play, you have to pay. Banish that unworthy thought at once.

(Thanks to Isham Research's Phil Payne for the tip).®

Bootnote: Something called OneWebDay - we're not kidding - is encouraging you to celebrate the web with a "special hand signal - you extend your middle three fingers and have your thumb and little finger touch in a circle. Not the gesture many webmasters are making this week.


Other stories you might like

  • Chip shortage forces temporary Raspberry Pi 4 price rise for the first time

    Ten-buck increase for 2GB model 'not here to stay' says Upton

    The price of a 2GB Raspberry Pi 4 single-board computer is going up $10, and its supply is expected to be capped at seven million devices this year due to the ongoing global chip shortage.

    Demand for components is outstripping manufacturing capacity at the moment; pre-pandemic, assembly lines were being red-lined as cloud giants and others snapped up parts fresh out of the fabs, and the COVID-19 coronavirus outbreak really threw a spanner in the works, so to speak, exacerbating the situation.

    Everything from cars to smartphones have been affected by semiconductor supply constraints, including Raspberry Pis, it appears. Stock is especially tight for the Raspberry Pi Zero and the 2GB Raspberry Pi 4 models, we're told. As the semiconductor crunch shows no signs of letting up, the Raspberry Pi project is going to bump up the price for one particular model.

    Continue reading
  • Uncle Sam to clip wings of Pegasus-like spyware – sorry, 'intrusion software' – with proposed export controls

    Surveillance tech faces trade limits as America syncs policy with treaty obligations

    More than six years after proposing export restrictions on "intrusion software," the US Commerce Department's Bureau of Industry and Security (BIS) has formulated a rule that it believes balances the latitude required to investigate cyber threats with the need to limit dangerous code.

    The BIS on Wednesday announced an interim final rule that defines when an export license will be required to distribute what is basically commercial spyware, in order to align US policy with the 1996 Wassenaar Arrangement, an international arms control regime.

    The rule [PDF] – which spans 65 pages – aims to prevent the distribution of surveillance tools, like NSO Group's Pegasus, to countries subject to arms controls, like China and Russia, while allowing legitimate security research and transactions to continue. Made available for public comment over the next 45 days, the rule is scheduled to be finalized in 90 days.

    Continue reading
  • Global IT spending to hit $4.5 trillion in 2022, says Gartner

    The future's bright, and expensive

    Corporate technology soothsayer Gartner is forecasting worldwide IT spending will hit $4.5tr in 2022, up 5.5 per cent from 2021.

    The strongest growth is set to come from enterprise software, which the analyst firm expects to increase by 11.5 per cent in 2022 to reach a global spending level of £670bn. Growth has fallen slightly, though. In 2021 it was 13.6 per cent for this market segment. The increase was driven by infrastructure software spending, which outpaced application software spending.

    The largest chunk of IT spending is set to remain communication services, which will reach £1.48tr next year, after modest growth of 2.1 per cent. The next largest category is IT services, which is set to grow by 8.9 per cent to reach $1.29tr over the next year, according to the analysts.

    Continue reading
  • Memory maker Micron moots $150bn mega manufacturing moneybag

    AI and 5G to fuel demand for new plants and R&D

    Chip giant Micron has announced a $150bn global investment plan designed to support manufacturing and research over the next decade.

    The memory maker said it would include expansion of its fabrication facilities to help meet demand.

    As well as chip shortages due to COVID-19 disruption, the $21bn-revenue company said it wanted to take advantage of the fact memory and storage accounts for around 30 per cent of the global semiconductor industry today.

    Continue reading
  • China to allow overseas investment in VPNs but Beijing keeps control of the generally discouraged tech

    Foreign ownership capped at 50%

    After years of restricting the use and ownership of VPNs, Beijing has agreed to let foreign entities hold up to a 50 per cent stake in domestic VPN companies.

    China has simultaneously a huge market and strict rules for VPNs as the country's Great Firewall attempts to keep its residents out of what it deems undesirable content and influence, such as Facebook or international news outlets.

    And while VPN technology is not illegal per se (it's just not practical for multinationals and other entities), users need a licence to operate one.

    Continue reading
  • Microsoft unveils Android apps for Windows 11 (for US users only)

    Windows Insiders get their hands on the Windows Subsystem for Android

    Microsoft has further teased the arrival of the Windows Subsystem for Android by detailing how the platform will work via a newly published document for Windows Insiders.

    The document, spotted by inveterate Microsoft prodder "WalkingCat" makes for interesting reading for developers keen to make their applications work in the Windows Subsystem for Android (WSA).

    WSA itself comprises the Android OS based on the Android Open Source Project 1.1 and, like the Windows Subsystem for Linux, runs in a virtual machine.

    Continue reading
  • Software Freedom Conservancy sues TV maker Vizio for GPL infringement

    Companies using GPL software should meet their obligations, lawsuit says

    The Software Freedom Conservancy (SFC), a non-profit which supports and defends free software, has taken legal action against Californian TV manufacturer Vizio Inc, claiming "repeated failures to fulfill even the basic requirements of the General Public License (GPL)."

    Member projects of the SFC include the Debian Copyright Aggregation Project, BusyBox, Git, GPL Compliance Project for Linux Developers, Homebrew, Mercurial, OpenWrt, phpMyAdmin, QEMU, Samba, Selenium, Wine, and many more.

    The GPL Compliance Project is described as "comprised of copyright holders in the kernel, Linux, who have contributed to Linux under its license, the GPLv2. These copyright holders have formally asked Conservancy to engage in compliance efforts for their copyrights in the Linux kernel."

    Continue reading
  • DRAM, it stacks up: SK hynix rolls out 819GB/s HBM3 tech

    Kit using the chips to appear next year at the earliest

    Korean DRAM fabber SK hynix has developed an HBM3 DRAM chip operating at 819GB/sec.

    HBM3 (High Bandwidth Memory 3) is a third generation of the HBM architecture which stacks DRAM chips one above another, connects them by vertical current-carrying holes called Through Silicon Vias (TSVs) to a base interposer board, via connecting micro-bumps, upon which is fastened a processor that accesses the data in the DRAM chip faster than it would through the traditional CPU socket interface.

    Seon-yong Cha, SK hynix's senior vice president for DRAM development, said: "Since its launch of the world's first HBM DRAM, SK hynix has succeeded in developing the industry's first HBM3 after leading the HBM2E market. We will continue our efforts to solidify our leadership in the premium memory market."

    Continue reading

Biting the hand that feeds IT © 1998–2021