Sun packs 150 billion web pages into meat locker

Getting your arms around the internet


Sun follows Google

But as Page reached for a patent on the idea, Kahle's brainstorm sparked two other minds over at Sun Microsystems. In the past, Sun has said that its Modular Datacenter - originally code-named Project Blackbox - grew out of a discussion between Sun chief technology officer Greg Papadopoulos and Danny Hillis, now co-chairman and chief technology officer of a California consulting operation called Applied Minds. But this morning, Papadopoulos acknowledged that the project sprang from Kahle, whom he had worked with at the Cambridge supercomputer maker Thinking Machines.

Wayback Machine - cables on movable trays

Blackbox racks are tracked. And cables too

"Danny Hillis and I developed the original concept for Project Blackbox, but our inspiration was Brewster - the first person we know of to say 'Hey, we should put a bunch of circuit boards in a shipping container and blow cold air over them,'" Papadopoulos told the gathered digerati this morning in the heart of California's Silicon Valley.

Papadopoulos and company officially announced the Sun Modular Datacenter, or Sun MD, in January 2008. And according to Jud Cooley, the project's director of engineering, Sun has shipped its shipping containers "in the low double digits" to operations as far-flung as the Radboud University Nijmegen Medical Centre in the Netherlands and the Belgian wind turbine outfit Hansen Transmissions.

Wayback Machine - smoke detector

Blackbox fire suppression

And now it's hosting the Wayback Machine in a container tucked between the Spanish tile roofs of its Santa Clara campus, just down the road from Google. Measuring 20 feet long by 8 feet deep by 8 feet high, the modular net history holds two petabytes of data - with space for another two.

Sun's cramped container includes eight server racks on sliding tracks, each racking nine Sun Fire X4500 "Thumper" servers running Solaris 10 and Sun's ZFS file system. And the necessary networking, power, cooling, and fire-fighting hardware is packed in as well. All it needs from the outside world is a power source (25kW per rack) and a cooling-fluid hook-up (ordinary tap water).

As you walk into the container, with the fans whirring and the racks tight on either side, you feel as if you've walked into a meat locker. Though it's slightly warmer. And it smells better. And you know it's crunching data. Holding 2 quadrillion characters of information, the Wayback Machine processes 500 queries per second, and it's growing at a rate of four billion data rows per month.
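
For a rough sense of scale, the figures quoted above can be multiplied out. This is a back-of-envelope sketch, not Sun's own sizing: it assumes every rack draws its full 25kW, that the two petabytes are spread evenly across the Thumpers, that one "character" is one byte, and that the 500 queries per second hold around the clock. The function name `container_totals` is ours, purely for illustration.

```python
# Back-of-envelope totals from the figures quoted in the article.
# Assumptions (not from Sun): full 25kW draw per rack, data spread evenly
# across servers, one character = one byte, constant query rate all day.
RACKS = 8
SERVERS_PER_RACK = 9
KW_PER_RACK = 25
STORED_BYTES = 2 * 10**15        # "2 quadrillion characters"
QUERIES_PER_SECOND = 500
SECONDS_PER_DAY = 86_400


def container_totals():
    """Return derived totals for the container as a dict."""
    servers = RACKS * SERVERS_PER_RACK
    return {
        "servers": servers,
        "peak_power_kw": RACKS * KW_PER_RACK,
        "tb_per_server": STORED_BYTES / servers / 10**12,  # decimal terabytes
        "queries_per_day": QUERIES_PER_SECOND * SECONDS_PER_DAY,
    }


if __name__ == "__main__":
    for name, value in container_totals().items():
        print(f"{name}: {value:,.1f}")
```

On those assumptions the container holds 72 servers, needs up to 200kW from the outside world, stores a little under 28TB per server, and fields about 43 million queries a day.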

Wayback Machine - spring-mounted racks

In case of earthquake, spring-mounts - but snug the bolts, lads

The rub is that this particular shipping container won't be shipped. The Wayback Machine will live at Sun forever - or at least until IBM buys the company and pulls the plug. But the 20-foot container is another step towards Kahle's dream of a digital Alexandria capable of surviving a Caesarean fire - and most any other earthly disaster.

"Even if this first data center never moves, it encapsulates engineering efforts in a building that's reproducible," Kahle told us. "It's something that's centrally manufacturable and shippable."

Meanwhile, Google has built an internet archive of its own. "They're storing more than they let on," Kahle says. But the aims of Google's modular data center project are, shall we say, more commercial. ®

Photos and additional reporting by Rik Myslewski

Biting the hand that feeds IT © 1998–2022