Listen, Google and Amazon: Soon we may not even use tape for COLD STORAGE

Storage medium's resurgence may be temporary, says Chris Mellor


Opinion Tape's greatest strengths are its low cost/GB, low power cost when offline, endurance and capacity. Its greatest weaknesses are its latency, the time taken to locate and mount a tape in a library and the time taken to locate the data you want on it.

This combination of mount latency and streaming latency is like a red rag to the disk drive/array industry bull, a perpetual target. If only, it says, we could spin down disks between data accesses, then each drive would use no power, just like an offline tape cartridge.
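To put rough numbers on that latency gap (every figure here is an illustrative assumption, not a vendor spec), the time-to-first-byte sums look something like this:

```python
# Back-of-envelope time-to-first-byte; all figures are assumptions.

def tape_first_byte_s(mount_s=40.0, locate_s=60.0):
    """Robot fetches and mounts the cartridge, then winds to the data."""
    return mount_s + locate_s

def spun_down_disk_first_byte_s(spin_up_s=15.0, seek_s=0.01):
    """Spin the idle drive back up, then do an ordinary seek."""
    return spin_up_s + seek_s

print(tape_first_byte_s())            # minutes-scale for tape
print(spun_down_disk_first_byte_s())  # seconds-scale for spun-down disk
```

Even with generous assumptions for tape, the spun-down drive wins by roughly an order of magnitude, which is the whole pitch.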

Copan tried this with its MAID (Massive Array of Idle Drives) technology and it failed, with Copan crashing and its technology bought up by SGI. Of course SGI still sells it, but into a market niche which needs access to archived data at lower latency than tape.

The thing is: Copan was right about the promise of MAID technology - only the disks weren't cheap enough.

With large-capacity drives, the economics of spun-down disk arrays v tape libraries for lower-latency access to archives could make more sense. We emphasise could, as it depends on how much you are willing to pay for the faster access, if there is a premium.

It's all about the money

We know Facebook's Open Compute Project (OCP) has a cold storage vault configuration using shingled magnetic recording drives. Both Google (mail backup and more) and Amazon (Glacier - see previous link) have tape vaults in their storage estate. El Reg storage desk thinks that they only had a choice between tape and spun-down standard disk drives, with the economics and latency acceptability driving them to tape, although neither would confirm this.

Shingled drives could change that equation because, probably, the cost/GB of a 6TB shingled drive is a lot less than that of a 4TB drive and, over, say, 500,000 drives, that saving turns into a big sum of dollars.
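As a sketch of that arithmetic (the per-drive prices below are invented for illustration; Seagate has published no such figures):

```python
# Hypothetical fleet-level saving from cheaper cost/GB; prices are assumptions.

def cost_per_gb(drive_cost_usd, capacity_tb):
    return drive_cost_usd / (capacity_tb * 1000)

def fleet_saving_usd(n_drives=500_000,
                     conventional=(240.0, 4),   # ($ per drive, TB) - assumed
                     shingled=(270.0, 6)):      # ($ per drive, TB) - assumed
    conv_gb = cost_per_gb(*conventional)        # $0.060/GB
    smr_gb = cost_per_gb(*shingled)             # $0.045/GB
    fleet_gb = n_drives * shingled[1] * 1000
    return (conv_gb - smr_gb) * fleet_gb

print(f"${fleet_saving_usd():,.0f}")            # tens of millions of dollars
```

Shave a cent and a half off every gigabyte across half a million 6TB drives and the saving lands in the tens of millions: exactly the sort of sum that gets a hyperscaler's attention.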

Now it could be that Seagate, which is pushing shingling hard while feeling the hot draught from HGST's helium-filled drives on its back, favours shingled drives here because, let's face it, shingled drives are OK for reading but really poor at re-writing. Bulk archive storage with fixed content seems a pretty good fit for shingled drives.

So Seagate is pushing EVault, its cloud backup subsidiary, to come up with a shingled-drive-based cold data storage service. Incidentally, we now know a little bit more about that EVault project which El Reg first described here.

EVault Open Storage project?

Our understanding, based on clues scattered around the web and some connecting of the dots, is this:

  1. EVault has an Open Storage project using SMR disk drives with a cloud storage service for cold data based on an object store and using OCP-style hardware (Facebook Open Compute Project).
  2. It is known as LTS 2 for Long Term Storage number 2, and is based on OpenStack.
  3. Mikey Butler, EVault's IaaS VP for research and dev, and a member of its Open Storage team, has written extensively about its background here. Points made in that April 2013 blog include these:
    • The world of data is moving rapidly toward highly complex, multi-format objects and object stores.
    • We’ve been taking a good, hard look at the problem of holding digital data for the long haul.
    • It's to do with disk storage: Storage Density–Maximize spindle/CPU ratio, the purer the storage play, the better. [He says "spindle" and spindle equals disk.]
    • Power Conservation – The #1 OPEX driver by far, up to 70 per cent of ongoing cost.
    • Open Systems – Open source software and [Facebook] OCP hardware are the most cost-effective.
    • The OpenStorage project has the energy of a startup and yet is backed by a multi-billion dollar corporation, [EVault's parent company] Seagate.
  4. EVault's exec bio pages describe Butler thus:
    Mikey Butler, vice president of research and development, anchors a team of software engineers responsible for creating the company’s industry-leading, next-generation Infrastructure as a Service (IaaS) offering.
  5. EVault could be planning to market Open Storage into the media and entertainment market; witness this job posting. [Los Angeles: Senior Product Marketing Manager, Media and Entertainment: "We are seeking a Sr Product Marketing Manager to manage outbound marketing of our Open Storage offering in the media and entertainment vertical".]
  6. George Hoening is VP for Open Storage at EVault. When EVault joined OpenStack in March this year, the release stated: "EVault plans to build upon OpenStack, and to work closely with Seagate to design and deliver reliable, cost-effective, and simple to use cloud storage services.”

    Plus this: "[EVault] intends to focus its contributions around OpenStack Swift. Swift is a highly available, distributed object store that organisations can use to store large amounts of data efficiently, safely, and inexpensively."

    Hoening recently commented: "Swift's fully distributed architecture allows for extreme scalability. Its redundant and self-healing properties provide for enterprise class durability while simultaneously keeping storage management costs to a minimum.”

  7. EVault now has a senior director for open cloud storage named Amar Kapadia. It also now has an Open Storage director of engineering.
  8. Amazon's S3 API may be supported. Kapadia has blogged: "EVault is active in the Swift3 effort to provide an AWS S3 compatibility layer on top of Swift. At least in the storage use case, we see value in offering both the Swift API (allows innovation) and S3 API (allows interoperability with existing applications)."
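Butler's claim that power is the number-one OPEX driver is also easy to sanity-check. A minimal sketch, with assumed wattages and electricity tariff (not measured figures), of what spinning a drive down for most of the year does to its power bill:

```python
# Annual power cost for one drive; wattages and tariff are assumptions.
HOURS_PER_YEAR = 8760

def annual_power_cost_usd(active_w=8.0, idle_w=0.5,
                          duty_cycle=1.0, usd_per_kwh=0.10):
    """duty_cycle is the fraction of the year the drive is spinning."""
    avg_w = duty_cycle * active_w + (1 - duty_cycle) * idle_w
    return avg_w * HOURS_PER_YEAR / 1000 * usd_per_kwh

always_on = annual_power_cost_usd(duty_cycle=1.0)
cold = annual_power_cost_usd(duty_cycle=0.05)   # archive drive, rarely read
print(always_on, cold)
```

Per drive the sums are small, but multiplied across hundreds of thousands of drives, the mostly-spun-down case cuts the power line item by roughly 90 per cent under these assumptions.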

Your correspondent thinks that an EVault Open Storage archive service might well be announced within the next six months, and may use arrays of mostly spun-down 6TB Seagate shingled drives to provide a cloud-based object store with Glacier-level pricing and faster-than-Glacier restores. We already asked the company about this. Its reply was: "EVault has no comment on this," and we have no comment on that.

Leaving tape where exactly?

If, and it's a big "if", spun-down disk-based cold storage gets used for archive data needing low latency retrieval, then tape gets left with the archival data where retrieval latency is not that important.

But, and it's a big "but", that's only true if a tape archive is cheaper than a spun-down disk archive. If cold storage on disk costs virtually the same as cold storage on tape but retrieval is faster then why wouldn't you choose disk?

You would need to be certain that data longevity and integrity on disk was the same or better than on tape. And if it was, then the conclusion would be inescapable:

Tape is dead.

®
