Google has shared a White Paper (PDF) in which it calls for major revisions to disk drive design.
Titled “Disks for Data Centers”, the paper is unashamedly Google-centric inasmuch as it calls for disk-makers to rethink their products to suit the ad giant's needs. As the paper explains, those needs are substantial: YouTube alone requires a petabyte of new storage every day.
The paper says Google meets that requirement with disks designed for servers, not cloud-scale storage. The document also argues that the rise of cloud computing means cloud-specific disks need to be invented sooner rather than later, because plenty of disk will now never see a conventional server.
“The industry is relatively good at improving GB/$ [gigabytes per dollar], but less so at IOPS/GB [input/output operations per second per gigabyte],” the paper says. A desire to see the industry improve both informs Google's shopping list for a dream cloud disk.
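To see why the two metrics can pull apart, here is a minimal sketch in Python. The two drives and all of their figures are hypothetical, chosen only to illustrate the arithmetic, and none of them comes from the paper: capacity quadruples, price doubles, and random IOPS stays flat.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    """A hypothetical drive; every figure used below is illustrative only."""
    name: str
    capacity_gb: float
    price_usd: float
    iops: float  # random I/O operations per second the drive can sustain

    @property
    def gb_per_dollar(self) -> float:
        return self.capacity_gb / self.price_usd

    @property
    def iops_per_gb(self) -> float:
        return self.iops / self.capacity_gb


# Assumed figures: capacity quadruples, price doubles, IOPS does not move.
older = Drive("older", capacity_gb=2_000, price_usd=100, iops=120)
newer = Drive("newer", capacity_gb=8_000, price_usd=200, iops=120)

for d in (older, newer):
    print(f"{d.name}: {d.gb_per_dollar:.0f} GB/$, {d.iops_per_gb:.3f} IOPS/GB")
```

Under these assumptions the newer drive doubles GB/$ while its IOPS/GB falls to a quarter, because the same head has to serve four times as much data.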
First item on Google's wish list is new form factors. Taller drives make a lot of sense to Google as such devices “... allow for more platters per disk, which adds capacity, and amortizes the costs of packaging, the printed circuit board, and the drive motor/actuator.” The paper also wonders if mixing platter sizes inside a disk could help: the smaller platters would offer poor GB/$ but excellent IOPS/GB, a trade-off that would be balanced by the presence of larger, higher-capacity platters.
Later in the paper Google also floats the idea that individual disks could mix shingled magnetic recording and conventional recording technologies, so that disks could get the best of both worlds. The result could be a disk suited to both fast writes for transactional workloads and archival storage. All within the same chassis.
Disks with more than one IO source are another idea Google wants realised. The paper imagines disks with more than one actuator arm, or one arm capable of reading more than one track at a time.
Google's third idea is “group disks”, pre-packaged clusters of disk that are nowhere near as smart as a NAS device but offer fine low-level control of disks. Each individual disk in the group might still have its own SATA or PCI-E interface, but Google thinks buying them in clumps with some shared components might be cheaper.
Lastly, on the form factor front, Google calls for a standard 12V DC power supply to electrify disks.
Google's next idea is to strip the cache out of disks and centralise it. “From a TCO perspective it makes more sense to move RAM caching from the disks to the host or tray, as a single big cache will be both cheaper and more effective,” the authors suggest.
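The “more effective” half of that claim can be illustrated with a toy model. The sketch below is not from the paper: it assumes a simple least-recently-used (LRU) policy, four disks with two cache blocks each, and a deliberately contrived workload in which one hot disk cycles over six blocks. It compares the per-disk caches with a single shared cache of the same total size (eight blocks).

```python
from collections import OrderedDict


class LRUCache:
    """A tiny least-recently-used cache; keys are whatever the caller passes in."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: OrderedDict = OrderedDict()

    def access(self, key) -> bool:
        """Return True on a hit; on a miss, admit the key and evict the oldest."""
        if key in self.entries:
            self.entries.move_to_end(key)
            return True
        self.entries[key] = None
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
        return False


def hit_rate_per_disk(workload, disks: int, blocks_per_cache: int) -> float:
    """Each disk has its own small cache and only sees its own blocks."""
    caches = [LRUCache(blocks_per_cache) for _ in range(disks)]
    hits = sum(caches[disk].access(block) for disk, block in workload)
    return hits / len(workload)


def hit_rate_shared(workload, total_blocks: int) -> float:
    """One host-level cache holds (disk, block) pairs for every disk."""
    cache = LRUCache(total_blocks)
    hits = sum(cache.access(key) for key in workload)
    return hits / len(workload)


# Assumed workload: disk 0 is hot and cycles over six blocks; disks 1-3 are idle.
workload = [(0, i % 6) for i in range(60)]
per_disk = hit_rate_per_disk(workload, disks=4, blocks_per_cache=2)
shared = hit_rate_shared(workload, total_blocks=8)
print(f"per-disk caches: {per_disk:.0%} hits, shared cache: {shared:.0%} hits")
```

In this contrived case the per-disk caches never hit, because the hot disk's two blocks of cache are too small for its working set while the idle disks' caches sit empty; the shared cache absorbs the whole working set after the first pass. Real workloads will be far less one-sided, and the sketch says nothing about the cost side of the argument.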
Plenty of suggestions concern how to make disks more efficient at the many background tasks they perform to protect data. Google seems to want those tasks made lower priority, so that the disk can spend more time doing productive I/O. Spreading data across multiple redundant disks picks up the slack on the data protection front.
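One way to picture that prioritisation is a request queue in which host I/O always jumps ahead of housekeeping. The following is a minimal sketch under that assumption; the class name, the two priority levels and the request names are all hypothetical and do not describe any real drive firmware or any interface proposed in the paper.

```python
import heapq
from itertools import count

HOST_IO = 0      # lower number is served first
BACKGROUND = 1   # housekeeping such as scrubbing (illustrative label)


class DiskQueue:
    """Hypothetical two-level priority queue for disk requests."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a priority level

    def submit(self, priority: int, name: str) -> None:
        heapq.heappush(self._heap, (priority, next(self._seq), name))

    def next_request(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = DiskQueue()
queue.submit(BACKGROUND, "scrub-1")
queue.submit(HOST_IO, "read-A")
queue.submit(BACKGROUND, "scrub-2")
queue.submit(HOST_IO, "write-B")

# Drain the queue: host requests come out first, then background work.
order = [queue.next_request() for _ in range(len(queue))]
print(order)
```

Both host requests are served before either scrub, even though a scrub arrived first. A strict scheme like this can starve background work under sustained load, so a real design would presumably need some bound on how long housekeeping can be deferred.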
There's also a call for disks to understand the kind of IO demands a host makes and to be able to change the stream of data they offer to suit the times. Google wants APIs to make this happen.
Interestingly, many of the considerations above concern spinning rust, the disk medium commonly assumed to be on the way out as solid state disk gets cheaper. Google says it thinks magnetic disks will be around for years, at least in its data centres, because “the growth rates in capacity/$ between disks and SSDs are relatively close (at least for SSDs that have sufficient numbers of program-erase cycles to use in data centers), so that cost will not change enough in the coming decade.”
Which probably means better wear-levelling and durability for SSDs are also on Google's wish list. ®