Facebook rolls out new web and database server designs

Old photos, like revenge, are best served cold


Open Compute 2013 Facebook, the company that founded the Open Compute Project to open up server, storage, rack, and data center designs, gave a sneak peek this week at some new machines it is working on and could donate to the OCP cause.

At the Open Compute Summit in Santa Clara this week, Frank Frankovsky, vice president of hardware design and supply chain at Facebook and chairman of the Open Compute effort, showed off a few server designs and talked up microserver standards the company has established to make it possible to mix and match different processor architectures on the same backplane and in the same chassis.

The first new server coming out of Facebook is code-named "Dragonstone" and, unlike some of the other designs shown off this week, its specs are already available on the Open Compute site.

According to Frankovsky, for certain database functions at Facebook, it was more important to have redundant power supplies for a database node than it was to have multiple compute nodes in an Open Compute V2 chassis sharing a single power supply. (This chassis and its related Xeon and Opteron server nodes were divulged in May 2012 and contributed to the Open Compute Project, and they are the servers that are in use in Facebook's data center in Forest City, North Carolina.)

The Open Compute V2 chassis uses server nodes called "Windmill" based on two-socket Xeon E5-2600 processors from Intel and another two-socketeer called "Watermark" that is based on the Opteron 6200 and now the Opteron 6300 processors from Advanced Micro Devices.

The mechanical drawing of the Dragonstone server

As you can see from the mechanical drawing above, the Dragonstone server has a two-socket server node on the left, a redundant power supply in the middle, and then space for 3.5-inch disk drives or flash storage from Fusion-io in a storage sled on the right.

This particular machine is based on the Intel Windmill board, and redundant power supplies from two different, er, suppliers – Power One and Delta – have been certified to fit on the middle tray and feed the server node and the storage. Fusion-io has come up with a 3.2TB flash storage card – it has ten flash modules and hooks into a PCI-Express 2.0 x8 slot – that is being used in Dragonstone. This card will be commercialized as the ioScale enterprise flash by Fusion-io, which has contributed the mechanical design of the card to Open Compute so others can implement it in OCP systems.

Frankovsky said that by doubling up the power supplies and making an Open Compute-style database server, Facebook was able to cut costs by 40 per cent compared with its current database servers. (He did not say what that prior database server was or whether it used flash or disk storage.) This Dragonstone server is being installed in Facebook's third data center, which is located in Lulea, Sweden.

The Winterfell server designed by Facebook will eventually be contributed to the Open Compute cause, but its specs are not yet available.

The three-node Winterfell server chassis from Facebook

The Winterfell machine is Facebook's latest Web server design and slides three x86 servers into the three bays of the Open Compute chassis. Not much more is known about it at this point, but clearly three servers in a 1.5U chassis is better than two.

When you have more than 1 billion users and make billions of dollars peddling ads to them, you not only have some unique needs but you can indulge in engineering your systems and data centers to specifically meet those needs. By doing so – as Facebook fully understands in ways that most companies do not and as Google and Amazon and a few others do – you control the experience that users have and the costs that you incur providing that experience.

We used to live in a world where there were those who could afford the high availability and high throughput of mainframes, while the rest of us had to cobble together networks of systems based on RISC/Unix – and later x86 – servers that mimicked as best they could some aspects of a mainframe.

We are now entering a world where some companies not only can indulge in custom engineering for their systems, data centers and software, but their very business demands it, while other companies will do the best they can with a mix of third party systems and application software and "engineered systems" with converged servers, storage, and networking.

Facebook's Dragonstone database server design

The rest of us get what we can afford, or we get whatever Facebook and its friends provide through the Open Compute Project if we can afford to indulge in custom servers.

Google has been generous with its software ideas, proving to the rest of the world that certain things could be done to manage big data and providing insights that have driven others to mimic its advances – without actually releasing its code to the world as open source. Google opens up the idea, but not the technology, and ditto for its own custom servers and data center designs.

Facebook has been generous with its Cassandra NoSQL data store as well as with the system and data center designs from Open Compute, and now third parties are starting to work with ODMs to make their own custom iron.

Jay Parikh, vice president of infrastructure at Facebook, talked quite a bit yesterday about the challenges the social network is having storing the more than 240 billion photos on the site, a trove that is growing by 350 million pictures per day.

That works out to an additional 7PB in the Facebook Photo data store every month. Obviously, you can't do that on an expensive storage area network, and it would be an economic challenge even on the bare-bones Open Compute servers and Open Vault storage arrays that Facebook has already designed and put into production.
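
As a rough back-of-the-envelope check on those figures (assuming a 30-day month; the implied per-photo footprint, which would cover every stored resolution of a picture, is our inference rather than a number Facebook gave):

```python
# Back-of-the-envelope check on the photo growth figures quoted above.
# Assumes ~30 days per month; the implied per-photo footprint is an
# inference from the article's numbers, not a figure Facebook stated.
photos_per_day = 350_000_000
monthly_growth_bytes = 7e15                  # 7PB of new photo storage per month
photos_per_month = photos_per_day * 30       # ~10.5 billion photos
bytes_per_photo = monthly_growth_bytes / photos_per_month
print(f"{photos_per_month / 1e9:.1f} billion photos per month")
print(f"~{bytes_per_photo / 1e3:.0f} kB stored per photo, across all resolutions")
# Output: 10.5 billion photos per month, ~667 kB stored per photo
```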

The rack and server design for Facebook's cold storage

If Facebook wants you to store all your photos on its site – and therefore have other people coming and looking at them, thus generating traffic and therefore ad money – then it can't ever lose a photo and it has to preserve the quick response time of its web farms. It cannot, as Parikh explained, just throw old photos out there on tape and tell you to come back in a day to see them.

What Facebook can do is use hierarchical storage management, albeit a homegrown variant that is, as you would expect from uber-nerds, kinda clever. The important thing to do first was to analyze its own data, and as you might expect, as photos age, they are accessed less.

In the Facebook pool, 82 per cent of photo-retrieval traffic hits only 8 per cent of the stored capacity. That means you don't have to keep the other 92 per cent of the photos in the cache and storage vaults that sit close to the web servers; you can put them in a different part of the data center on a different kind of storage server.
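
Here is a minimal sketch of the kind of tiering rule that access skew suggests – the thresholds, names, and data structures below are purely illustrative, not Facebook's actual policy:

```python
# Illustrative tiering rule based on the access skew described above:
# recently uploaded or frequently viewed photos stay on the warm tier
# near the web servers, everything else is a candidate for cold storage.
# Thresholds and names are hypothetical, not Facebook's real policy.
from dataclasses import dataclass

@dataclass
class PhotoStats:
    age_days: int          # days since upload
    views_last_30d: int    # recent retrieval count

def choose_tier(photo: PhotoStats,
                max_warm_age_days: int = 90,
                min_warm_views: int = 5) -> str:
    """Return 'warm' for the hot slice of capacity that serves most
    traffic, 'cold' for the long tail of rarely viewed photos."""
    if photo.age_days <= max_warm_age_days or photo.views_last_30d >= min_warm_views:
        return "warm"
    return "cold"

# Example: a three-year-old photo nobody has looked at lately goes cold.
print(choose_tier(PhotoStats(age_days=1100, views_last_30d=0)))   # cold
print(choose_tier(PhotoStats(age_days=12, views_last_30d=40)))    # warm
```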

The data center that Facebook has built to test out its ideas has 1EB (that's Exabyte) of capacity and 1.5 megawatts per room, and there is no redundant electrical system at all because Facebook is trying to cut back on power consumption for photo storage.

The cold storage service uses Reed-Solomon encoding and checksums, and spreads the bits that comprise a photo over multiple server nodes – and, here's the tricky bit, over only one drive per server. Although each server node has many disk drives, only one drive per machine is powered up at any time, and then only when the node is being accessed to fetch a specific photo.

The demand for old photos is so small and powering drives up and down is so fast that this causes only a slight delay over the network. And if a drive in the array of servers fails, the data can be reconstructed from the surviving shards using the Reed-Solomon code.
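
Here is a toy sketch of that erasure-coding idea. Real Reed-Solomon coding can survive several simultaneous failures; the single XOR parity shard below is the simplest special case, and every name in it is illustrative rather than anything from Facebook's actual cold storage code:

```python
# Toy sketch of erasure-coded photo storage in the spirit of the cold
# storage design described above. Real Reed-Solomon coding tolerates
# several simultaneous failures; the single XOR parity shard here is
# the simplest special case and is only meant to show the mechanics:
# a photo is split into shards, each shard lands on a different server
# (one drive per server), and a lost shard is rebuilt from the others.
# All names are hypothetical, not Facebook's actual code.

def split_into_shards(photo: bytes, k: int) -> list[bytes]:
    """Split a photo into k equal-length data shards (zero-padded)."""
    shard_len = -(-len(photo) // k)              # ceiling division
    padded = photo.ljust(shard_len * k, b"\x00")
    return [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]

def parity_shard(shards: list[bytes]) -> bytes:
    """XOR all shards together, byte by byte, to get one parity shard."""
    parity = bytearray(len(shards[0]))
    for shard in shards:
        for i, b in enumerate(shard):
            parity[i] ^= b
    return bytes(parity)

def reconstruct(surviving: list[bytes]) -> bytes:
    """Rebuild the one missing shard by XORing the survivors
    (remaining data shards plus the parity shard) together."""
    return parity_shard(surviving)

if __name__ == "__main__":
    photo = b"pretend this is a JPEG of someone's holiday snaps"
    data_shards = split_into_shards(photo, k=4)   # 4 servers, 1 drive each
    parity = parity_shard(data_shards)            # a 5th server holds parity

    # Simulate one server (drive) failing and rebuild its shard.
    lost_index = 2
    survivors = [s for i, s in enumerate(data_shards) if i != lost_index] + [parity]
    rebuilt = reconstruct(survivors)
    assert rebuilt == data_shards[lost_index]
```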

The resulting server, which has not yet been contributed to Open Compute but could be, has 2PB of storage per rack – eight times the storage density of Facebook's current storage servers – and burns only 2 kilowatts per rack because at any given time most of the disk drives are turned off. The server nodes have 10 Gigabit Ethernet links coming into them and the rack has a 40 Gigabit Ethernet pipe going back out to the main storage of the Facebook site, so once a photo is found it is piped out right quick to a web page.

The resulting setup provides storage at one-third the cost of the prior generation of Open Vault storage arrays, and the data center housing these cold storage racks is one-fifth the cost of the conventional data centers built by Facebook.

It looks like Facebook is willing to take the chance that one of these storage rooms could fail in its data center and that you won't gripe too much if it does as long as it comes back up and your photos are still there. That's probably a safe bet, considering what you pay to use Facebook. ®
