Facebook HipHop serves 70% more traffic on same hardware

PHP to C++ Diddy


When Facebook moved its servers to HipHop for PHP – the code transformer it built to convert PHP into optimized C++ – the company's average CPU usage dropped by 50 per cent. And after six months of additional engineering, the tool was about 1.8 times faster.

Now, after another six months, the company says, it has improved performance another 1.7 times. But this is more than just self-congratulation. The project is open source, and it has been open since Facebook first switched its servers to HipHop in February 2010.

Unlike Google, you see, Facebook has been known to promptly open-source some of the most important pieces of its back-end infrastructure.

With a Wednesday blog post, Facebook research scientist Xin Qi charts HipHop's relative throughput improvement over the last six months:

Facebook Hip Hop performance improvement

The end result, he says, is that HipHop can handle about 70 per cent more traffic on the same hardware infrastructure.

After transforming PHP into C++, HipHop compiles the code and builds binary files with the GNU C++ compiler, aka g++. The idea is that you can still code with high-level PHP, but then get the performance of C++ – though it does give up certain "rarely used" PHP features.

According to Qi, Facebook and the open source community have juiced the tool in several different ways. HipHop uses a version of the Alternative PHP Cache (APC), and engineers have stripped most of the serialization and unserialization operations. "Semantically, an object is serialized and unserialized when it is stored into and fetched from APC. However, serialization and unserialization are costly operations, commonly dominating the cost of APC data fetching itself," Qi says.

"Thus, we reworked the APC implementation in HipHop, getting rid of almost all the serialization/unserialization operations, while keeping the semantics equivalent to before."

But some serialization is still required, and this has been fine tuned. "Objects still need to be serialized or JSON-encoded in order to transfer them over the wire. To make things faster, we optimized various aspects of these operations, including UTF8/UTF16 conversions, object property accesses, number parsing, and so forth."

Facebook has also reduced the size of the binary code, improved memory allocation, and made several changes to the compiler. "Several phases in the compiler, including parsing, optimization, and code generation, are now parallelized. Hyves contributed changes to the generated C++ code to make it compile faster without losing any run-time efficiency," Qi says.

His crew can build a more than 1GB binary in about 15 minutes (after stripping out debug information). "Although faster compilation does not directly contribute to run-time efficiency," Qi says "it helps make the deployment process better."

The likes of Drupal, MediaWiki, and WordPress are now using HipHop. No one is using BigTable or the Second Coming of the Google File System. Except for Google. ®

Similar topics


Other stories you might like

  • AI tool finds hundreds of genes related to human motor neuron disease

    Breakthrough could lead to development of drugs to target illness

    A machine-learning algorithm has helped scientists find 690 human genes associated with a higher risk of developing motor neuron disease, according to research published in Cell this week.

    Neuronal cells in the central nervous system and brain break down and die in people with motor neuron disease, like amyotrophic lateral sclerosis (ALS) more commonly known as Lou Gehrig's disease, named after the baseball player who developed it. They lose control over their bodies, and as the disease progresses patients become completely paralyzed. There is currently no verified cure for ALS.

    Motor neuron disease typically affects people in old age and its causes are unknown. Johnathan Cooper-Knock, a clinical lecturer at the University of Sheffield in England and leader of Project MinE, an ambitious effort to perform whole genome sequencing of ALS, believes that understanding how genes affect cellular function could help scientists develop new drugs to treat the disease.

    Continue reading
  • Need to prioritize security bug patches? Don't forget to scan Twitter as well as use CVSS scores

    Exploit, vulnerability discussion online can offer useful signals

    Organizations looking to minimize exposure to exploitable software should scan Twitter for mentions of security bugs as well as use the Common Vulnerability Scoring System or CVSS, Kenna Security argues.

    Better still is prioritizing the repair of vulnerabilities for which exploit code is available, if that information is known.

    CVSS is a framework for rating the severity of software vulnerabilities (identified using CVE, or Common Vulnerability Enumeration, numbers), on a scale from 1 (least severe) to 10 (most severe). It's overseen by First.org, a US-based, non-profit computer security organization.

    Continue reading
  • Sniff those Ukrainian emails a little more carefully, advises Uncle Sam in wake of Belarusian digital vandalism

    NotPetya started over there, don't forget

    US companies should be on the lookout for security nasties from Ukrainian partners following the digital graffiti and malware attack launched against Ukraine by Belarus, the CISA has warned.

    In a statement issued on Tuesday, the Cybersecurity and Infrastructure Security Agency said it "strongly urges leaders and network defenders to be on alert for malicious cyber activity," having issued a checklist [PDF] of recommended actions to take.

    "If working with Ukrainian organizations, take extra care to monitor, inspect, and isolate traffic from those organizations; closely review access controls for that traffic," added CISA, which also advised reviewing backups and disaster recovery drills.

    Continue reading

Biting the hand that feeds IT © 1998–2022