Opscode guts Chef control freak to scale it to 10,000 servers

Facebook likes – and uses – Chef, just like Amazon and Google

Opscode is in a race with Puppet Labs to supply the next-generation management tool, and its latest Chef product, which does configuration, change, and cloud management, is used by some of the name-brand hyperscale cloud application operators out there. As part of the launch of the Chef 11 tool, Facebook is outing itself as a customer, joining the ranks of Amazon and Google, and tens of thousands of other IT shops of all shapes and sizes, which already use the code.

"Chef was the only automation solution flexible enough to bend to our scale dynamics without requiring us to change our workflow," Phil Dibowitz, production engineer at Facebook, is quoted as saying in the Chef 11 presentation that El Reg was given as part of its briefing on the new tools. "Private Chef's basis on open-source Chef also aligns with our own open philosophy allowing us to contribute back to the greater Chef community."

That's a pretty big endorsement, and in fact, Christopher Brown, CTO at Opscode, tells El Reg that the demands of Facebook and other large-scale Web operators are why the techies at Opscode changed the back-end system and database to make Chef 11 a lot more scalable than the prior release.

"Chef has been rebuilt from the ground up with Chef 11," explains Brown, who was brought in from Amazon expressly to do this reconstruction.

Brown was formerly the architect and lead developer of Amazon Web Services' foundational EC2 compute cloud and did a stint as Microsoft's director of engineering for edge computing networks. As the CTO at Opscode, Brown works alongside Adam Jacob, the creator of Chef and chief customer officer at the company he co-founded more out of frustration with existing physical and virtual infrastructure management tools than out of a desire to start (another) business.

Chef uses what are called recipes to configure machines and cookbooks to manage a server's entire software stack. Unlike any good cook, it follows the recipe religiously every time without changes. Recipes can also be shared, which passes the knowledge of how to configure a specific stack for a specific set of iron to other users of Chef, even outside the company walls.

Perhaps they should have called it Anti-Chef, because most real chefs mess with recipes and many don't share their secrets. But I digress. (Less than I used to, at least.)
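To make the "follows the recipe religiously" idea concrete, here is a minimal Python sketch of a recipe as a declared end state that is applied the same way on every run. It is an illustration only and is not Chef's own recipe language (real Chef recipes are written in a Ruby-based DSL). The `Node` class, the `apply_recipe` function, and the package and file names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a managed server."""
    packages: set = field(default_factory=set)
    files: dict = field(default_factory=dict)


def apply_recipe(node, recipe):
    """Bring the node to the declared state and return the changes made."""
    changes = []
    for package in recipe.get("packages", []):
        # Only act when the node differs from the declared state.
        if package not in node.packages:
            node.packages.add(package)
            changes.append(f"install {package}")
    for path, content in recipe.get("files", {}).items():
        if node.files.get(path) != content:
            node.files[path] = content
            changes.append(f"write {path}")
    return changes


# Hypothetical recipe: the desired state of a small web server.
web_recipe = {
    "packages": ["nginx"],
    "files": {"/etc/nginx/nginx.conf": "worker_processes 4;\n"},
}

node = Node()
first_run = apply_recipe(node, web_recipe)   # node starts empty, so changes are made
second_run = apply_recipe(node, web_recipe)  # node already matches, so nothing to do
print(first_run)
print(second_run)
```

Because the recipe describes the end state rather than a list of one-off commands, the same recipe can be handed to another admin and run against another machine with the same result.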

Chef comes in three flavors. Opscode Chef is the open source code you can download and use. Private Chef is the enterprise edition with some extra features and tech support services behind it.

Then there's Hosted Chef, a version of Private Chef that Opscode runs on your behalf. Not accidentally, it is used by a large number of customers at the same time, which allows Opscode to test scalability limits and other features in real time before finalizing them in each release. Facebook is using Private Chef, but it is benefiting from some of the work that was done in Hosted Chef to boost scalability.

The first thing to change with Chef 11 was the back-end data store, which has been running on CouchDB. That NoSQL data store is an Apache-licensed open source database that is coded in Erlang and that was created by Damien Katz, who worked on the Lotus Notes/Domino team at IBM.

The company looked at the Riak distributed database from Basho and the Cassandra NoSQL data store created by Facebook as possible back ends, but oddly enough Opscode is moving away from NoSQL and towards real SQL, and has chosen the PostgreSQL relational database management system as the new back end.

"We took a look at a number of different data stores, and the modeling in a relational database was a better fit," says Brown. The back-end database has the open source Solr search engine bolted on to make elements of recipes searchable.

The Chef 10 server was written in Ruby with the Unicorn web server and the Merb framework, but this time around Chef 11 is written in Erlang, which Brown said was "highly concurrent and reliable" and hence a good choice.

CouchDB is also written in Erlang, and incidentally, with Couchbase Server, which is a derivative product done by a company called Couchbase where Katz now works, the NoSQL data store was ported from Erlang to C.

Anyway, the Erlang-based API stack at the heart of Chef 11 has an order of magnitude reduction in memory footprint compared to the Ruby version in Chef 10.

The upshot of all of these changes is that Chef 11 can manage up to 10,000 nodes from a single server, which is a factor of four more than the Chef 10 server could handle.

And the "Omnibus" installer, which was only available in the Private Chef enterprise edition or its Hosted Chef variant, is now much improved and available in the open source version of Chef. This installer can put Chef agents on Windows and Linux servers quickly as well as on AWS, Rackspace Cloud, Google Compute Engine, Microsoft Azure, and other cloudy infrastructure. Opscode is now using the Pedant Test Suite to ensure that Chef works well against seven different Windows Server variants.

The new Opscode system control freak also has better change modeling, which allows you to see the effects of changes on the infrastructure before you cascade them over the physical and virtual servers.

In fact, Opscode is so confident in the open source Chef 11 version of the tool and in its services organization's ability to handle a volume of calls that it will now provide tech support services on the open source Chef for the first time, along with the commercially supported Private Chef and Hosted Chef variants. Standard business hour support for the open source Chef 11 costs $3 per node per month, and premium 24x7 support costs $3.75 per node per month.

Private Chef and its hosted variant continue to have some goodies not in the open source version, since Opscode has to make some money somehow to appease its investors. These include a graphical user interface (completely rewritten for the 11 release) to visualize and navigate those 10,000 nodes under management and an activity reporting dashboard to show historical and current data for nodes under management.

The Private and Hosted Chefs also have an on-demand command execution function called Push that, as the name suggests, allows admins to edit and execute code in real time on systems, and to do so on thousands of nodes at the same time if need be.

Or you might use Push to schedule compliance reporting or log polling for systems on portions of a cluster on a rolling basis rather than all at once. Private Chef also has role-based access controls and multi-tenancy to allow multiple system admins to manage nodes from the same Chef server.

With the Chef 11 launch, Opscode is shifting away from perpetual licensing to subscription pricing that is consistent with the support pricing it now has on the open source variant. Both Private Chef and Hosted Chef cost $6 per node under management per month.
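As a rough worked example of what those per-node prices mean at the 10,000-node scale Chef 11 is aimed at, the Python sketch below multiplies out the list prices quoted in this article. It assumes flat pricing with no volume discounts or minimums, which the article does not address; the tier names and helper functions are made up for illustration.

```python
# List prices in US dollars per node per month, as quoted in the article.
# The tier names are hypothetical labels, not Opscode product names.
PRICE_PER_NODE_PER_MONTH = {
    "open_source_standard_support": 3.00,
    "open_source_premium_support": 3.75,
    "private_or_hosted_chef": 6.00,
}


def monthly_cost(tier, nodes):
    """Flat per-node cost for one month (assumes no volume discounts)."""
    return PRICE_PER_NODE_PER_MONTH[tier] * nodes


def annual_cost(tier, nodes):
    """Twelve months at the same flat rate."""
    return monthly_cost(tier, nodes) * 12


nodes = 10_000
for tier in PRICE_PER_NODE_PER_MONTH:
    print(f"{tier}: ${monthly_cost(tier, nodes):,.0f} per month, "
          f"${annual_cost(tier, nodes):,.0f} per year")
```

On these assumptions, a shop running a full 10,000 nodes would pay $30,000 a month for standard support on open source Chef and $60,000 a month for Private Chef or Hosted Chef.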

One of the main rivals that Opscode has among vendors of new management tools designed for the hyperscale era is Puppet Labs, which just took in a $30m bag of cash from VMware.

Puppet Labs has raised $45.5m in four rounds of funding, and Opscode has raised $33m in three rounds of funding from Ignition Partners, Battery Ventures, and Draper Fisher Jurvetson. Both are getting traction because they are designed specifically for modern hyperscale infrastructure.

"We're seeing enterprises adopt hyperscale infrastructure, but there are some changes that have to take place," explains Jay Wampold, director of marketing at Opscode. "They have to change their tooling, of course, but we are also seeing IT shift its focus from being a back-office function to being a front-office function to deliver services to customers and users."

As for rival Puppet Labs getting a big bag of cash from VMware, Opscode is not worried. "I think it is great for the space and it is a great validation for the next generation of management tools," says Wampold.

On this, VMware and Puppet Labs would no doubt agree. ®

Biting the hand that feeds IT © 1998–2022