Infrastructure as code is a buzzword frequently thrown out alongside DevOps and continuous integration as the modern way of doing things. Proponents cite benefits ranging from an amorphous "agility" to reducing the time to deploy new workloads. My argument for infrastructure as code boils down to "cover your ass", and I have discovered it's not as difficult as we might think.
Recently, a client of mine went through an ownership change. The new owners, appalled at how much was being spent on IT, decided that the best path forward was an external audit. The client in question, of course, is an SMB that had been massively under-spending on IT for 15 years, and there was no way they were ready for, or would pass, an audit.
Trying to cram eight months' worth of migrations, consolidations, R&D, application replacement and so forth into four frantic, sleepless nights of panic ended how you might imagine it ending. The techies focused on making sure their asses were covered when the audit landed. Overall network performance slowed to a crawl and everyone went home angry.
Why desired state configurations matter
None of this is particularly surprising. When you have an environment where each workload is a pet, change is slow, difficult, and requires a lot of testing. Reverting changes is equally tedious, and so a lot of planning goes into making sure that any given change won't cascade and cause knock-on effects elsewhere.
In the real world this is the result of two unfortunate aspects of human nature. First: everyone hates doing documentation, so it's highly unlikely that in an unstructured environment every change since the last refresh was documented. Second: there are few things more permanent than a temporary fix.
When you don't have the budget for the right hardware, software or services you make do. When something doesn't work you "innovate" a solution. When that breaks something, you patch it. You move from one problem to the next, and if you’re not careful, you end up with something so fragile that if you breathe on it, it falls over. At this point, you burn it all down and restart from scratch.
This approach to IT is fine if you have 5, 10 or even 50 workloads. A single techie can reasonably be expected to keep all of that in their head, know their network and solve any problems they encounter. Unfortunately, only the smallest of shops run as few as 50 workloads today. Everyone else is juggling too many workloads to be playing the pets game any more.
Most of us use some form of desired state solution already. Desired state solutions typically involve an OS agent that gets a config from a centralized location and applies the relevant configuration to the operating system and/or applications. Microsoft's Group Policy can be considered a really primitive version of this, with System Center being a more powerful but miserable-to-use example. The modern, friendlier tools are Puppet, Chef, SaltStack, Ansible and the like.
Once you have desired state configs in place, you're no longer beating individual workloads into shape, or checking them manually for deviation from design. If everything does what it says on the tin, configurations are applied and errors are thrown if they can't be. Usually there is some form of analysis software to report which workloads are out of compliance, and how many. This is a big step forward.
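To make the idea concrete, here is a minimal sketch of the loop a desired state agent runs: compare what is configured against what should be configured, fix what it can, and report what it can't. It is not how Puppet, Chef or any other named tool is implemented. The `Workload` class, the flat key/value config and the `locked` parameter are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    """A hypothetical workload with a flat key/value configuration."""
    name: str
    actual: dict = field(default_factory=dict)


def find_drift(desired: dict, actual: dict) -> dict:
    """Return {key: (desired_value, actual_value)} for each deviating setting."""
    return {
        key: (want, actual.get(key))
        for key, want in desired.items()
        if actual.get(key) != want
    }


def converge(workload: Workload, desired: dict,
             locked: frozenset = frozenset()) -> list:
    """Apply the desired settings; return the keys that could not be applied.

    `locked` is an illustrative stand-in for settings the agent is unable
    to change on this workload.
    """
    errors = []
    for key, (want, _have) in find_drift(desired, workload.actual).items():
        if key in locked:
            errors.append(key)          # report the failure, don't hide it
        else:
            workload.actual[key] = want
    return errors


def compliance_report(workloads: list, desired: dict) -> dict:
    """Count compliant workloads and list what is wrong with the rest."""
    out_of_compliance = {}
    for w in workloads:
        drift = find_drift(desired, w.actual)
        if drift:
            out_of_compliance[w.name] = sorted(drift)
    return {
        "total": len(workloads),
        "compliant": len(workloads) - len(out_of_compliance),
        "out_of_compliance": out_of_compliance,
    }
```

Calling `converge()` on each workload and then `compliance_report()` on the whole fleet gives you the "how many of what" summary in one dictionary. Real tools add scheduling, dependency ordering and much richer resource types on top of this basic compare-and-apply cycle.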
Having the ability to centralize some or all of your IT configuration is only the start of covering your backside. Desired state config tools, on their own, only tell you that your workloads are behaving according to the configs supplied. They don't explain why your configs are what they are.
With luck, when we architect our networks, everything makes sense. There is a self-evident reason why everything is designed the way it is. On paper, it's logical and rational: the sort of thing you'd have no problem standing up in front of a judge and claiming ownership of.
Networks rarely stay in that pristine state for long. Change is a constant in life, and this is where versioning tools come in. If you use text-based desired state config solutions like Puppet then Git is your best friend. Set it up right, and every change made is recorded. How you got from your pristine design document to the series of compromises you ended up with can be traced back. If you're smart, you added some comments along the way to explain why each compromise was made.
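What versioning buys you can be sketched in a few lines. The following is an in-memory stand-in for the record Git keeps, not a replacement for Git and not how Git works internally. The `ConfigHistory` class and its method names are hypothetical, invented for this illustration; the only real library used is Python's standard `difflib`.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    """One recorded change: the full config text plus the reason for it."""
    content: str
    reason: str


class ConfigHistory:
    """Append-only history of a text config (illustrative, in memory only)."""

    def __init__(self, initial: str, reason: str = "initial design"):
        self._revisions = [Revision(initial, reason)]

    def commit(self, content: str, reason: str) -> int:
        """Record a new revision; a reason is mandatory, like a commit message."""
        if not reason.strip():
            raise ValueError("every change needs a reason")
        self._revisions.append(Revision(content, reason))
        return len(self._revisions) - 1

    def diff(self, old: int, new: int) -> str:
        """Unified diff between two revisions, identified by index."""
        a = self._revisions[old].content.splitlines(keepends=True)
        b = self._revisions[new].content.splitlines(keepends=True)
        return "".join(difflib.unified_diff(a, b, f"rev{old}", f"rev{new}"))

    def log(self) -> list:
        """(index, reason) pairs, oldest first: the trail from design to today."""
        return [(i, r.reason) for i, r in enumerate(self._revisions)]
```

The point of forcing a reason on every `commit()` is the same as the point of a good commit message: when the auditor asks why a setting is what it is, `log()` and `diff()` answer the question without anyone having to remember.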
Somewhat useful if you get audited.
Infrastructure as code
Full-blown infrastructure as code moves beyond this. Data storage gets separated from the environment and applications. In many cases one no longer has to wrap applications up in a protective VM shield, but can let them operate in lightweight, feature-poor containers.
More importantly, a proper infrastructure as code implementation could spin up a complete copy of a data centre from only the original installers and backups of the data. Trigger a script, and a VM or container is provisioned. An operating system, agents, applications and configurations are injected into the new environment. Data storage is provisioned and attached. A workload is born!
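The sequence just described can be sketched as an ordered pipeline driven by a single spec. This is a toy model under stated assumptions: nothing here talks to a real hypervisor, container runtime or storage array, and the `Environment` class, the function names and the spec keys are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """A hypothetical record of a workload as it is being built."""
    name: str
    kind: str                                   # "vm" or "container"
    layers: list = field(default_factory=list)  # what was injected, in order
    volumes: list = field(default_factory=list)


def provision(name: str, kind: str) -> Environment:
    """Step 1: create the empty VM or container."""
    if kind not in ("vm", "container"):
        raise ValueError(f"unknown environment kind: {kind}")
    return Environment(name, kind)


def inject(env: Environment, layer: str) -> None:
    """Step 2: push an OS, agent, application or config into the environment."""
    env.layers.append(layer)


def attach_storage(env: Environment, volume: str) -> None:
    """Step 3: attach separately provisioned data storage."""
    env.volumes.append(volume)


def build_workload(spec: dict) -> Environment:
    """Run every step in a fixed order, driven entirely by the spec."""
    env = provision(spec["name"], spec["kind"])
    for layer in ("os", "agents", "applications", "configuration"):
        inject(env, f"{layer}:{spec[layer]}")
    for volume in spec.get("volumes", []):
        attach_storage(env, volume)
    return env
```

Because `build_workload()` takes nothing but the spec, the spec itself becomes the thing you version, review and hand to an auditor. Rebuilding the data centre is then a matter of replaying the specs against installers and data backups.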
Infrastructure as code requires having infrastructure that can be addressed by code. Today, that usually means REST APIs, but older solutions are still in use. For physical hosts, this means baseboard management controllers that allow automation of firmware updates and injection of hypervisors and microvisors. The hypervisors, microvisors, storage and networking all need management layers that can be addressed by code. And you're going to need something that hoovers up logs and alerts for when things go sideways.
Setting all that up for a single workload seems like, and is, a lot of work. It makes a lot more sense when we are talking about hundreds or thousands of workloads. It also makes responding to audits easier: with the right tools, procedures and practices in place, audits can be as simple as pulling reports from existing monitoring and analytics software.
So infrastructure as code isn't just some mumbo jumbo about lowering costs or getting workloads out faster. Far more practically, it's covering your ass in real time, ready for whatever might - just might - happen next.
This article is sponsored by HPE.