How to automate your way out of cloud shock

CAST AI autoscaling and AI-based provisioning radically reduce bills


Sponsored

Technology is supposed to make everything easier, but often doesn't. The internet was meant to be a democratizing force, but now large swathes of the population spend their hours on Facebook worrying why no one liked their cat memes. The smartphone was supposed to make life more convenient, and it did… for companies to track us and get us hooked on Candy Crush.

Technology isn't the problem. It's us. We embrace it without understanding how to use it properly. The same goes for the cloud. Vendors preached the benefits of more efficient resource allocation. No idle CPU cycles, they said. Pay only for what you need and move everything from capex to opex, they said. It'll be easy, they said.

A lightning shock from the cloud

In reality, the cloud is problematic for many companies. Last year, Flexera found that nearly a third of all cloud expenditure evaporated thanks to inefficiencies. As we reported at the time, the cloud costs companies more than they expect when they get there. This phenomenon is called cloud shock, and it arrives with the first monthly bill.

In practice, the magical cost savings take some work to accomplish - and dare we say it, the big three service providers have little incentive to make it easy. Companies must know what they're doing to use the cloud efficiently. Key technical decisions are pushed down to individual engineers who are left holding the bag.

If engineers don't configure workloads properly, companies end up provisioning them the same way that they did under their old on-premises practices. It's like the infrastructure upgrade arrived, but the working model never got updated. This leaves IT teams building in more capacity than they need, which sits idle and sucks up their budget.

Cloud service providers respond with reserved instance pricing. Reserved instances are an alternative to the on-demand instances that you can rent by the hour as you need them.

You set aside reserved instances in advance, committing the money up front. In return, the service provider will give you a discount. The downside is that this locks you into a contract, which isn't flexible. It's also incompatible with cloud economics, which constantly pushes pricing down. It penalizes you for your inability to configure your cloud workloads properly.
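To see why that lock-in matters, here is a rough sketch of the break-even arithmetic in Python. The hourly rate and the 40 percent discount are made-up figures for illustration, not any provider's real prices, and the function names are ours. It assumes a reserved instance is billed for every hour of the term whether you use it or not.

```python
def on_demand_cost(hourly_rate, hours_used):
    """Pay only for the hours the instance actually runs."""
    return hourly_rate * hours_used


def reserved_cost(hourly_rate, discount, hours_in_term):
    """Pay a discounted rate for every hour of the term, used or not."""
    return hourly_rate * (1 - discount) * hours_in_term


def break_even_utilisation(discount):
    """Fraction of the term an instance must run before reserving wins."""
    return 1 - discount


# Hypothetical figures: $0.10 per hour on demand, 40% reserved discount.
rate, discount, term_hours = 0.10, 0.40, 8760  # 8760 hours in a year

busy = on_demand_cost(rate, 0.9 * term_hours)   # instance runs 90% of the year
quiet = on_demand_cost(rate, 0.3 * term_hours)  # instance runs 30% of the year
committed = reserved_cost(rate, discount, term_hours)
```

With these invented numbers the commitment only pays off if the instance runs more than 60 percent of the time. A workload that was over-provisioned in the first place may never get there.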

There's another problem plaguing cloud customers: Not all clouds are equal. Three large cloud service providers (CSPs) own 59 percent of the market and are eager to grab each other's share, which means differentiating themselves somehow.

So, each CSP does things differently. For example, AWS, Azure and Google handle auto-scaling in very different ways.

That creates a massive learning curve for developers who want to learn the intricacies of each environment. This skills problem further hampers customers' ability to use resources efficiently, especially if you're considering a multi-cloud solution.

You can't abstract it all away

Abstraction might one day solve this problem, but there's a lot of work to do. Abstraction is technology's governing principle. Technology tends to hide complexity from users. We see it in programming (think machine code -> assembly -> C -> Python -> low-code). Now it's happening in infrastructure, which initially abstracted software from hardware with virtual machines and is now refining the concept with containers.

Kubernetes aims to take the next step, abstracting the management of container-based workloads from the cloud infrastructure underneath. It uses pods that you can replicate to scale your software services. This system might solve the auto-scaling problem by managing it all automatically, scaling servers-as-cattle en masse. All developers will need to do is learn the same cloud-native concepts, which they can port to any cloud infrastructure.
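For a flavour of what replicating pods to scale means in practice, the sketch below applies the basic proportional rule described in the Kubernetes Horizontal Pod Autoscaler documentation: multiply the current replica count by the ratio of the observed metric to the target, and round up. The function name, the min/max defaults and the sample numbers are our own assumptions, and the real controller adds tolerances and stabilization windows that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Scale the pod count in proportion to how far the metric is from target."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (illustrative defaults, not Kubernetes').
    return max(min_replicas, min(max_replicas, raw))


# Four pods averaging 90% CPU against a 60% target: scale out.
scale_out = desired_replicas(4, 90, 60)
# The same four pods averaging 30% CPU: scale in.
scale_in = desired_replicas(4, 30, 60)
```

The same rule works in both directions, which is the point: capacity follows the measured load rather than a guess made in advance.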

Today, though, Kubernetes admins have some challenges. First, not all implementations are exactly the same. For example, Amazon has the Elastic Kubernetes Service, Microsoft has its Azure Kubernetes Service, and Google (which invented Kubernetes) has its own Kubernetes Engine in the cloud. They're similar, but they come with different ecosystems and plugins.

People also still tend to overprovision Kubernetes because they don't want the service disruption involved in the short time it takes to spin up another container. So they'll just provision extra ones and have them sitting idle.

Serverless? Sorry.

If you thought serverless would solve the scalability and over-provisioning problem, hold that thought. Serverless functions seem like a good way to only use computing power when you need it. On the surface, they don't use containers or VMs at all. Instead, they keep your code handy and just run it when triggered by an event.

If a customer is using functions infrequently, the cost savings are there, because the charging model is fractional - you don't pay for running CPUs and VMs. However, if you move an entire application to functions and consume just more than one VM's worth of compute, you pay a premium. What cloud operators give you with one hand, they take back with the other, and that premium bites harder the more you scale. Nor are serverless environments interoperable. Each service provider still has its own walled garden, serverless or not.
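A toy model makes the crossover visible. Both prices below are hypothetical, chosen only so that a VM-hour's worth of function compute costs three times an always-on VM-hour. Real pricing depends on memory, invocations and provider, so treat this as a sketch of the shape of the curve and not a calculator.

```python
import math

VM_HOURLY = 0.10         # hypothetical price of one always-on VM per hour
FUNCTION_VM_HOUR = 0.30  # hypothetical price of one VM-hour's worth of function compute


def vm_bill(load, hours):
    """Whole VMs running all the time. `load` is work in VM-equivalents."""
    return math.ceil(load) * VM_HOURLY * hours


def function_bill(load, hours):
    """Functions bill only for work done, but at a higher unit price."""
    return load * FUNCTION_VM_HOUR * hours


month = 720  # hours
light = (function_bill(0.1, month), vm_bill(0.1, month))  # occasional use
heavy = (function_bill(1.2, month), vm_bill(1.2, month))  # just over one VM
```

Under these assumptions the occasional workload is far cheaper on functions, the workload just over one VM already costs more on functions than on two whole VMs, and the gap widens linearly as load grows.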

Faced with all these challenges, automating your Kubernetes environment is probably your best protection against cloud shock and incompatible environments. The FinOps Foundation certainly thinks so. The nonprofit organization, which is part of the Linux Foundation, studies financial operations to reduce cloud spend, and it's a big advocate of automation for tasks like provisioning.

The FinOps Foundation just noted that few companies are automating in practice. More than four in ten of the people it surveyed were cloud 'crawlers' who were just getting the basics in place. Another 41 percent were established but needed more maturity, while only 15 percent considered themselves evolving and mature. Nearly half of all respondents had little-to-no automation in place, and only 18 percent automated infrastructure changes.

Time to automate

So there's lots of headroom for improvement, but how can you do it? Leon Kuperman, CTO of CAST AI, has some ideas. The company's cloud optimization tool handles the provisioning and auto-scaling of Kubernetes pods for you, negotiating the resources in real time with the cloud service provider on your behalf.

"Granular auto-scaling lets us be very crisp with the actual resources that are required, dictated by the application and the actual traffic or workloads you're serving, rather than by a predetermined DevOps formula," he says.

The company also has what Kuperman calls "sub-optimizations". It looks for spot instances, which are time-sensitive instances provided from the service provider's excess computing capacity. You can save up to 90 percent in costs by using an AWS EC2 spot instance compared to on-demand pricing. CAST AI also supports GCP Preemptible instances, and Azure Spot instances are coming this quarter.

The problem is that these instances are ephemeral; the service provider can interrupt service, yanking a spot instance to serve a higher-paying customer in a process politely known as 'pre-emption'. "That's why most customers don't use them," points out Kuperman.

CAST AI solves this problem by running multiple Kubernetes pods at once based on spot instances. If the service provider pulls a spot instance, the system keeps on running, and CAST's service keeps an extra pod or two running to avoid service disruption.

Using spot instances already saves the customer money, but CAST AI has another trick up its sleeve: AI-based provisioning. "By capturing all of the pre-emptions across our platform across all customers, we're able to build this unified machine learning model that predicts in advance when an instance is likely to get pre-empted," Kuperman says. "When it's likely over a certain threshold, we'll go in and proactively provision the needed capacity, solving the problem before it occurs."
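The decision step Kuperman describes can be pictured as a simple threshold check. The sketch below is not CAST AI's code: the prediction model is not shown at all, the node names and probabilities are invented, and the 0.7 threshold is an arbitrary placeholder.

```python
def nodes_to_replace(preemption_risk, threshold=0.7):
    """Return the spot nodes whose predicted pre-emption risk exceeds the threshold.

    `preemption_risk` maps a node name to a probability produced by some
    prediction model (hypothetical, not shown here).
    """
    return sorted(node for node, risk in preemption_risk.items() if risk > threshold)


def replacement_count(preemption_risk, threshold=0.7):
    """How many replacement nodes to start before the risky ones disappear."""
    return len(nodes_to_replace(preemption_risk, threshold))


# Invented example predictions for three spot nodes.
risk = {"spot-a": 0.20, "spot-b": 0.85, "spot-c": 0.71}
at_risk = nodes_to_replace(risk)
```

Raising the threshold means fewer speculative replacements but more chance of being caught out; lowering it buys safety with extra capacity.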

The company refines that process with models that take cyclical workloads into account. Some workloads have a level of predictability that helps it hone its predictions. That kind of just-in-time provisioning is more efficient than just throwing a bunch of on-demand instances at a workload based on back-of-an-envelope calculations. The company claims some impressive results from this automation.

Kuperman says it can save around 50 percent of the core computing cost on an average monolithic Docker-based application. Move to microservices, and it's more like 75-80 percent, he says. The company is now launching a cost estimate tool powered by agents that it installs in your existing Kubernetes environment. The software will recommend the best instance types for you and conduct what-if analyses for automatic scaling scenarios.

Solving the high availability problem

CAST AI can also support multi-cloud operations by abstracting different cloud service providers' idiosyncrasies, hiding them from the customer's Kubernetes implementation. It runs active-active instances across multiple clouds, giving customers hot fail-over capability. "With the advent of distributed database technologies, we can now have fully active-active capabilities," Kuperman says. "I can have a whole cloud go down and my application wouldn't miss a heartbeat."

With an on-premises option for hybrid cloud users in the works, CAST AI is on a mission to help you make your cloud operations more efficient wherever they're located. Perhaps with a little help, the cloud community can get those dire-looking FinOps automation numbers up - and bring some budget down. After all, we reckon Jeff Bezos has made enough money already this year.

Sponsored by CAST AI


Biting the hand that feeds IT © 1998–2021