Why cloud costs get out of control: Too much lift and shift, and pricing that is 'screwy and broken'
The Reg talks to the experts about how to manage spend
Feature Spinning up services on public clouds is dead easy, but what about staying in control of the bill?
Organisations are "over budget for cloud spend by an average of 23 per cent, and expect cloud spend to increase by 47 per cent next year," according to a "State of the cloud 2020" report by Flexera, based on a survey of 750 technical professionals.
As if that weren't bad enough, respondents self-estimate that 30 per cent of cloud spend is wasted. COVID-19 has, if anything, made the problem worse, with most respondents saying the pandemic has increased planned cloud usage.
Adrian Bradley is a CIO Advisory director at KPMG advising customers on cloud cost management. "Our clients find three things going wrong," he told The Register. "The first is that it costs them more than they anticipated to get to the cloud. Second, when they get to the cloud they find that they're spending more than they expected, and quite often more than they historically spent. Third, they don't feel they're getting the value from that spend."
The biggest problem, said Bradley, is that organisations "make a lot of compromises" moving to the cloud because the level of digital transformation needed to get the full benefit is not there.
In other words, too much lift and shift. "Enterprises have not made that choice because they're lazy, but because that is what was affordable," said Bradley. "The key thing is, if you do have to lift and shift, don't stop there. It's a new variant on the technical debt story."
The reason this is more expensive is that more applications end up running on virtual machines rather than taking advantage of pay-as-you-go services. "They don't get the economies that are inherent in the utility of cloud," said Bradley.
According to Flexera's State of the cloud report, organisations reckon they are over-budget on cloud spend by 23 per cent
Another factor is that the cloud is not static. "Newer like-for-like services around compute and storage are generally cheaper. If you migrate to the cloud and do nothing, you get this erosion of value. The pricing tends to reward those who invest each year to move to newer versions," said the KPMG advisor.
Built-in cost optimisations can save money, but Scott Chancellor, chief product and technology officer at cost consultant Apptio, told us that customers are prone to "overestimate the extent to which they need a savings plan or reserved instance or some other sort of commitment".
The problem here is that all big cloud providers offer chunky discounts in return for a future purchase commitment, but that only saves money if those resources are in fact fully utilised.
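To make the utilisation point concrete, here is a minimal Python sketch of the break-even arithmetic. It is an illustration, not any provider's billing formula: the function names, the flat hourly rate and the single discount figure are all assumptions, and real commitment pricing has more dimensions than this.

```python
def commitment_saving(on_demand_rate, discount, hours_committed, hours_used):
    """Money saved by committing, versus paying on demand for the same usage.

    Assumption: hours_used <= hours_committed. Usage above the commitment
    would be billed on demand in both scenarios, so it cancels out.
    A negative result means the commitment cost more than on-demand would have.
    """
    committed_cost = on_demand_rate * (1 - discount) * hours_committed
    on_demand_cost = on_demand_rate * hours_used
    return on_demand_cost - committed_cost


def break_even_utilisation(discount):
    """Fraction of the commitment that must be used before it pays off."""
    return 1 - discount


# Hypothetical numbers: 30 per cent discount, 100 committed hours at 1.00/hour.
half_used = commitment_saving(1.0, 0.30, 100, 50)    # negative: money lost
fully_used = commitment_saving(1.0, 0.30, 100, 100)  # positive: money saved
```

Under these assumptions a 30 per cent discount only pays off once roughly 70 per cent of the committed capacity is actually in use, which is why overestimating the commitment is costly.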
Another aspect of this problem is that the people responsible for optimising cost tend to be different from those who build the technology. "Imagine you're an engineer deploying a new application," said Apptio engineering veep Abuna Demoz. "You want your application to work. Estimating how much capacity you need, how much compute you need, how much storage you need is very hard to do in advance so an engineer will typically overprovision, with the best intentions of going back later and adjusting. What often happens is you move on to the next project, you never went back."
Likewise, Bradley said there's a risk that "technical people make decisions on provisioning that get disconnected from what the business actually needs".
Too much choice
The sheer number of services on offer is also a problem. "Customers have a difficult time understanding which SKU among a similar bunch of offerings to choose from," said Chancellor. "Rather than using, say, a pre-built data analytics solution from AWS, maybe it makes sense to cheaply store data in S3 buckets and use one of the lower-cost options to query that data, and visualise that via a free open-source tool."
Chancellor, who formerly worked on cost-management tools at AWS, rejected the idea that the big providers deliberately confuse customers or try to push them towards unneeded expense. "The mentality that I employed when I ran that business unit was what's best for the customers is best for the company in the long term."
Is one cloud provider better value than another? "No single cloud provider is more cost-effective overall," said Demoz. But there may be individual cases, such as specialised machine-learning workloads where "you may find one cloud provider is more cost-effective than the other".
There are differences in the way to optimise cloud spend. "AWS gives you a savings plan, where rather than committing to buy a certain size of VM over one to three years, you commit to an amount that you're going to spend over that time period, and so you get a little bit more flexibility," said Demoz. "GCP has a different offering. Sustained-use discounts let you discount your compute based on your high watermark of sustained usage over 7, 14 or 30 days. You don't actually have to decide in advance."
Worship at the altar of 'turn that shit off'
Corey Quinn (QuinnyPig) is chief cloud economist at the Duckbill Group and a specialist in AWS cost management. He told The Register that businesses are inclined to think their AWS bills are too big even when they are not. "Someone in accounting gets a bill from Amazon and it's enormous and their first response is: 'How many books is engineering buying? I don't see them reading that much.'"
The move to cloud has focused attention, he said. "If you go back to the time before cloud, companies were still spending this kind of money. But then you were talking about capital leases and 15 different vendors. Now it's opex [operational expenditure] to a single vendor and the bill looks like a telephone number."
How then does he go about saving customers money? "A bunch of stuff," said Quinn. "Step one. Worship at the altar of 'turn that shit off'. You're charged for what you forget to turn off. Then go out and buy reserved instances and savings plans, typical stuff."
It is also worth looking at application architecture from a cost perspective, Quinn said. "We had one customer that was taking in a giant pile of data from their customers, but 50 times that much data was being transferred between [AWS] availability zones. In a data centre that's effectively no-cost. In AWS or any cloud environment you are charged per GB for that, and that needs to be addressed."
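A rough Python sketch of why that kind of internal traffic matters on a per-GB bill. The rate used below is a placeholder, not a quoted AWS price, and the function name is hypothetical; only the 50-times multiplier comes from Quinn's example.

```python
def inter_zone_transfer_cost(ingest_gb, amplification, rate_per_gb):
    """Estimate the charge for traffic moved between availability zones.

    ingest_gb:      data received from customers in the period
    amplification:  how many times that volume crosses zone boundaries
    rate_per_gb:    per-GB transfer charge (placeholder, check your own bill)
    """
    return ingest_gb * amplification * rate_per_gb


# Hypothetical month: 1,000 GB ingested, shuffled 50 times over between zones,
# at an assumed rate of 0.01 per GB.
monthly_cost = inter_zone_transfer_cost(1_000, 50, 0.01)
```

The point of the sketch is that the multiplier, not the customer-facing volume, drives the charge, so an architectural change that keeps traffic within a zone can matter more than any discount.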
How do you tell if stuff is running that should be turned off? "You can look at the APIs from the cloud providers and say, these instances are idle. We can't say, you should turn them off. Instead, we start there. Can you tell us more about them?"
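As an illustration of that first pass, the Python sketch below flags instances whose CPU metrics look idle. It assumes the utilisation samples have already been pulled from a provider's monitoring API; the thresholds, the instance IDs and the function name are invented for the example. In keeping with Quinn's caveat, the output is a list of questions to ask, not a list of things to switch off.

```python
from statistics import mean


def idle_candidates(cpu_samples, max_avg_cpu=5.0, max_peak_cpu=10.0):
    """Return instance IDs whose CPU usage suggests they may be idle.

    cpu_samples maps an instance ID to a list of CPU utilisation
    percentages over the review period. Both thresholds are assumptions
    and should be tuned per workload.
    """
    candidates = []
    for instance_id, samples in cpu_samples.items():
        if not samples:
            continue  # no data is not evidence of idleness
        if mean(samples) <= max_avg_cpu and max(samples) <= max_peak_cpu:
            candidates.append(instance_id)
    return sorted(candidates)


# Hypothetical samples for three instances.
samples = {
    "i-batch": [1.0, 2.0, 1.5],
    "i-web": [40.0, 60.0, 55.0],
    "i-new": [],
}
to_review = idle_candidates(samples)
```

Low CPU alone does not prove an instance is unused, since it may be a standby or a memory-bound service, which is why the owner has to be asked before anything is turned off.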
Are some services better value than others and how can you tell? "It's context," said Quinn. "There are remarkably few AWS products I can point at and say yes, that's crap. A couple but not many."
Run and see
The problem is more nuanced because most products are priced piecemeal. "This is where it gets screwy and broken because there are so many different pricing dimensions. How many IOPS? How many reads and writes does your database do in a month? The correct answer is: 'I don't know.' No one does."
There is a reason for the way the pricing works, said Quinn, based on "the underlying logic of what it costs AWS to provide the service", but it makes the cost "incredibly variable".
The solution is to run something and see what it costs. There is another problem, though. "As you're building things out and different divisions in your company do different things, it becomes increasingly difficult to attribute costs. The bill is 20 per cent higher this month and tells you it was S3. Cool, that's storage, but that tells me nothing. What business activity happened?"
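One common way to approach the attribution question is to tag resources by team or business activity and roll the bill up by tag. The Python sketch below assumes billing line items have already been exported as dictionaries with "service", "cost" and "tags" fields; that record shape, the tag key and the function name are hypothetical, not a provider's export format.

```python
from collections import defaultdict


def cost_by_tag(line_items, tag_key, untagged="untagged"):
    """Sum cost per value of one tag, e.g. per team or per project.

    Items missing the tag are grouped under `untagged` so the gap in
    attribution stays visible instead of silently disappearing.
    """
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, untagged)
        totals[owner] += item["cost"]
    return dict(totals)


# Hypothetical line items, all for the same storage service.
bill = [
    {"service": "S3", "cost": 120.0, "tags": {"team": "analytics"}},
    {"service": "S3", "cost": 30.0, "tags": {"team": "analytics"}},
    {"service": "S3", "cost": 45.5, "tags": {"team": "web"}},
    {"service": "S3", "cost": 10.0, "tags": {}},
]
per_team = cost_by_tag(bill, "team")
```

A breakdown like this turns "it was S3" into "analytics stored more this month", which is closer to the business question Quinn is asking. It only works as well as the tagging discipline behind it.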
Apptio's Cloudability dashboard attempts to identify potential savings, for example by using cheaper disk instances where analysis indicates they will meet requirements
Quinn, like Apptio, does not think one cloud provider is better value than another. "The big three have largely similar pricing structure," he said. "Some companies are more willing to come up with significant discounts." Cloud vendors are not trying to win business from one another, he said, but "to get people out of their data centres".
He has one piece of advice: "Don't build a thing with an idea of being able to run it everywhere. In most cases it's more work than value."
The big cloud vendors offer built-in cost analysis tools, like this one on Azure, but they will never tell you to look elsewhere for cost savings
Are the built-in cost recommendations that AWS, Azure and GCP provide reliable guides? "There's nothing inherently wrong with most of their recommendations," said Quinn. "Obviously they are not going to tell you how to negotiate against them. They're not going to tell you, use a CDN like Fastly or Akamai. They're always going to push their own services.
"I don't know anyone who looks at their cost approach and what they're doing and doesn't feel ashamed because they're doing it wrong. You can always do it better. The trick is to understand that it's a process." ®