IT systems capacity planning. This is hard ... but how hard? Inquiring minds wish to know
Share your experiences with us and fellow readers. Let's find out together
Reg Reader Survey Technology in the 2020s is very forgiving, particularly if our processing happens in the cloud. By this, we mean that if things start to perform suboptimally, the issue is usually quite easy to resolve.
Right-click -> Add storage. Right-click -> Add RAM.
Job done.
Which is fine, but it leads us into temptation – we don’t do capacity planning because the need to do so feels like it has gone away.
This is the case all through IT, of course. We get away with designing algorithms poorly because today's ultra-fast CPU cores save our bacon through sheer speed. We don't index our databases properly because solid-state storage rescues us when our queries do full table scans. The thing is, we get away with this approach most of the time – but definitely not all of the time.
In this survey – see below – we’re keen to find out the extent to which our readers have had to cope with changes in demand for capacity in their systems and, more importantly, how they have managed the capacity planning process. Many of us have had to scale up systems – particularly things like virtual desktop and VPN services – due to users being sent home to work during the COVID-19 lockdowns.
But some organizations will have kept capacity at roughly the same levels, and it’s likely that some have scaled down – perhaps through exploiting opportunities to finally get around to decommissioning resource-hungry legacy systems.
We’re also interested in the science of performance and capacity planning. Most of us have come across systems that performed great in the test environment but then tanked when put live – often because the production database was ten times the size of the test one – but did we do anything to predict that?
Did we ask the users whether the app felt snappy enough during testing? Did we, for that matter, run up any electronic measures of performance and resource usage, or perhaps simulate the actions of hundreds of users with automation tools?
This correspondent was a performance tester in a previous life, and can confirm how good it feels to know that the app will scale to 250 users thanks to the stats gathered by the test harness that simulated 250 users hammering it at once. And after go-live, did we keep asking the users and/or carry on with our electronic monitoring to gauge behavior against expected performance?
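For readers who have never built such a harness, here is a minimal sketch of the idea. To be clear, this is an illustration rather than anything we actually ran: it assumes Python with the third-party requests library, and the target URL, user count, and request count are placeholder values you would swap for your own.

```python
# Minimal concurrent load-test sketch: simulate N "users" hammering an endpoint
# at once and report simple latency statistics. Assumes `pip install requests`.
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

TARGET_URL = "http://localhost:8080/app/search"  # hypothetical endpoint
SIMULATED_USERS = 250                            # concurrent simulated users
REQUESTS_PER_USER = 20                           # requests each user fires


def one_user(user_id: int) -> list[float]:
    """Fire a burst of requests and return the observed latencies in seconds."""
    latencies = []
    with requests.Session() as session:
        for _ in range(REQUESTS_PER_USER):
            start = time.perf_counter()
            try:
                session.get(TARGET_URL, timeout=10)
            except requests.RequestException:
                continue  # only completed requests count towards the stats
            latencies.append(time.perf_counter() - start)
    return latencies


def main() -> None:
    # One thread per simulated user; each runs its own burst of requests.
    with ThreadPoolExecutor(max_workers=SIMULATED_USERS) as pool:
        results = pool.map(one_user, range(SIMULATED_USERS))

    all_latencies = sorted(t for user in results for t in user)
    if not all_latencies:
        print("No successful requests - nothing to report")
        return

    p95 = all_latencies[int(0.95 * (len(all_latencies) - 1))]
    print(f"completed requests: {len(all_latencies)}")
    print(f"median latency:     {statistics.median(all_latencies):.3f}s")
    print(f"95th percentile:    {p95:.3f}s")


if __name__ == "__main__":
    main()
```

In practice you would reach for a purpose-built tool such as JMeter, Locust, or k6, which add ramp-up profiles, think time, and proper reporting – but the principle is the same: generate realistic concurrent load and measure what comes back.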
And, finally, what do we do in the long term? If you’ve devised a regime of user feedback or software-based monitoring during development testing, have you continued to use these tools – or something similar – in the medium and long term? Proactive evaluation has clear benefits, particularly if the systems are at a point where further scaling up would need new hardware or a step-up in cost.
Do please let us know your approach, warts and all, by taking part in our short survey below. There are three questions to answer. We'll run the poll for a few days and publish a summary on The Register thereafter.
Don’t feel bad if you tick all the “we don’t do that” boxes, because there could be many reasons (not least time and cost) for not having a humongous capacity monitoring and planning regime. And if you tick all the “we do that in spades” boxes, try not to be too smug... ®