5 Ways To Stop Your Cloud Costs From Ballooning Beyond Your Control
Cloud computing costs can quickly spin out of control if not monitored effectively. Here are 5 ways your organization can keep tabs on cloud spending and avoid spending more than it can afford.
Cloud computing has put data processing resources within reach of enterprises of every size, at scales that were almost unimaginable a couple of decades ago. Now, every business has access to nearly unlimited processing, networking, and storage. However, along with nearly unlimited resources comes the potential for nearly unlimited spending. An out-of-control process can get very expensive very fast. We’ve all heard the horror stories about a rogue application running up a bill far larger than anyone expected. Have you read about the “free” trial that generated an overnight $72,000 bill? In this article, we discuss five ways you can keep tabs on your cloud spending and prevent the costs from ballooning out of control.
The issues surrounding free tier and free trials on all major cloud platforms have been getting a lot of visibility in recent times. For businesses, the problem is different as companies aren’t looking for free services but ways to control expenditures. A regular, predictable expense is best, but even ensuring that they will not experience surprise cost overruns would help CFOs sleep at night.
There are two major issues with expense overruns in the cloud: unexpected bursts of spending and slow, constant growth. An unexpected burst can occur when a service auto-scales and you use more resources than expected. One example of this is when your container service decides to scale to 1000 containers to keep up with a surge in business. If you normally use tens of instances and plan for 100 instances, a bump to 1000 instances can be a real surprise, even for just a few hours.
The constant growth issue is a side effect of having resources constantly available to you. Like an alligator in an aquarium, data seems to grow to fit the box it lives in. Permanent object storage, easily spun up databases, a VM for all your needs (or a VM for each developer), over-sized Kubernetes clusters, everything adds up a little bit at a time.
5 Ways You Can Manage Your Cloud Costs
Below are 5 actions you can take to be aware of what you are spending your money on and ways to keep some control over those expenses.
Budgets & Alerts
All cloud providers offer their customers the ability to set up a budget. There is generally a console page where you can tie specific amounts to various services and projects and track expenses against your budget. You may think you could tie a budget to an actual spend limit and have the provider cut you off when you hit your maximum budget. As far as I know, none of the major cloud providers will do that.
If the provider decided to cut you off, what would it cut off? Would it stop all of your VMs? Shut down all of your databases? Cloud functions are nearly free, so would those still run? What if the cloud function initiated calls to other services? Would the function stop running? It’s easy to see why cloud providers don’t want to take ownership over your business needs.
Along with your budget, you can set up automated alerts. If spending against any of your budgets passes the expected amount (or even reaches a certain percentage of it), you can have an email or text message sent to the appropriate parties so they can inspect and analyze the overrun. Note that budgets and alerts are not usually real-time, so you could be well past the budget by the time you get the alert. Plan for that delay and set your alert thresholds low enough to leave time to react.
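To make the percentage idea concrete, here is a minimal sketch in Python of checking spend against a budget at several thresholds. It is not any provider's API: the Budget class, the function names, and the 50/80/100 percent thresholds are assumptions chosen for illustration, and you would feed it whatever spend figure your provider's billing export reports.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    name: str             # hypothetical label, e.g. a project or service
    monthly_limit: float  # budgeted amount in your billing currency


def crossed_thresholds(budget, spend_to_date, thresholds=(0.5, 0.8, 1.0)):
    """Return the thresholds (fractions of the budget) already reached."""
    if budget.monthly_limit <= 0:
        raise ValueError("monthly_limit must be positive")
    used = spend_to_date / budget.monthly_limit
    return [t for t in sorted(thresholds) if used >= t]


def alert_message(budget, spend_to_date, thresholds=(0.5, 0.8, 1.0)):
    """Build an alert string for the highest threshold reached, or None."""
    crossed = crossed_thresholds(budget, spend_to_date, thresholds)
    if not crossed:
        return None
    pct = int(round(crossed[-1] * 100))
    return (
        f"Budget '{budget.name}': spend {spend_to_date:.2f} has reached "
        f"{pct}% of {budget.monthly_limit:.2f}"
    )
```

Because billing data lags, the lower thresholds are the useful ones: an alert at 50 or 80 percent gives you time to act before the overrun is already on the invoice.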
Process Monitoring & Alerts
Another action you can take along the lines of alerts is to build monitoring into your processes. You may be able to use the cloud provider’s built-in logging, use a third-party platform, or build your own. The important part of monitoring is not so much tracking how much you are spending as knowing what is running, when it is expected to run, how long it normally runs, and how many resources it uses in a normal run.
The way you answer those questions is to track everything you run. This is related to observability but is not the same. If you utilize observability in your processes, this kind of monitoring should be an easy add. It helps you save a minimal amount of data on every process you run. You save exactly the information I mentioned above. You can save more, for sure, but at a minimum, you would need the process name, the time it ran, the length of time it ran, and the resources it used.
Run queries against this data and have the aggregate averages and maximums available to compare to the most recent run of any process. With the right framework, you can even compare to in-flight processes and catch issues before they break the bank. If any process passes set thresholds (say, an ETL job runs 1 hour longer than average or a cloud function uses 50% more memory than normal across 100 runs), raise an alert just as you would for a budget issue.
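The comparison described above could look something like the following Python sketch. The RunRecord fields mirror the minimum data listed earlier (process name, start time, run length, resources used); the field names, the one-hour limit, and the 50% memory ratio are illustrative assumptions rather than a prescribed design, and in practice the history would come from a query against wherever you store run data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunRecord:
    process: str           # process name
    started_at: str        # when it ran (ISO timestamp string)
    duration_s: float      # how long it ran, in seconds
    peak_memory_mb: float  # one example of "resources used"


def baseline(history):
    """Aggregate averages and maximums over past runs of one process."""
    durations = [r.duration_s for r in history]
    memory = [r.peak_memory_mb for r in history]
    return {
        "avg_duration_s": mean(durations),
        "max_duration_s": max(durations),
        "avg_memory_mb": mean(memory),
        "max_memory_mb": max(memory),
    }


def check_run(latest, history, max_extra_seconds=3600, max_memory_ratio=1.5):
    """Return alert strings if the latest run exceeds the set thresholds."""
    if not history:
        return []  # nothing to compare against yet
    base = baseline(history)
    alerts = []
    if latest.duration_s > base["avg_duration_s"] + max_extra_seconds:
        alerts.append(f"{latest.process}: ran much longer than average")
    if latest.peak_memory_mb > base["avg_memory_mb"] * max_memory_ratio:
        alerts.append(f"{latest.process}: used far more memory than average")
    return alerts
```

The same check works for an in-flight process if you pass its elapsed time and current memory as the latest record, which is how you catch a runaway job before it finishes.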
IaaS vs Managed vs Serverless
The next action is more of an analysis task. If you moved from on-prem to the cloud, you may well have performed a lift-and-shift, where you took existing servers from the data center and pushed the images/applications to VMs in the cloud. While this does make moving to the cloud simpler, you may end up spending a lot more than you need to. For one thing, you will need to constantly monitor and resize manually as business needs change. Worse, you are likely over-provisioned: sizing for peak load up front is an on-prem habit that makes little sense in the cloud, where capacity can be added on demand.
This is the IaaS model, and it does make sense for many use cases. If you have a stable environment with predictable or even stagnant growth, running correctly sized VM instances will be cheaper than, and just as easy to manage as, managed or serverless environments. However, if your needs are more dynamic, a managed environment will auto-scale as needed and help keep costs lower, although, like a VM, it still carries ongoing 24/7 costs. New applications can take advantage of serverless technologies, where you pay only when the services are being used.
There is no one-size-fits-all or best way to do things. You need to look at each application and make that judgment. The best part here is that you can always evolve as needed.
Right-Size
If you are constantly monitoring and gathering trends, you will see where you are over-provisioned. You may have cloud functions provisioned for more memory or CPU than needed. The same may be true with containers. Perhaps your Kubernetes cluster does not need quite so many worker nodes. The action here is to be proactive instead of reactive. Aggressively monitor resource usage versus resource allocation. Scale back if you see wasted cycles or other resources.
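As a rough illustration of comparing usage to allocation, the Python sketch below flags resources whose peak usage is a small fraction of what they were given. The dictionary keys, the 50% utilization cutoff, and the 20% headroom are assumptions picked for the example; the numbers themselves would come from your own monitoring data.

```python
def rightsize_candidates(resources, min_utilization=0.5, headroom=1.2):
    """Flag resources whose peak usage is well below their allocation.

    resources: list of dicts with hypothetical keys "name", "allocated",
    and "peak_used" (same unit for both, e.g. MB of memory or vCPUs).
    """
    candidates = []
    for res in resources:
        if res["allocated"] <= 0:
            continue  # skip malformed entries rather than divide by zero
        utilization = res["peak_used"] / res["allocated"]
        if utilization < min_utilization:
            candidates.append({
                "name": res["name"],
                "utilization": round(utilization, 2),
                # suggest the observed peak plus some safety headroom
                "suggested": res["peak_used"] * headroom,
            })
    return candidates
```

Using the observed peak rather than the average keeps the suggestion conservative, so scaling back does not starve a workload during its busiest moments.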
Inventories
My last suggested action is taking inventories, which is slightly different from monitoring but can be considered a type of monitoring. As files are moved into and out of storage buckets, and as data is moved around and transformed between sources and targets, you will likely end up with random copies of data that will sit around and consume the budget until the end of time.
One example that I have seen many times, and that I have been guilty of, is the issue of “just in case” backups. You may have a large file, let’s say 1 TB. Not incredibly big, but not tiny either. You get it from a source of some kind and place it in a bucket. You then use it for analysis, maybe even load it into a database. Then what? Might as well keep it around just in case you need it, right? It’s only 1 TB. But if 10 people save 1 TB per week for a year, that costs real money.
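The arithmetic is worth spelling out: 10 people saving 1 TB each per week for 52 weeks leaves 520 TB sitting in storage by the end of the year, and you pay for every week each of those terabytes exists. The Python sketch below works through it under the assumption that nothing is ever deleted. The price per TB per month is a parameter you supply, not a real quote from any provider.

```python
def accumulated_storage(people, tb_per_person_per_week, weeks,
                        price_per_tb_month):
    """Return (TB stored at the end, total cost) if nothing is deleted.

    price_per_tb_month is a hypothetical input; look up your provider's
    actual rate for the storage class you use.
    """
    stored_tb = 0.0
    tb_weeks = 0.0  # running total of TB held, summed week by week
    for _ in range(weeks):
        stored_tb += people * tb_per_person_per_week
        tb_weeks += stored_tb
    # convert the monthly price to a weekly one (12 months over 52 weeks)
    price_per_tb_week = price_per_tb_month * 12 / 52
    return stored_tb, tb_weeks * price_per_tb_week


# usage: the scenario from the text, with an assumed price of 20 per TB-month
stored, cost = accumulated_storage(10, 1.0, 52, 20.0)
```

The cost grows faster than the storage does, because data saved in week one is billed for all 52 weeks while new copies keep piling on top.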
I can’t tell you how many intermediate ELT tables and dated table backups I have seen across all platforms. Taking backups is an understandable precaution, but without some kind of plan it just wastes space and money. It is worth writing a script (or searching for one already written) to scan your object storage and databases for names that fit a pattern. Sometimes you can completely automate the removal of files that match a pattern and are older than a certain date, or of tables whose names do not match a pattern (if you have naming standards, you can do this). Or you may need to see who owns an object and have them clean it up.
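A script like that can start very small. The Python sketch below only selects cleanup candidates; it deliberately does not delete anything, and it assumes you have already listed your objects into (name, last-modified) pairs using your provider's SDK or CLI. The function names and the example patterns are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone


def stale_objects(objects, pattern, max_age_days, now=None):
    """Names that match `pattern` and were last modified before the cutoff.

    objects: iterable of (name, last_modified) pairs, where last_modified
    is a timezone-aware datetime.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    regex = re.compile(pattern)
    return [name for name, modified in objects
            if regex.search(name) and modified < cutoff]


def nonconforming_names(names, standard_pattern):
    """Names that do NOT fully match the naming standard."""
    regex = re.compile(standard_pattern)
    return [name for name in names if not regex.fullmatch(name)]
```

For example, a pattern such as `_backup_\d{8}` with a 90-day age limit would pick out dated backup copies, while a standard such as `(dim|fact)_[a-z_]+` would surface tables that fall outside a naming convention. Reviewing the resulting list with the object owners before removing anything is the safer first pass.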
However you do it, you can reclaim massive amounts of storage if you have never done this exercise and, if you do it regularly, you can save money.
Summary
Managing cloud costs may seem daunting or even impossible for organizations that constantly add data to the cloud or move it between servers. However, a few steps taken early, namely taking inventories, monitoring processes, and setting budgets, can help organizations keep tabs on how much they are spending on cloud resources and where those costs could be trimmed. Setting a budget and automated alerts for cost overruns is a good starting point.