Hack your scaling and pay for a European Escape?

Raj Nair

Founder & CEO

1 July, 2024




Do you feel you have paid for a lot of cloud capacity that just sits waiting for traffic? If so, you are not alone. According to a recent analysis [1] of AWS spend, the average resource utilization for compute across all AWS customers is just 6%, with more than 90% of compute capacity paid for and unused.

Developers at a medium-sized payments company that I spoke to recently were comfortable running at only 30% utilization, to leave room for spikes. At least, that is the conventional wisdom. Moreover, the Horizontal Pod Autoscaler (HPA), the tool for horizontal scaling in Kubernetes (K8s), gives you, by default, a CPU utilization threshold to trigger scaling. Hence, if you set your threshold to, say, 30%, it traps you at 30%, because it scales out to more pods whenever utilization exceeds 30%. Even if you use Karpenter to optimize your nodes, you are only eliminating the wasted space around the pods in a node, not the waste inside each pod.
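To see why the threshold acts as a ceiling, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current utilization / target utilization), skipped when the ratio is within a tolerance (0.1 by default). The function name and the sample numbers are made up for illustration, and the sketch leaves out real-world details such as stabilization windows, pod readiness, and min/max replica bounds.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     tolerance: float = 0.1) -> int:
    """Approximate the documented HPA rule (illustrative helper, not a real API)."""
    ratio = current_utilization / target_utilization
    # HPA skips scaling when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Hypothetical spike: 10 pods running at 45% average CPU.
    for target in (30, 60):
        print(f"target {target}% -> {desired_replicas(10, 45, target)} pods")
```

With a 30% target, the same spike that a 60% target would absorb with fewer pods pushes the deployment from 10 to 15 pods, which brings average utilization straight back down toward 30%.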

You need a better way to increase pod utilization without compromising your SLOs. Performance tuning, perhaps? But you don’t have the time to optimize an application with over a hundred microservices that get updated every week, as was the case at the aforementioned company. What if you could use GenAI to sort out your performance issues and scale pods automatically? That’s exactly what this payments company did. At the low end of what they saw, a 30% improvement resulted in net savings of $200K for every $1M of spend, more than enough to pay for a fancy European vacation for each member of your team. After all, you need to do something with all the time you no longer spend manually scaling for spikes.

[1] Subramaniam, R. (2023). AWS Cost Optimization: Best Practices from Analyzing $1 Billion Spend. CloudFix.