Raj Nair

Founder & CEO

1 July 2024

2 min read

Hack your scaling and pay for a European Escape?

Do you feel you have paid for a lot of cloud capacity that just sits waiting for traffic? If so, you are not alone. According to a recent analysis [1] of AWS spend, the average resource utilization for compute across all AWS customers is just 6%, with more than 90% of compute capacity paid for and unused.

Developers at a medium-sized payments company that I spoke to recently were comfortable running at only 30% utilization, to leave room for spikes. At least, that is the conventional wisdom. Moreover, HPA, the tool for horizontal scaling in Kubernetes (K8s), gives you by default a CPU utilization threshold to trigger scaling. Hence, if you set your threshold to, say, 30%, it traps you at 30%, because it scales out to more pods whenever utilization exceeds 30%. Even if you use Karpenter to optimize your nodes, you are only eliminating wasted space around the pods in the node, not inside the pod.
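To see why a 30% threshold pins you near 30%, here is a minimal Python sketch of the replica calculation described in the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current utilization / target utilization), with no action when the ratio is within a tolerance of 1.0 (0.1 by default). The function names and the pod counts are illustrative assumptions, not figures from the payments company.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization_pct: float,
                     target_utilization_pct: float,
                     tolerance: float = 0.1) -> int:
    """Replica count per the documented HPA rule.

    desired = ceil(current_replicas * current / target), skipped when the
    ratio current / target is within `tolerance` of 1.0.
    """
    ratio = current_utilization_pct / target_utilization_pct
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    return max(1, math.ceil(current_replicas * ratio))


def utilization_after_scaling(old_replicas: int,
                              old_utilization_pct: float,
                              new_replicas: int) -> float:
    """Average utilization once the same total work is spread over new_replicas.

    Assumes load is evenly balanced across pods (a simplification).
    """
    return old_replicas * old_utilization_pct / new_replicas


if __name__ == "__main__":
    target = 30   # hypothetical HPA CPU utilization target, in percent
    pods = 10     # hypothetical starting pod count

    # A spike pushes average CPU utilization from 30% to 45%.
    new_pods = desired_replicas(pods, 45, target)
    print("pods after spike:", new_pods)
    print("utilization after scale-out:",
          utilization_after_scaling(pods, 45, new_pods))

    # Traffic falls back; HPA shrinks the fleet until utilization is ~30% again.
    print("pods after traffic halves:", desired_replicas(pods, 15, target))
```

Under these assumptions, every excursion above the target is answered with more pods until average utilization is back at the target, so the remaining 70% of each pod's requested CPU stays idle by design.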

You need a better way to increase the utilization of pods without compromising your SLO. Performance tuning, perhaps? But you don’t have the time to optimize an application with over a hundred microservices that get updated every week, as was the case at the aforementioned company. What if you could use GenAI to sort out your performance issues and scale pods automatically? That’s exactly what this payments company did. At the low end of what they saw, a 30% improvement resulted in net savings of $200K for every $1M of spend, more than enough to pay for a fancy European vacation for each member of your team. After all, you need to do something with all the time you no longer spend manually scaling for spikes.
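As a rough back-of-the-envelope check, the low-end figure quoted above ($200K net savings per $1M of spend) can be scaled to your own bill. The sketch below assumes savings scale linearly with spend; the $2.5M spend and the team size of 10 are hypothetical placeholders, not numbers from the payments company.

```python
def net_savings(cloud_spend_usd: float,
                savings_per_million_usd: float = 200_000) -> float:
    """Net savings, assuming the quoted low-end rate scales linearly with spend."""
    return cloud_spend_usd * savings_per_million_usd / 1_000_000


def savings_per_team_member(cloud_spend_usd: float, team_size: int) -> float:
    """Split the net savings evenly across the team (illustrative only)."""
    return net_savings(cloud_spend_usd) / team_size


if __name__ == "__main__":
    spend = 2_500_000  # hypothetical annual cloud spend in USD
    team = 10          # hypothetical team size
    print("net savings:", net_savings(spend))
    print("per team member:", savings_per_team_member(spend, team))
```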

[1] Subramaniam, R. (2023). AWS Cost Optimization: Best Practices from Analyzing $1 Billion Spend. CloudFix.

Copyright © Avesha 2024. All rights reserved.