Region: United States
Industry: Biotechnology, genome testing
GeneDX accelerates precision-medicine research with event-driven AI/ML pipelines that run on Kubernetes across Oracle Kubernetes Engine (OKE) and Azure Kubernetes Service (AKS). Two research teams share a multi-cloud platform that must scale quickly to meet tight turnaround-time (TAT) requirements for genomic analyses.
Team | Cloud | Pre-Smart Karpenter Workflow |
Team A | OKE | Static node group sized for peak. |
Team B | AKS | Identical pattern; idle nodes during troughs, slow starts during spikes. |
Karpenter was not available on Oracle Cloud Infrastructure (OCI), so the OKE clusters had no just-in-time node provisioning. Across both clouds, scaling was reactive and capacity was over-provisioned.
Smart Karpenter fuses Avesha Smart Scaler with Karpenter to predict pod demand in advance and provision the exact nodes required.
Capability | Impact at GeneDX |
Predictive pod scaling: reinforcement learning (RL) models analyze latency, requests per second (RPS), and service dependencies | Pods launch before a spike; queues disappear. |
Dynamic node provisioning: predictions drive Karpenter for right-sized nodes | No idle padding; nodes spin up/down in < 60 s. |
Observation → Optimize rollout | Two-week shadow run before full AI control. |
Continuous learning | Scaling stays accurate as workloads evolve. |
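The step from a pod-demand prediction to a right-sized node count can be illustrated with a minimal sketch. It is not Smart Karpenter's implementation or API: the names `PodRequest`, `NodeShape`, `nodes_needed`, and `scaling_action` are hypothetical, all pods and nodes are assumed identical, and the `observe_only` flag is only an assumed stand-in for the shadow-run phase in the table above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PodRequest:
    """Hypothetical per-pod resource request."""
    cpu_millicores: int
    memory_mib: int


@dataclass(frozen=True)
class NodeShape:
    """Hypothetical allocatable capacity of one node."""
    cpu_millicores: int
    memory_mib: int


def nodes_needed(predicted_pods: int, pod: PodRequest, node: NodeShape) -> int:
    """Return how many identical nodes fit the predicted number of identical pods."""
    if predicted_pods <= 0:
        return 0
    # The scarcer resource (CPU or memory) limits how many pods fit on a node.
    pods_per_node = min(
        node.cpu_millicores // pod.cpu_millicores,
        node.memory_mib // pod.memory_mib,
    )
    if pods_per_node == 0:
        raise ValueError("pod does not fit on the chosen node shape")
    return math.ceil(predicted_pods / pods_per_node)


def scaling_action(predicted_pods: int, current_nodes: int, pod: PodRequest,
                   node: NodeShape, observe_only: bool) -> dict:
    """Compute the node delta; in observe-only mode it is reported, not applied."""
    target = nodes_needed(predicted_pods, pod, node)
    return {
        "target_nodes": target,
        "delta": target - current_nodes,
        "applied": not observe_only,
    }


if __name__ == "__main__":
    pod = PodRequest(cpu_millicores=2000, memory_mib=8192)
    node = NodeShape(cpu_millicores=16000, memory_mib=65536)
    print(scaling_action(40, current_nodes=3, pod=pod, node=node, observe_only=True))
```

A real provisioner also has to handle mixed pod sizes, system overhead on each node, and scheduling constraints, so this arithmetic is only a lower bound on the nodes required.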
KPI | Before | After Smart Karpenter | Delta |
Average node CPU utilization | 48 % | 82 % | +71 % (relative; +34 percentage points) |
Idle node-hours / month | 1 900 | 520 | −73 % waste |
P95 pod queue time | 5 m 10 s | < 45 s | > 6.8 × faster |
SLO violations (job TAT) | 12 / month | 0 | 100 % compliance |
Cloud compute spend | Baseline | −33 % | Savings fund new research lines |
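The Delta column can be reproduced from the Before and After values. The short check below assumes the deltas are relative changes against the Before value and uses the 45 s bound for the queue time; the helper name `relative_change` is illustrative only.

```python
def relative_change(before: float, after: float) -> float:
    """Relative change from before to after, as a fraction of before."""
    return (after - before) / before


# Average node CPU utilization: 48 % -> 82 %
cpu_delta = relative_change(48, 82)

# Idle node-hours per month: 1 900 -> 520
idle_delta = relative_change(1900, 520)

# P95 pod queue time: 5 m 10 s -> under 45 s, so this speedup is a lower bound
queue_speedup = (5 * 60 + 10) / 45

# SLO violations per month: 12 -> 0
slo_delta = relative_change(12, 0)

print(f"CPU utilization: {cpu_delta:+.1%}")      # +70.8%
print(f"Idle node-hours: {idle_delta:+.1%}")     # -72.6%
print(f"Queue speedup:   {queue_speedup:.2f}x")  # 6.89x
print(f"SLO violations:  {slo_delta:+.1%}")      # -100.0%
```

The cloud compute spend row gives no absolute figures, so its 33 % reduction cannot be recomputed from the table.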
“Smart Karpenter makes Karpenter proactive. Nodes appear before the load hits and disappear immediately after, cutting a third of our cloud bill while keeping turnaround times rock-solid.” - Director of Genomic ML Platforms, GeneDX
With Smart Karpenter, GeneDX: