
Region: Global

Industry: Cloud Computing

InpharmD Accelerates AI-Powered Research with Avesha EGS GPU Bursting from On-Prem to Nebius

Overview

InpharmD is a leading AI-driven pharmaceutical-intelligence platform that synthesizes drug data, clinical guidelines, and real-world evidence to help pharmacists, researchers, and clinicians make critical treatment decisions. Its engine continuously ingests and processes large data sets—often in batch mode—to deliver precise, timely, and customized recommendations. 

To guarantee top-tier performance and availability during unpredictable surges in demand, InpharmD adopted Avesha’s Elastic GPU Service (EGS) to seamlessly burst inference workloads from its primary on-premises Kubernetes cluster to Nebius Cloud—with zero code changes and full operational transparency.

Challenge

InpharmD’s inference platform depends heavily on GPUs for real-time and batch processing of complex clinical queries. Running exclusively on its on-prem cluster created three bottlenecks:

  1. GPU Capacity Crunches at Peak Load  
    Static on-prem allocations could not keep pace with unpredictable, bursty workloads.  
     
  2. Escalating Expansion Costs  
    Expanding on-prem GPU capacity required long procurement cycles and high capital expense, especially for overnight or ad-hoc batch jobs.  
     
  3. Latency-Sensitive Research Workflows  
    Any delay in inference risked slow or incomplete insights for pharmaceutical partners operating in real time.  

InpharmD needed a way to elastically and intelligently extend GPU capacity to the cloud whenever its on-prem cluster was saturated, while preserving observability and automating placement decisions.

Solution: Avesha EGS GPU Bursting to Nebius


With Avesha EGS, InpharmD created a hybrid GPU workspace spanning its on-prem cluster and two Nebius regions. Each time the platform spins up a new inference endpoint, EGS automatically evaluates capacity and latency: 

• If on-prem has headroom, the endpoint is deployed locally for the lowest possible latency. 

• If on-prem is full, EGS transparently bursts the workload to the optimal Nebius region—no DevOps tickets, no redeploy, no downtime.
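The placement rule above can be sketched in a few lines of Python. This is an illustrative model of the decision EGS automates, not Avesha's actual API: the `Cluster` type, `place_endpoint` function, region names, and latency figures are all assumptions made for the example.

```python
# Hypothetical sketch of burst placement: prefer on-prem when it has GPU
# headroom, otherwise pick the cloud region with capacity and the lowest
# latency. All names here are illustrative, not Avesha EGS identifiers.
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    free_gpus: int
    latency_ms: float  # observed latency from the platform to this cluster

def place_endpoint(gpus_needed: int, on_prem: Cluster,
                   cloud_regions: list[Cluster]) -> Cluster:
    """Return the cluster where a new inference endpoint should land."""
    if on_prem.free_gpus >= gpus_needed:
        return on_prem  # local headroom: deploy on-prem for lowest latency
    candidates = [r for r in cloud_regions if r.free_gpus >= gpus_needed]
    if not candidates:
        raise RuntimeError("no region has enough free GPUs")
    return min(candidates, key=lambda r: r.latency_ms)  # burst to best region

# Example: on-prem is saturated, so the workload bursts to the closer region.
on_prem = Cluster("on-prem", free_gpus=0, latency_ms=1.0)
regions = [Cluster("nebius-eu-north", 8, 24.0),
           Cluster("nebius-eu-west", 4, 31.0)]
print(place_endpoint(2, on_prem, regions).name)  # prints "nebius-eu-north"
```

In the real service this decision is continuous and observable through the EGS dashboard; the sketch only captures the capacity-then-latency ordering described above.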

EGS offers two complementary bursting modes:

| Bursting Mode | How It Works | Outcome for InpharmD |
| --- | --- | --- |
| 1. Customer-Managed Workloads in EGS Workspace | InpharmD deploys its AI services into the EGS-managed Kubernetes workspace. EGS provisions GPU nodes in Nebius as needed and schedules pods to the region with the best capacity-vs-latency profile. | Instant access to Nebius GPUs without managing remote clusters or rewriting manifests. |
| 2. Model-Centric Bursting with Auto-Provisioning | InpharmD specifies a model (e.g., Hugging Face transformer, NVIDIA NIM, proprietary clinical NLP). EGS finds capacity, deploys the model, and scales it automatically. | Rapid R&D iteration and on-demand clinical queries, with no infrastructure management required. |
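The model-centric mode boils down to demand-driven autoscaling: the user names a model, and the platform decides how many GPU-backed replicas to run. The sketch below illustrates that idea only; `ModelSpec`, `required_gpus`, and the model name are invented for this example and are not Avesha's real interface.

```python
# Hypothetical illustration of model-centric bursting: replicas scale with
# queued demand, and GPU needs follow from the replica count. Identifiers
# are invented for illustration, not Avesha EGS API names.
from dataclasses import dataclass

@dataclass
class ModelSpec:
    model_id: str          # e.g. a Hugging Face repo or NIM container name
    gpus_per_replica: int
    min_replicas: int = 1

def required_gpus(spec: ModelSpec, queued_requests: int,
                  requests_per_replica: int) -> int:
    """Scale replicas with demand (ceiling division), never below the minimum."""
    replicas = max(spec.min_replicas,
                   -(-queued_requests // requests_per_replica))  # ceil division
    return replicas * spec.gpus_per_replica

spec = ModelSpec(model_id="clinical-nlp-7b", gpus_per_replica=1)
print(required_gpus(spec, queued_requests=25, requests_per_replica=10))
# 25 queued requests at 10 per replica -> 3 replicas -> 3 GPUs
```

Whatever GPU count this yields is then satisfied by the same placement logic as mode 1: on-prem first, Nebius when on-prem is saturated.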

Quote from InpharmD

“With Avesha EGS, we dynamically optimize every workload—maximizing performance without overpaying for idle resources. We scale seamlessly and keep research costs predictable.”  
— Tulasee Rao Chintha, CTO, InpharmD

Key Benefits for InpharmD

  • Cost Optimization: Bursting to Nebius on demand avoids costly over-provisioning of on-prem GPUs, cutting total compute spend by an estimated 25–40%.  
     
  • High Availability: Multi-region Nebius failover protects service continuity during local hardware failures or maintenance windows.  
     
  • Full-Stack Observability: The EGS dashboard provides real-time views of GPU utilization, cost, and efficiency across on-prem and Nebius estates.  
     
  • Secure & Seamless: Workload placement and data movement occur over encrypted service-to-service channels, with no manual intervention and no exposure of internal networks.

Strategic Impact   

By shifting from a fixed on-prem model to a hybrid GPU environment that “chases capacity,” InpharmD can: 

  • Deliver clinical insights faster during critical decision windows.   
     
  • Innovate rapidly with new models—without waiting for hardware buys.  
     
  • Align compute spend with actual demand, improving margins and budgeting accuracy.

Conclusion

InpharmD’s success with Avesha EGS showcases the power of intelligent GPU bursting between on-prem infrastructure and Nebius’ specialized GPU cloud. With frictionless workload mobility, model-aware deployment, and deep observability, EGS gives InpharmD the agility, reliability, and cost-efficiency required to lead in AI-powered pharmaceutical intelligence.