We are thrilled to announce the official release of EGS version 1.15.0, available as of Aug 22, 2025.
What is EGS?
Avesha Enterprise GPU Service (EGS) is a comprehensive solution for managing and optimizing GPU resources in data centers and cloud environments. It provides advanced dynamic allocation, scheduling, monitoring, and management capabilities to ensure efficient utilization of GPU resources for various workloads, including AI/ML training, inference, and high-performance computing.
Avesha EGS is designed to address the challenges of managing GPU-intensive workloads, offering a solution that enhances efficiency, scalability, and cost-effectiveness in AI operations.
Key Features of Avesha EGS
- GPU Resource Management
EGS provides a centralized platform to manage GPU resources across multiple Kubernetes clusters. It enables unified dynamic allocation, quota enforcement, and monitoring of GPU usage.
- User-Friendly Interface
EGS offers an intuitive user interface. This simplifies the management of GPU resources, making it easier for users to allocate, provision, monitor, and optimize their GPU workloads.
- GPU Provisioning and Scheduling
A user can provision GPU resources to workloads with fair scheduling and quota enforcement across Workspaces. Pre-checks of node health are performed before allocation to reduce scheduling failures.
- Multi-Tenancy and Workspaces
Workspaces enable logical separation of teams, projects, or applications. Access and quotas are enforced through workspace policies. Role-based access control (RBAC) and service accounts are supported for secure access to clusters.
- GPU Visibility and Monitoring
A real-time dashboard displays GPU utilization, health, and status. EGS integrates with Prometheus for multi-cluster metrics collection and provides node-level and workspace-level usage views for both administrators and users.
- Seamless Integration
EGS integrates seamlessly with existing Kubernetes environments and supports various GPU types, making it a versatile solution for diverse AI workloads.
- Cost Management and Optimization
EGS offers detailed cost analysis and optimization features. It allows users to monitor GPU usage and associated costs for Workspaces and workloads. This helps in reducing overall expenditure on GPU resources.
- Security and Access Control
Security is enforced through RBAC for users, groups, and service accounts. Secure kubeconfig downloads are supported, along with audit logging for GPU usage and access activities.
Key Highlights in EGS 1.15.0
License Management
EGS introduces a licensing mechanism to manage GPU resources more effectively. GPU allocation is enforced based on the number of GPUs defined in the license file. License counts apply to all onboarded GPU resources across Kubernetes clusters.
EGS supports both Trial and Enterprise licenses.
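As a rough illustration of license-based enforcement, the sketch below checks a request against a license-wide GPU count summed over all clusters. It is not the EGS implementation: the `License` class, the `can_allocate` function, and the assumption that the license count caps the total number of GPUs allocated across clusters are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class License:
    """Hypothetical stand-in for the values read from a license file."""
    kind: str      # "trial" or "enterprise"
    max_gpus: int  # number of GPUs defined in the license


def can_allocate(lic: License, allocated_by_cluster: dict, requested: int) -> bool:
    """Return True if the request fits within the license-wide GPU count.

    Assumption: the count applies to the sum over every onboarded cluster,
    not to each cluster separately.
    """
    in_use = sum(allocated_by_cluster.values())
    return requested > 0 and in_use + requested <= lic.max_gpus


trial = License(kind="trial", max_gpus=8)
usage = {"cluster-a": 4, "cluster-b": 2}
print(can_allocate(trial, usage, 2))  # request fits the remaining capacity
print(can_allocate(trial, usage, 3))  # request would exceed the license
```

Summing across clusters before comparing is the point of the sketch: a per-cluster check would let the total exceed the licensed count.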
To request a license:
- Contact the Avesha team with your organization’s details and usage requirements.
- For a Trial license, submit a request through the EGS Registration Page.
For detailed steps, see License Management.
Wait Time Improvements
The GPU provision request (GPR) workflow has been optimized to reduce scheduling delays.
- Requests are automatically matched to the most suitable GPUs and clusters.
- Queuing time is reduced, resulting in faster start times for workloads.
- Multi-tenant environments benefit from improved throughput and efficiency.
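The release notes do not say how EGS ranks candidate GPUs and clusters. Purely to illustrate what matching a request to a suitable cluster can look like, here is a minimal best-fit sketch: among clusters with the requested GPU type and enough free GPUs, it picks the one with the least spare capacity. The `pick_cluster` function and the dictionary fields are hypothetical.

```python
from typing import Optional


def pick_cluster(request_gpus: int, gpu_type: str, clusters: list) -> Optional[str]:
    """Best-fit selection (an assumed heuristic, not the EGS algorithm)."""
    candidates = [
        c for c in clusters
        if c["gpu_type"] == gpu_type and c["free_gpus"] >= request_gpus
    ]
    if not candidates:
        return None  # nothing fits right now; the request would stay queued
    # Choose the tightest fit so larger free pools remain for bigger requests.
    return min(candidates, key=lambda c: c["free_gpus"])["name"]


clusters = [
    {"name": "a", "gpu_type": "A100", "free_gpus": 8},
    {"name": "b", "gpu_type": "A100", "free_gpus": 2},
    {"name": "c", "gpu_type": "H100", "free_gpus": 4},
]
print(pick_cluster(2, "A100", clusters))
```

Best-fit is only one option; a real scheduler might also weigh node health, priority, or workspace quotas.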
Visibility and Monitoring
EGS introduces extended visibility across the platform, improving transparency and supporting proactive monitoring and capacity planning.
Inventory Summary
The Inventory page displays the following:
- Consolidated cluster-wide view of GPU resources.
- Metrics include:
  - Total number of GPUs
  - Allocated GPUs vs. Idle GPUs
  - Available GPUs
  - Total GPU nodes
  - Partially allocated nodes
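To make the metrics above concrete, the following sketch derives most of them from per-node GPU counts. The `Node` record and `inventory_summary` function are hypothetical, and idle GPUs are assumed to be total minus allocated. The source does not define how "Available GPUs" differs from idle GPUs, so that metric is left out.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical per-node record; not an EGS data structure."""
    name: str
    total_gpus: int
    allocated_gpus: int


def inventory_summary(nodes: list) -> dict:
    """Aggregate cluster-wide GPU metrics from per-node counts."""
    gpu_nodes = [n for n in nodes if n.total_gpus > 0]
    total = sum(n.total_gpus for n in gpu_nodes)
    allocated = sum(n.allocated_gpus for n in gpu_nodes)
    # A node is partially allocated when some, but not all, of its GPUs are in use.
    partial = sum(1 for n in gpu_nodes if 0 < n.allocated_gpus < n.total_gpus)
    return {
        "total_gpus": total,
        "allocated_gpus": allocated,
        "idle_gpus": total - allocated,
        "total_gpu_nodes": len(gpu_nodes),
        "partially_allocated_nodes": partial,
    }


nodes = [
    Node("n1", 8, 8),
    Node("n2", 8, 3),
    Node("n3", 4, 0),
    Node("cpu-only", 0, 0),
]
print(inventory_summary(nodes))
```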
For more information, see:
Dashboard Enhancements
The dashboard has been expanded with richer GPU metrics and workload analytics. You can drill into GPU usage patterns, performance trends, and workload distribution, enabling data-driven decisions for scaling and optimization.
For more information, see:

SDK and API Enhancements
The EGS Core API and SDK have been expanded to support:
- Programmatic GPU provisioning
- Inventory access
- Advanced monitoring
This allows for deeper integrations and automation-driven use cases.