Announcing EGS 1.15.0 Release

Uma Tammanagoudar

Staff Technical Writer

We are thrilled to announce the official release of EGS version 1.15.0, available as of August 22, 2025.

What is EGS? 

Avesha Enterprise GPU Service (EGS) is a comprehensive solution for managing and optimizing GPU resources in data centers and cloud environments. It provides advanced dynamic allocation, scheduling, monitoring, and management capabilities to ensure efficient utilization of GPU resources for various workloads, including AI/ML training, inference, and high-performance computing. 

Avesha EGS is designed to address the challenges of managing GPU-intensive workloads, offering a solution that enhances efficiency, scalability, and cost-effectiveness in AI operations. 

Key Features of Avesha EGS

  • GPU Resource Management     
    EGS provides a centralized platform to manage GPU resources across multiple Kubernetes clusters. It enables unified dynamic allocation, quota enforcement, and monitoring of GPU usage.
  • User-Friendly Interface     
    EGS offers an intuitive user interface that simplifies GPU resource management, making it easier for users to allocate, provision, monitor, and optimize their GPU workloads.
  • GPU Provisioning and Scheduling     
    Users can provision GPU resources for workloads with fair scheduling and quota enforcement across Workspaces. Node health is pre-checked before allocation to reduce scheduling failures.
  • Multi-Tenancy and Workspaces     
    Workspaces enable logical separation of teams, projects, or applications. Access and quotas are enforced through workspace policies. Role-based access control (RBAC) and service accounts are supported for secure access to clusters. 
  • GPU Visibility and Monitoring     
    A real-time dashboard displays GPU utilization, health, and status. EGS integrates with Prometheus for multi-cluster metrics collection and provides node-level and workspace-level usage views for both administrators and users.
  • Seamless Integration     
    EGS integrates seamlessly with existing Kubernetes environments and supports various GPU types, making it a versatile solution for diverse AI workloads.
  • Cost Management and Optimization     
    EGS offers detailed cost analysis and optimization features. Users can monitor GPU usage and associated costs for Workspaces and workloads, which helps reduce overall expenditure on GPU resources.
  • Security and Access Control     
    Security is enforced through RBAC for users, groups, and service accounts. Secure kubeconfig downloads are supported, along with audit logging for GPU usage and access activities.

Key Highlights in EGS 1.15.0

License Management

EGS introduces a licensing mechanism to manage GPU resources more effectively. GPU allocation is enforced based on the number of GPUs defined in the license file. License counts apply to all onboarded GPU resources across Kubernetes clusters.
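
As a rough illustration of count-based enforcement, the Python sketch below checks a request against a licensed GPU count. It is not EGS code. The function names, the per-cluster dictionary, and the assumption that the cap is compared against GPUs already allocated across all clusters are hypothetical and serve only to make the idea concrete.

```python
def licensed_headroom(licensed_gpus: int, allocated_by_cluster: dict[str, int]) -> int:
    """GPUs still allocatable under the license, counted across all clusters."""
    total_allocated = sum(allocated_by_cluster.values())
    # Never report negative headroom if usage already exceeds the license.
    return max(licensed_gpus - total_allocated, 0)


def can_allocate(licensed_gpus: int, allocated_by_cluster: dict[str, int], requested: int) -> bool:
    """True if granting `requested` GPUs keeps total allocation within the license."""
    if requested <= 0:
        raise ValueError("requested must be a positive GPU count")
    return requested <= licensed_headroom(licensed_gpus, allocated_by_cluster)


# Hypothetical example: a 16-GPU license with 14 GPUs already allocated.
usage = {"cluster-a": 8, "cluster-b": 6}
print(can_allocate(16, usage, 2))  # fits within the remaining headroom
print(can_allocate(16, usage, 3))  # would exceed the licensed count
```

The key point of the sketch is that the count is summed over every cluster before the comparison, so a single license governs the whole fleet rather than each cluster separately.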

EGS supports both Trial and Enterprise licenses.

To request a license:

  • Contact the Avesha team with your organization’s details and usage requirements.
  • For a Trial license, submit a request through the EGS Registration Page.

For detailed steps, see License Management.

Wait Time Improvements

The GPU Provision Request (GPR) workflow has been optimized to reduce scheduling delays.

  • Requests are automatically matched to the most suitable GPUs and clusters.
  • Queuing time is reduced, resulting in faster start times for workloads.
  • Multi-tenant environments benefit from improved throughput and efficiency.
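
This post does not describe how requests are matched to GPUs and clusters. As one hedged illustration of the general idea, the Python sketch below applies a simple best-fit rule: among clusters that offer the requested GPU type and have enough free GPUs, pick the one with the least spare capacity. The `ClusterCapacity` type, the `pick_cluster` function, and the best-fit rule itself are assumptions for illustration, not the EGS algorithm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClusterCapacity:
    """Hypothetical snapshot of one cluster's free GPUs of a single type."""
    name: str
    gpu_type: str
    free_gpus: int


def pick_cluster(
    clusters: list[ClusterCapacity], gpu_type: str, gpus_needed: int
) -> Optional[ClusterCapacity]:
    """Best-fit choice: the smallest cluster capacity that still satisfies the request."""
    candidates = [
        c for c in clusters
        if c.gpu_type == gpu_type and c.free_gpus >= gpus_needed
    ]
    if not candidates:
        return None  # nothing fits right now; the request would have to wait
    # Tie-break on name so the choice is deterministic.
    return min(candidates, key=lambda c: (c.free_gpus, c.name))


clusters = [
    ClusterCapacity("east", "A100", 8),
    ClusterCapacity("west", "A100", 4),
    ClusterCapacity("lab", "T4", 16),
]
choice = pick_cluster(clusters, "A100", 3)
print(choice.name if choice else "queued")
```

Best-fit keeps larger blocks of free GPUs intact for bigger requests, which is one common way to reduce queuing in shared environments. Other policies, such as spreading load or preferring the healthiest nodes, are equally plausible.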

Visibility and Monitoring 

EGS introduces extended visibility to enhance transparency and monitoring across the platform, supporting proactive monitoring and capacity planning.

 

Inventory Summary

The Inventory page provides a consolidated, cluster-wide view of GPU resources. Metrics include:

  • Total number of GPUs
  • Allocated GPUs vs. Idle GPUs
  • Available GPUs
  • Total GPU nodes
  • Partially allocated nodes
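
To make these metrics concrete, the Python sketch below derives most of them from per-node counts. The `Node` type and `summarize_inventory` function are hypothetical, and "available" is assumed here to mean total minus allocated. The idle count is left out because this post does not define how idle GPUs differ from available ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical per-node GPU counts."""
    name: str
    total_gpus: int
    allocated_gpus: int


def summarize_inventory(nodes: list[Node]) -> dict[str, int]:
    """Aggregate cluster-wide GPU counts from per-node data."""
    gpu_nodes = [n for n in nodes if n.total_gpus > 0]
    total = sum(n.total_gpus for n in gpu_nodes)
    allocated = sum(n.allocated_gpus for n in gpu_nodes)
    # A node is partially allocated when some, but not all, of its GPUs are in use.
    partial = sum(1 for n in gpu_nodes if 0 < n.allocated_gpus < n.total_gpus)
    return {
        "total_gpus": total,
        "allocated_gpus": allocated,
        "available_gpus": total - allocated,  # assumption: available = total - allocated
        "gpu_nodes": len(gpu_nodes),
        "partially_allocated_nodes": partial,
    }


nodes = [
    Node("gpu-node-1", 8, 8),
    Node("gpu-node-2", 8, 3),
    Node("gpu-node-3", 4, 0),
    Node("cpu-node-1", 0, 0),
]
print(summarize_inventory(nodes))
```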


Dashboard Enhancements

The dashboard has been expanded with richer GPU metrics and workload analytics. You can drill into GPU usage patterns, performance trends, and workload distribution, enabling data-driven decisions for scaling and optimization. 


SDK and API Enhancements

The EGS Core API and SDK have been expanded to support:

  • Programmatic GPU provisioning
  • Inventory access
  • Advanced monitoring

This allows for deeper integrations and automation-driven use cases.