Organisations are increasingly required to support AI and virtual desktop infrastructure (VDI) workloads as part of their infrastructure strategy. This demand is driven by the growth of AI and machine learning, the continued relevance of VDI, and the need to support graphics-intensive applications such as CAD, rendering, and simulation.
Many of these workloads rely on GPUs and other AI accelerators – such as NPUs and FPGAs – which provide the specialised processing capabilities required for parallel computation, inference, and graphical rendering.
At the same time, service providers are exploring GPU-based offerings as a way to expand their portfolios, introducing new consumption models such as GPU-as-a-Service.
In parallel, many organisations are already undergoing infrastructure modernisation. Changes in virtualisation licensing, cost structures, and vendor strategies are forcing a reassessment of existing platforms. This often leads to broader initiatives aimed at improving flexibility, reducing operational complexity, and avoiding vendor lock-in. As a result, infrastructure teams are not only evaluating how to support new workloads, but also how to redesign their platforms to support them more efficiently.
These two trends are converging. AI and VDI workloads are frequently introduced into environments that are already in transition, and are often treated as separate requirements, each with its own dedicated platform and tooling. While this approach can address immediate needs, it introduces additional layers of complexity and fragmentation over time.
This article examines an alternative approach: treating GPU resources and AI accelerators as part of a unified infrastructure model. Instead of building separate stacks for AI, VDI, and graphics workloads, organisations can integrate these resources directly into their IaaS platform. This enables a consistent operational model across different workload types and deployment environments, including both datacenter and edge scenarios.
The Deeper Problem: Fragmentation at the Resource Layer

The fragmentation of infrastructure is not limited to the use of multiple platforms. It also exists in the way resources are modelled and managed within those platforms.
In many environments, different types of hardware are treated using separate abstractions and operational models. GPUs, for example, are often handled differently from CPU and memory resources, requiring dedicated tooling, specialised configuration, or vendor-specific integrations. The same applies to AI accelerators.
In some architectures, these resources are managed through separate services or control planes, each introducing its own APIs, configuration models, and operational requirements. This creates an additional layer of complexity that goes beyond platform fragmentation.
As a result, infrastructure teams are not only required to manage multiple platforms, but also to operate different resource models for different types of hardware within the same environment.
This leads to a loss of consistency at the infrastructure level. Provisioning, allocation, and lifecycle management are no longer uniform processes. Instead, they vary depending on the type of resource being used, increasing operational overhead and reducing predictability.
A Consistent Resource Model
A more effective approach is to treat all infrastructure resources using a consistent model, regardless of their type.
From this perspective, GPUs and AI accelerators should not be handled as exceptions. They should be integrated into the same resource model used for CPU, memory, and storage.

This requires a consistent approach to device discovery, resource allocation, and lifecycle management across all resource types within the infrastructure.
This approach is implemented in Apache CloudStack by extending its existing IaaS architecture to include these resources as part of the standard framework. Rather than introducing separate services or abstractions, CloudStack integrates them directly into the core platform.
GPUs and AI accelerators are discovered at the Host level and exposed through standard constructs such as Compute Offerings. This allows workloads that depend on these resources to be provisioned using the same mechanisms as traditional virtual machine instances.
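The idea of exposing GPUs through the same constructs as CPU and memory can be illustrated with a small, self-contained sketch. This is not CloudStack code and does not use the CloudStack API: the `Host`, `ComputeOffering`, and `place` names, the resource keys, and the first-fit placement rule are all assumptions made for illustration. The point is only that once a GPU is one more entry in a host's capacity and in an offering's requirements, the placement logic needs no GPU-specific branch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical model for illustration only; not CloudStack's API or schema.


@dataclass
class ComputeOffering:
    name: str
    # Every resource type, GPU included, is just another key.
    requires: Dict[str, int]


@dataclass
class Host:
    name: str
    capacity: Dict[str, int]  # what was discovered on the host
    allocated: Dict[str, int] = field(default_factory=dict)

    def free(self, resource: str) -> int:
        # A resource the host does not have counts as zero capacity.
        return self.capacity.get(resource, 0) - self.allocated.get(resource, 0)

    def fits(self, offering: ComputeOffering) -> bool:
        return all(self.free(r) >= n for r, n in offering.requires.items())

    def allocate(self, offering: ComputeOffering) -> None:
        for r, n in offering.requires.items():
            self.allocated[r] = self.allocated.get(r, 0) + n


def place(hosts: List[Host], offering: ComputeOffering) -> Optional[Host]:
    """First-fit placement: one code path for CPU-only and GPU offerings."""
    for host in hosts:
        if host.fits(offering):
            host.allocate(offering)
            return host
    return None  # no host has enough free capacity


if __name__ == "__main__":
    hosts = [
        Host("host-a", {"cpu": 16, "memory_gb": 64}),
        Host("host-b", {"cpu": 32, "memory_gb": 128, "gpu": 2}),
    ]
    standard = ComputeOffering("standard", {"cpu": 4, "memory_gb": 8})
    gpu_small = ComputeOffering("gpu-small", {"cpu": 8, "memory_gb": 32, "gpu": 1})

    for offering in (standard, gpu_small):
        chosen = place(hosts, offering)
        print(offering.name, "->", chosen.name if chosen else "no capacity")
```

In this sketch a host without GPUs simply reports zero free GPU capacity, so a GPU offering skips it without any special handling.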
At the service level, this model enables GPU resources to be defined as part of standard infrastructure offerings. Service providers and enterprise operators can create Compute Offerings that include specific GPU configurations, making these resources consumable in a controlled and repeatable way.
This is particularly relevant in multi-tenant environments, where resource consumption needs to be standardised, metered, and exposed through clearly defined service tiers. GPU-enabled offerings can be presented alongside traditional compute offerings, enabling consistent provisioning across different workload types.
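As a rough illustration of what standardised and metered consumption might look like, the sketch below tracks per-tenant quotas and usage with GPUs as one more resource dimension. It is an assumption-laden toy, not CloudStack's quota or usage implementation: the `TenantLedger` class, its method names, the resource keys, and the "resource-hours" unit are all hypothetical.

```python
from collections import defaultdict
from typing import Dict

# Hypothetical ledger for illustration only; not CloudStack's quota or usage service.


class TenantLedger:
    """Per-tenant quotas and usage, with one code path for all resource types."""

    def __init__(self, quotas: Dict[str, Dict[str, int]]):
        self.quotas = quotas
        # tenant -> resource -> amount currently allocated
        self.in_use = defaultdict(lambda: defaultdict(int))
        # tenant -> resource -> accumulated resource-hours
        self.usage_hours = defaultdict(lambda: defaultdict(float))

    def can_deploy(self, tenant: str, requires: Dict[str, int]) -> bool:
        limits = self.quotas.get(tenant, {})
        # A resource with no quota entry is treated as a limit of zero.
        return all(
            self.in_use[tenant][r] + n <= limits.get(r, 0)
            for r, n in requires.items()
        )

    def deploy(self, tenant: str, requires: Dict[str, int]) -> bool:
        if not self.can_deploy(tenant, requires):
            return False
        for r, n in requires.items():
            self.in_use[tenant][r] += n
        return True

    def record(self, tenant: str, requires: Dict[str, int], hours: float) -> None:
        # Metering is amount x time for every resource, GPU included.
        for r, n in requires.items():
            self.usage_hours[tenant][r] += n * hours


if __name__ == "__main__":
    ledger = TenantLedger({"tenant-a": {"cpu": 16, "memory_gb": 64, "gpu": 1}})
    gpu_vm = {"cpu": 8, "memory_gb": 32, "gpu": 1}

    print(ledger.deploy("tenant-a", gpu_vm))  # within quota
    print(ledger.deploy("tenant-a", gpu_vm))  # second GPU exceeds the quota
    ledger.record("tenant-a", gpu_vm, hours=2.5)
    print(dict(ledger.usage_hours["tenant-a"]))
```

Because the quota check and the usage record iterate over whatever keys an offering declares, adding a new accelerator type to a service tier would not require a new metering path in this model.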
As a result, there is no need for separate control planes, dedicated orchestration layers, or different operational models depending on the type of resource being used.
This approach enables a consistent infrastructure model where CPU, memory, storage, and GPU resources are managed in a unified way, and where resource consumption is exposed through standardised service offerings. Workloads that require these capabilities can be deployed, scaled, and managed using the same workflows already established for other workloads.
This reduces operational complexity and improves consistency across environments, including both datacenter and edge deployments.
Business Impact: From Infrastructure Capability to Service Delivery
A consistent infrastructure model is not only a technical improvement. It directly impacts how services are designed, delivered, and consumed.
By integrating GPU resources and AI accelerators into the same model used for compute, organisations can move from managing hardware as a special case to exposing it as part of their standard service portfolio.
For CSPs and MSPs: Service Expansion and Monetisation
For service providers, GPU resources represent both a technical capability and a commercial opportunity.
When these resources are integrated into the same infrastructure and service model, providers can extend their portfolio with GPU-enabled virtual machine instances, offer GPU-as-a-Service using existing provisioning and billing models, standardise service definitions through Compute Offerings, and apply quotas, metering, and multi-tenant isolation consistently.
This enables providers to introduce new services without deploying separate platforms or creating parallel operational models.
In practice, this reduces time to market and simplifies ongoing operations. Instead of building and maintaining dedicated platforms, providers can extend their existing cloud infrastructure to support these workloads.
For Enterprises: Standardisation and Operational Efficiency
For enterprise environments, the impact is primarily operational.
Organisations often introduce GPU resources in an ad hoc manner, resulting in isolated environments and inconsistent management practices. Over time, this leads to fragmentation and increased complexity.
By adopting a unified infrastructure model, enterprises can standardise how GPU resources are provisioned and consumed, integrate AI, VDI, and graphics workloads into existing platforms, apply consistent governance and access control, and reduce dependency on specialised or isolated environments.
This improves predictability and simplifies operations, particularly in environments with multiple teams and workload types.
A Shared Outcome: Reduced Complexity and Increased Flexibility
Despite different objectives, both service providers and enterprises benefit from the same underlying outcome: a single platform for multiple workload types, a consistent operational model, reduced infrastructure fragmentation, and greater flexibility in how resources are allocated and consumed.
This allows organisations to evolve their infrastructure without introducing additional layers of complexity.
Takeaway
AI and VDI workloads are often introduced as separate infrastructure challenges, leading to fragmented architectures and increased operational complexity. In practice, these workloads share a common requirement: access to GPU resources and AI accelerators within a consistent and controlled environment.
By integrating these resources into the core infrastructure model, organisations can avoid the need for multiple platforms and specialised operational layers. This enables a unified approach where different workload types are provisioned, managed, and consumed using the same mechanisms.
The result is a more consistent and scalable infrastructure, capable of supporting both current and emerging workload requirements without increasing complexity.
The post AI, VDI and GPUaaS: Rethinking Infrastructure with Apache CloudStack appeared first on ShapeBlue.