At KubeCon + CloudNativeCon Europe 2025, NVIDIA introduced the open-source KAI Scheduler, bringing fractional GPU support to Kubernetes. In this blog, Exostellar CTO Zhiming Shen puts KAI to the test by running real-world vLLM workloads on a single NVIDIA T4 GPU—and comparing the results with Exostellar's Software-Defined GPU (SDG) platform.
The experiment reveals a key limitation: while KAI allows scheduling of fractional GPUs, it lacks memory isolation and usage enforcement, causing one model to crash due to resource contention. SDG, on the other hand, provides true GPU virtualization, delivering isolation, observability, and stability without tuning or code changes.
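To make the setup concrete, here is a minimal sketch of how a pod might request half a GPU through KAI Scheduler, written with the Python kubernetes client. The `gpu-fraction` annotation, `kai-scheduler` scheduler name, and `runai/queue` label follow the KAI Scheduler documentation as published, but the exact keys can vary by version, and the queue, image tag, and model are illustrative placeholders rather than the manifests used in the experiment.

```python
# Minimal sketch: a vLLM pod asking KAI Scheduler for half of one GPU.
# Annotation/label keys are taken from the KAI Scheduler docs and may
# differ by version; queue, image, and model are placeholders.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when run in-cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="vllm-half-gpu",
        labels={"runai/queue": "test"},        # assumed queue label key
        annotations={"gpu-fraction": "0.5"},   # request half of a GPU
    ),
    spec=client.V1PodSpec(
        scheduler_name="kai-scheduler",        # hand placement to KAI
        containers=[
            client.V1Container(
                name="vllm",
                image="vllm/vllm-openai:latest",        # illustrative image
                args=["--model", "facebook/opt-125m"],  # placeholder model
            )
        ],
        restart_policy="Never",
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

The fraction here is a scheduling hint: it tells KAI how to pack pods onto the GPU, but nothing at runtime enforces the memory share, which is the contention behavior the experiment surfaces.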
The blog offers a technical breakdown of KAI vs. SDG and explores why GPU sharing isn’t just a scheduling problem—it’s also an isolation and reliability problem.