
The NANCY project has received funding from the Smart Networks and Services Joint Undertaking (SNS JU) under the European Union’s Horizon Europe research and innovation programme under Grant Agreement No 101096456.

Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the SNS JU. Neither the European Union nor the granting authority can be held responsible for them.


Temporal Guarantees in Virtualized Edge Applications in NANCY


Authors: Daniel Casini

Organization: Scuola Superiore Sant’Anna

As edge computing becomes the backbone of next-generation AI and robotics platforms, the ability to deploy containerized workloads with predictable timing and bounded CPU latency is emerging as a critical requirement. While container orchestration frameworks like Kubernetes offer unparalleled flexibility and scalability, they fall short when it comes to providing real-time guarantees.

The NANCY project addresses this challenge head-on by integrating the real-time capabilities of Linux in virtualized edge environments. This post outlines a comprehensive full-stack approach, encompassing real-time scheduling, runtime monitoring, and adaptive self-configuration.


Bringing Real-Time Awareness to Kubernetes

Kubernetes was originally designed for throughput-oriented applications—think web services and analytics pipelines. However, at the edge, many workloads are latency-sensitive and time-critical. These require strong temporal isolation to avoid delays and deadline or SLA violations.

To bridge this gap, NANCY has introduced KubeDeadline: a modular extension to Kubernetes that enables the use of Linux’s SCHED_DEADLINE scheduler for containers.

SCHED_DEADLINE is based on the Constant Bandwidth Server (CBS) model and provides per-task CPU bandwidth reservations. Each reservation is defined by:

  • A runtime (budget): the amount of CPU time guaranteed,
  • A period: the frequency at which the budget is replenished,
  • A number of cores: the CPUs on which the reservation is placed.

These parameters collectively ensure guaranteed CPU bandwidth and bounded CPU latency, as long as the system is not overloaded and tasks adhere to their assigned execution constraints. In other words, SCHED_DEADLINE ensures bounded CPU latency and temporal isolation—exactly what edge workloads need.
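As an illustration of how these parameters combine, the short Python sketch below models a reservation and the CPU bandwidth it guarantees. The names are illustrative only; KubeDeadline configures the kernel directly rather than through such a class.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Reservation:
    """CBS-style CPU reservation (illustrative names, not a kernel API)."""
    runtime_us: int   # budget: CPU time guaranteed in each period
    period_us: int    # how often the budget is replenished

    @property
    def bandwidth(self) -> float:
        """Fraction of one CPU guaranteed to the task."""
        return self.runtime_us / self.period_us

# A task that needs 2 ms of CPU every 10 ms is guaranteed 20% of a core.
res = Reservation(runtime_us=2_000, period_us=10_000)
print(f"guaranteed bandwidth: {res.bandwidth:.0%}")  # → guaranteed bandwidth: 20%
```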


Modular Integration with Kubernetes

KubeDeadline addresses this by building on top of Kubernetes’ Dynamic Resource Allocation (DRA) framework, a feature recently introduced by the community. Rather than patching the core, it introduces a set of modular components:

  • An RT-DRA driver that performs real-time aware admission control,
  • A custom controller that translates ResourceClaim and ResourceClass objects into real-time reservations,
  • A lightweight extension to the containerd runtime, enabling it to pass real-time parameters (e.g., cpu.rt_runtime_us, cpu.rt_period_us) to the Linux kernel via CDI (Container Device Interface).

This design makes KubeDeadline maintainable, aligning with Kubernetes’ evolution without requiring intrusive changes.
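The admission-control step performed by the RT-DRA driver can be illustrated with a simple utilization-based test: accept a new reservation only if the total reserved bandwidth stays within the available cores. This is purely a sketch; the actual driver may use a more sophisticated schedulability test.

```python
def admit(existing, new, num_cores, cap=0.95):
    """Utilization-based admission test (illustrative, not the RT-DRA code).
    Each reservation is a (runtime_us, period_us) pair; `cap` leaves some
    headroom for non-real-time work."""
    total = sum(r / p for r, p in existing) + new[0] / new[1]
    return total <= num_cores * cap

existing = [(2_000, 10_000), (5_000, 20_000)]        # 0.20 + 0.25 = 0.45
print(admit(existing, (3_000, 10_000), num_cores=1))  # 0.45 + 0.30 = 0.75 → True
print(admit(existing, (6_000, 10_000), num_cores=1))  # 0.45 + 0.60 = 1.05 → False
```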


Supporting Modern Distributed Frameworks: Ray and AI at the Edge

To demonstrate practical adoption, KubeDeadline has also been integrated with KubeRay—a Kubernetes-native orchestrator for Ray, the popular framework for distributed AI workloads, used by organizations like OpenAI (ChatGPT), Netflix, ByteDance (TikTok), and Uber.

The integration allows Ray applications to request real-time CPU resources for latency-critical workers, such as those handling live video or audio streams or running deep-network inference. Only lightweight changes to Ray’s resource specification logic were required, enabling it to propagate ResourceClaim information when launching new containers.

This makes real-time scheduling compatible with modern AI execution models, without breaking the properties that make Ray so attractive.

This marks a turning point: containerized AI workloads can now be deployed with temporal guarantees, even in dynamic and decentralized edge environments.


Runtime Monitoring for Temporal Health

Defining real-time reservations is not enough. In dynamic systems, interference, load spikes, and misconfiguration can cause performance degradation.

To deal with this, NANCY introduces runtime monitoring mechanisms that observe the actual execution of real-time containers, tracking budget overruns and underruns so that the budget parameter can be adapted as needed in response to workload fluctuations.

Monitoring is implemented as a low-level agent that interfaces with the kernel’s scheduling statistics, exposing them to userspace or automated control loops. This makes temporal behavior observable, enabling operators to detect and respond to anomalies before they cause system-level failures.
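The overrun/underrun classification at the heart of such a control loop can be sketched as follows. In the real agent the figures would come from kernel scheduling statistics; here they are plain numbers for illustration, and the slack threshold is an assumption.

```python
def classify(observed_us, budget_us, slack=0.1):
    """Classify one period's CPU consumption against the reserved budget
    (illustrative logic; thresholds and policy are assumptions)."""
    if observed_us > budget_us:
        return "overrun"    # budget exhausted: consider increasing it
    if observed_us < budget_us * (1 - slack):
        return "underrun"   # budget mostly idle: consider shrinking it
    return "ok"

print(classify(2_500, 2_000))  # → overrun
print(classify(1_000, 2_000))  # → underrun
print(classify(1_950, 2_000))  # → ok
```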


Adaptive Period Estimation for Dynamic Workloads

In edge deployments, many workloads are:

  • Dynamically deployed or migrated,
  • Unknown a priori,
  • Composed of black-box AI pipelines or third-party components.

This makes manual configuration of real-time parameters impractical. To address this, NANCY introduces automated period estimation, based on tracing of kernel events and the Fourier transform.

The system samples execution times and inter-arrival patterns to compute a safe and efficient period for each workload.
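A minimal stand-in for this estimator, binning traced event timestamps into an activity signal and picking the dominant Fourier frequency, might look like the sketch below. It uses a naive DFT over synthetic data; the actual NANCY estimator works on real kernel traces and is more refined.

```python
import cmath

def estimate_period_ms(arrivals_ms, horizon_ms, bin_ms=1):
    """Estimate the dominant activation period from event timestamps via a
    naive discrete Fourier transform (illustrative only)."""
    n = horizon_ms // bin_ms
    signal = [0.0] * n
    for t in arrivals_ms:
        signal[int(t // bin_ms) % n] = 1.0
    # Scan frequencies k = 1 .. n//2 - 1 (skipping the DC component) and
    # keep the lowest frequency with the largest magnitude.
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2):
        coeff = sum(signal[j] * cmath.exp(-2j * cmath.pi * k * j / n)
                    for j in range(n))
        if abs(coeff) > best_mag + 1e-6:
            best_k, best_mag = k, abs(coeff)
    return n * bin_ms / best_k  # dominant period in milliseconds

# Synthetic task activating every 50 ms over a 1-second trace.
arrivals = [i * 50 for i in range(20)]
print(estimate_period_ms(arrivals, horizon_ms=1000))  # → 50.0
```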

This closes the loop: the orchestrator becomes not only real-time aware but also self-configuring, enabling autonomous optimization of temporal resources.


Real-Time Edge AI, Ready for Deployment

By combining:

  • Kubernetes-native real-time scheduling (KubeDeadline),
  • Transparent runtime monitoring, and
  • Online parameter tuning through period estimation,

NANCY delivers a complete stack for deploying real-time containerized applications at the edge.

This work enables safe co-location of latency-critical workloads, supports modern distributed AI frameworks, and provides adaptive orchestration strategies that respond to the real-world behavior of applications.

As edge platforms grow in complexity and importance, such capabilities will be essential to ensure predictability, safety, and trust in next-generation networks.