The NANCY project has received funding from the Smart Networks and Services Joint Undertaking (SNS JU) under the European Union’s Horizon Europe research and innovation programme under Grant Agreement No 101096456.

Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the SNS JU. Neither the European Union nor the granting authority can be held responsible for them.

Building Self-Healing Networks Beyond 5G

Authors: Simone Gentile, Andrea Wrona, and Emanuele De Santis

Organization: Consortium for the Research in Automation and Telecommunication (CRAT)

Introduction: Towards Self-Healing

Imagine a world where our digital networks never fail, constantly adapt, and even repair themselves. This is one of the goals of the NANCY project for the “Beyond 5G” (B5G) era. As our digital lives become increasingly interconnected, from smart cities to remote surgery, networks grow in complexity, which makes them more vulnerable to failures and disruptions. Traditional human-led troubleshooting can no longer keep up: it is too slow and prone to error.

NANCY is building networks capable of detecting, fixing, and preventing issues in real time. Shifting from reactive human intervention to proactive, autonomous intelligence promises substantial reductions in operational costs for network operators and significantly improved reliability for end-users.


The Problem: When Networks Break Down

Our digital infrastructure is growing denser and more complex, integrating countless IoT devices, advanced edge computing, and software-defined components. While this complexity enables incredible new services, it also makes networks highly susceptible to failures, external attacks, and unexpected issues.

Traditional fault-handling methods, which are largely manual and centralized, are overwhelmed for several reasons:

  • Centralized Bottlenecks: Relying on a central point to collect and process all network data is slow, creates a single point of failure, and cannot handle the immense data volumes of B5G networks.
  • Privacy Concerns: Centralized data aggregation often means sharing sensitive information across network domains, raising significant privacy and compliance issues under strict regulations.
  • Lack of Adaptability: Existing models struggle to adapt to changing network conditions and often require large, representative datasets, which are hard to come by for rare anomalies.
  • Vulnerability to Attacks: A single, centralized system is an attractive target for cyberattacks, risking widespread network disruption if compromised.

NANCY’s Breakthrough: Intelligence at the Edge

NANCY’s innovative solution is to decentralize network intelligence, pushing it closer to where data is generated: the “edge” of the network. This approach enhances robustness, scalability, and privacy by allowing individual network components to act as intelligent, collaborative agents.

The project uses advanced AI and Machine Learning to make networks smarter and more efficient. Federated Learning enables operators to train shared models without exposing sensitive data. Deep Reinforcement Learning allows networks to learn and adapt through trial and error. Game Theory ensures traffic naturally balances itself, while heuristics provide fast, practical solutions for specific challenges.
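To make the Federated Learning idea concrete, here is a minimal sketch of federated averaging, the textbook aggregation scheme. It is an illustration only, not NANCY's implementation: the tiny linear model, the function names (`local_update`, `federated_average`, `train`), and the learning rate are all assumptions chosen for readability. The point it shows is that each operator trains on its own data and only the model weights travel to the aggregator.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one client


def local_update(weights: List[float], data: List[Sample], lr: float = 0.1) -> List[float]:
    """One pass of gradient descent for y = w*x + b on a client's private data."""
    w, b = weights
    for x, y in data:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return [w, b]


def federated_average(client_weights: List[List[float]], client_sizes: List[int]) -> List[float]:
    """Average client models, weighting each by its number of local samples."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


def train(clients: List[List[Sample]], rounds: int = 300) -> List[float]:
    """Run federated rounds; raw samples never leave the client lists."""
    global_w = [0.0, 0.0]
    for _ in range(rounds):
        # Each client starts from the shared model and trains locally.
        updates = [local_update(global_w, data) for data in clients]
        # Only the resulting weights are shared and averaged.
        global_w = federated_average(updates, [len(d) for d in clients])
    return global_w


if __name__ == "__main__":
    # Two hypothetical operators whose data both follow y = 2x + 1.
    clients = [[(0.0, 1.0), (1.0, 3.0)], [(0.5, 2.0), (2.0, 5.0)]]
    print(train(clients))  # approaches [2.0, 1.0]
```

In a real deployment the updates would typically also be protected in transit and possibly aggregated securely; the sketch leaves that out.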


NANCY’s Self-Healing: Networks That Fix Themselves

NANCY has developed a comprehensive suite of intelligent algorithms for network self-healing and resilience.

Privacy-Preserving Anomaly Detection

Detecting anomalies quickly is vital, but network data is often sensitive. NANCY’s solutions for autonomous anomaly detection in B5G networks include:

  • AdaLightLog: Applies Federated Learning to detect software irregularities in application logs on edge servers. It shares only aggregated insights (model updates), never the logs themselves, which keeps their contents confidential.
  • Federated Learning for Intrusion Detection System (FL-IDS): A distributed machine learning system that detects network anomalies and security threats. Raw network traffic data never leaves its origin, protecting sensitive information and reducing bandwidth needs.
  • Federated Random Forests: Specifically targets “cell outages” in 5G Radio Access Networks. It uses a consensus-based approach where Base Stations collaboratively learn feature importance for anomaly detection, mitigating the impact of malicious nodes.
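The last item describes Base Stations agreeing on feature importance while limiting the influence of malicious nodes. The article does not specify the consensus rule, so the sketch below swaps in a simple stand-in: a coordinate-wise median over the importance vectors each station reports. The feature names and numbers are hypothetical, and this is not presented as the project's algorithm. It only illustrates why a robust aggregate resists a single poisoned report better than a plain mean would.

```python
from statistics import median
from typing import List


def aggregate_importances(reports: List[List[float]]) -> List[float]:
    """Coordinate-wise median of per-station feature-importance vectors."""
    n_features = len(reports[0])
    return [median(r[i] for r in reports) for i in range(n_features)]


def top_features(importances: List[float], names: List[str], k: int = 2) -> List[str]:
    """Return the k feature names with the highest aggregated importance."""
    ranked = sorted(zip(names, importances), key=lambda p: p[1], reverse=True)
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical features and reports; the last station is malicious.
    names = ["rsrp", "throughput", "handover_failures"]
    reports = [
        [0.60, 0.30, 0.10],
        [0.50, 0.40, 0.10],
        [0.55, 0.35, 0.10],
        [0.00, 0.00, 1.00],  # poisoned report
    ]
    agreed = aggregate_importances(reports)
    print(top_features(agreed, names))  # ['rsrp', 'throughput']
```

With a plain mean, the poisoned report would push `handover_failures` above `throughput` in this example; the median keeps the ranking the honest stations agree on.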

Autonomous Recovery and Optimization

Once an anomaly is detected, the network must recover efficiently and return to normal operation. NANCY handles this in the following ways:

  • Federated User Reallocation: When a cell becomes a “sleeping” cell (a non-functional base station), this Federated AI approach enables neighbouring cells to collaboratively reassign affected users. It balances signal quality and network load, minimizing service interruption and maintaining stable performance. Simulations show it achieves a better balance between these conflicting objectives than simpler methods.
  • Deep Reinforcement Learning Control for Self-Healing: After user reassignment, this DRL algorithm dynamically adjusts antenna orientation (azimuth and tilt) and transmission power for cells that absorbed new users. It learns to balance performance for reassigned users with efficient power usage, optimizing service quality and enhancing resilience.
  • Adversarial Dynamic Healing based on Game Theory: For critical IoT services, this game theory-inspired approach ensures efficient load distribution. Each IoT device “selfishly” chooses the edge server that minimizes its own perceived delay, leading to a natural balance even if some servers degrade due to anomalies. Simulations show the system autonomously rebalances to a new equilibrium even with significant capacity reductions.
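The game-theoretic item above can be illustrated with best-response dynamics, a standard way to reach an equilibrium in this kind of load-balancing game. The sketch is an assumption-laden toy, not the project's algorithm: the delay model (`load / capacity`), the function names, and the device and capacity numbers are all hypothetical. Each device repeatedly moves to the server that would give it the lowest delay, and the loop stops when no device wants to move.

```python
from typing import List, Tuple


def delay(load: int, capacity: float) -> float:
    """Hypothetical delay model: delay grows with load, shrinks with capacity."""
    return load / capacity


def best_response_dynamics(
    choice: List[int], capacities: List[float], max_rounds: int = 100
) -> Tuple[List[int], List[int]]:
    """Let devices selfishly switch servers until nobody can improve."""
    choice = list(choice)  # do not mutate the caller's assignment
    loads = [0] * len(capacities)
    for s in choice:
        loads[s] += 1
    for _ in range(max_rounds):
        changed = False
        for d, cur in enumerate(choice):
            # Delay this device would see on server s (joining adds one unit of load).
            def cost(s: int) -> float:
                load = loads[s] if s == cur else loads[s] + 1
                return delay(load, capacities[s])

            best = min(range(len(capacities)), key=cost)
            if cost(best) < cost(cur):
                loads[cur] -= 1
                loads[best] += 1
                choice[d] = best
                changed = True
        if not changed:  # equilibrium: no device benefits from moving
            break
    return choice, loads


if __name__ == "__main__":
    start = [0] * 30  # 30 devices, all initially on server 0
    healthy, loads = best_response_dynamics(start, [10.0, 10.0])
    print(loads)  # [15, 15]
    # Server 1 degrades to half capacity; devices rebalance on their own.
    _, loads = best_response_dynamics(healthy, [10.0, 5.0])
    print(loads)  # [20, 10]
```

No central controller assigns devices here. The new split emerges purely from each device's local choice, which mirrors the autonomous rebalancing described above.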

The Impact: A More Reliable, Secure, and Efficient Future

NANCY’s innovations are set to offer profound benefits:

Benefits for Network Operators:

  • Reduced Downtime & Operational Costs: Automated self-healing means fewer manual interventions and faster problem resolution, leading to lower operational expenses and higher network availability.
  • Enhanced Security Posture: Decentralized, privacy-preserving threat detection and robust defenses create a more resilient and trustworthy network.
  • Improved Scalability: Distributing intelligence to the network edge allows B5G networks to handle immense data volumes and device density without bottlenecks.
  • Compliance with Data Regulations: The privacy-by-design approach, especially through Federated Learning, helps operators meet strict data protection regulations.

Benefits for End-Users:

  • Seamless Connectivity & Higher Quality of Service: Self-healing networks mean fewer dropped calls, faster downloads, and consistently reliable access to critical applications, even during network disturbances.
  • Greater Trust in Network Security: Knowing that personal data remains private and the network actively defends against cyber threats builds confidence in digital services.
  • Optimized Performance: Intelligent load balancing ensures users consistently receive the best possible performance, even under challenging network conditions.

These combined benefits create a powerful multiplier effect, reducing economic losses, fostering innovation in critical applications, and increasing societal trust in our digital infrastructure.