Observability has become mission-critical for troubleshooting cloud-native technology. However, today’s observability fails to meet the demands of cloud-native environments, either resulting in crippling complexity and high costs for collecting and storing huge data volumes, or sacrificing event coverage by sampling at coarse time granularity. We present µView, which stands out from conventional cloud monitors by incorporating a lightweight observability data-plane on Infrastructure Processing Units (IPUs). Our novel architecture leverages the proximity of IPUs to the monitored services to tackle observability bloat. Crucially, µView’s data-plane applies streaming data sketching techniques to continuously process and analyze microservices’ metrics at fine time resolution, without hurting application performance. We show for several use cases that, by anticipating SLO violations, µView can help (i) narrow the focus on informative observability data, and (ii) trigger useful signals about service performance, thus enabling timely proactive actions.
@inproceedings{cornacchiaobservability,title={Observability Is Eating Your Cores: Fine-Grained Analysis of Microservice Metrics with IPU-Hosted Sketches},author={Cornacchia, Alessandro and Benson, Theophilus A and Bilal, Muhammad and Canini, Marco},year={2026},booktitle={USENIX Symposium on Networked Systems Design and Implementation (NSDI)},}
MAESTRO: Multi-Agent Evaluation Suite for Testing, Reliability, and Observability
Tie Ma, Yixi Chen, Vaastav Anand, Alessandro Cornacchia, Amândio R Faustino, Guanheng Liu, Shan Zhang, Hongbin Luo, Suhaib A Fahmy, Zafar A Qazi, and 1 more author
@article{ma2026maestro,title={MAESTRO: Multi-Agent Evaluation Suite for Testing, Reliability, and Observability},author={Ma, Tie and Chen, Yixi and Anand, Vaastav and Cornacchia, Alessandro and Faustino, Am{\^a}ndio R and Liu, Guanheng and Zhang, Shan and Luo, Hongbin and Fahmy, Suhaib A and Qazi, Zafar A and others},journal={arXiv preprint arXiv:2601.00481},year={2026},}
NetCompute
Opportunistic Telemetry Transport in Hardware-Accelerated Observability Pipelines
H.A. Chikh Dahmane, Alessandro Cornacchia, and Marco Canini
Workshop on Computation over Heterogeneous Networks, in conjunction with IEEE INFOCOM, 2026
High-frequency telemetry is crucial for understanding performance and failures in microservice-based cloud systems, but exporting metrics at sub-second granularity can contend with application traffic and saturate shared I/O resources such as PCIe, even when collection is offloaded to SmartNICs or Infrastructure Processing Units (IPUs). Existing designs typically rely on dedicated telemetry packets or frequent RDMA reads, which introduce extra system calls, DMA operations, and PCIe transactions. This work-in-progress paper explores an alternative, event-driven design: opportunistically collecting and transporting host metrics by piggybacking them on sub-MTU application packets already destined for the network. We attach eBPF programs to the transmit path so that, when a sub-MTU packet is sent, the kernel can reuse this I/O event to (1) read system metrics – thus amortizing the cost of context switches – and (2) embed telemetry before the packet traverses PCIe – thus avoiding new transactions for telemetry packets; the SmartNIC then strips and aggregates this metadata for downstream analysis. We sketch the architecture of this opportunistic telemetry transport, outline key challenges, and present an initial evaluation to quantify overheads in containerized microservice workloads.
@article{dahmane2026opportunistic,title={Opportunistic Telemetry Transport in Hardware-Accelerated Observability Pipelines},author={Dahmane, H.A. Chikh and Cornacchia, Alessandro and Canini, Marco},journal={Workshop on Computation over Heterogeneous Networks, in conjunction with IEEE INFOCOM},year={2026},}
ICLR TSALM
PETS: Inference-Time Differentially Private Synthetic Time Series Generation
Yangzhixin Luo, Haibo Wu, Alessandro Cornacchia, Chenxi Liu, and Marco Canini
In 1st ICLR Workshop on Time Series in the Age of Large Models, 2026
Existing methods for differentially private (DP) synthetic time series generation inject privacy during model training via DP-SGD, requiring private data in the training phase, expensive hyperparameter tuning, and costly retraining for new domains. We propose Private Evolution for Time Series (PETS), the first inference-time framework for DP synthetic time series generation via Private Evolution (PE). In this setting, private data are not used to train generative models, but only to guide the selection of synthetic outputs at inference time, to maximize fidelity and satisfy a privacy-budget constraint. Building on top of PE, we realize PETS through three specialized components: a rule-based generation module, a VAE-based structure-preserving variation module, and contrastive embeddings for similarity-driven selection. The framework is modular, enabling domain adaptation by swapping components with no retraining overhead. On the traffic benchmark (METR-LA) at ε=0.7, PETS achieves a C-FID of , reducing C-FID by 14x compared to the state-of-the-art method, and attains ≥27x lower forecasting RMSE, demonstrating strong utility–privacy trade-offs.
@inproceedings{luo2026pets,title={{PETS}: Inference-Time Differentially Private Synthetic Time Series Generation},author={Luo, Yangzhixin and Wu, Haibo and Cornacchia, Alessandro and Liu, Chenxi and Canini, Marco},booktitle={1st ICLR Workshop on Time Series in the Age of Large Models},year={2026},url={https://openreview.net/forum?id=ubsnu8YA0z},}
2025
ACM NGNO
Towards a Playground to Democratize Experimentation and Benchmarking of AI Agents for Network Troubleshooting
Zhihao Wang, Alessandro Cornacchia, Franco Galante, Carlo Centofanti, Alessio Sacco, and Dingde Jiang
In Proceedings of the 1st Workshop on Next-Generation Network Observability, 2025
@inproceedings{wang2025towards,title={Towards a Playground to Democratize Experimentation and Benchmarking of AI Agents for Network Troubleshooting},author={Wang, Zhihao and Cornacchia, Alessandro and Galante, Franco and Centofanti, Carlo and Sacco, Alessio and Jiang, Dingde},booktitle={Proceedings of the 1st Workshop on Next-Generation Network Observability},pages={1--3},year={2025},}
ACM SIGIR
Information Retrieval in the Age of Generative AI: The RGB Model
Michele Garetto, Alessandro Cornacchia, Franco Galante, Emilio Leonardi, Alessandro Nordio, and Alberto Tarable
In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025
The advent of Large Language Models (LLMs) and generative AI is fundamentally transforming information retrieval and processing on the Internet, bringing both great potential and significant concerns regarding content authenticity and reliability. This paper presents a novel quantitative approach to shed light on the complex information dynamics arising from the growing use of generative AI tools. Despite their significant impact on the digital ecosystem, these dynamics remain largely uncharted and poorly understood. We propose a stochastic model to characterize the generation, indexing, and dissemination of information in response to new topics. This scenario particularly challenges current LLMs, which often rely on real-time Retrieval-Augmented Generation (RAG) techniques to overcome their static knowledge limitations. Our findings suggest that the rapid pace of generative AI adoption, combined with increasing user reliance, can outpace human verification, escalating the risk of inaccurate information proliferation across digital resources. An in-depth analysis of Stack Exchange data confirms that high-quality answers inevitably require substantial time and human effort to emerge. This underscores the considerable risks associated with generating persuasive text in response to new questions and highlights the critical need for responsible development and deployment of future generative AI tools.
@inproceedings{garetto2025information,title={Information Retrieval in the Age of Generative AI: The RGB Model},author={Garetto, Michele and Cornacchia, Alessandro and Galante, Franco and Leonardi, Emilio and Nordio, Alessandro and Tarable, Alberto},booktitle={Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},pages={602--612},year={2025},}
@article{wang2025chamaleonet,title={ChamaleoNet: Programmable Passive Probe for Enhanced Visibility on Erroneous Traffic},author={Wang, Zhihao and Cornacchia, Alessandro and Bianco, Andrea and Drago, Idilio and Giaccone, Paolo and Jiang, Dingde and Mellia, Marco},journal={arXiv preprint arXiv:2508.12496},year={2025},}
ACM APSys
Between Promise and Pain: The Reality of Automating Failure Analysis in Microservices with LLMs
Alessandro Cornacchia, Iliyas Alabdulaal, Ibraheem Saghier, Albaraa Mirdad, Omar Fayoumi, and Marco Canini
In Proceedings of the 16th ACM SIGOPS Asia-Pacific Workshop on Systems, 2025
Large Language Models (LLMs) are increasingly explored as general-purpose assistants for infrastructure operations, helping automate tasks like querying data, analyzing logs, and suggesting fixes. In this paper, we consider the more general and ambitious problem of fully automating root cause analysis (RCA) in microservice systems, where LLMs must collect information, reason about it, and interact with the environment to detect, localize and resolve issues. Anecdotal evidence offers useful insights and partial solutions, but the broader challenge remains unresolved. We systematically evaluate multiple LLM agent architectures across a range of incident scenarios. We study how different tool-augmented agents perform, and shed light on common failure modes, including hallucinated reasoning paths and inefficient use of context. Our findings reveal both the promise and the limitations of current approaches, and point to concrete directions for more robust and effective use of LLMs in this domain.
@inproceedings{cornacchia2025between,title={Between Promise and Pain: The Reality of Automating Failure Analysis in Microservices with LLMs},author={Cornacchia, Alessandro and Alabdulaal, Iliyas and Saghier, Ibraheem and Mirdad, Albaraa and Fayoumi, Omar and Canini, Marco},booktitle={Proceedings of the 16th ACM SIGOPS Asia-Pacific Workshop on Systems},pages={155--167},year={2025},}
ACM IMC
Poster: The Potential of Erroneous Outbound Traffic Analysis to Unveil Silent Internal Anomalies
Andrea Sordello, Zhihao Wang, Kai Huang, Alessandro Cornacchia, and Marco Mellia
In Proceedings of the 2025 ACM Internet Measurement Conference, 2025
@inproceedings{sordello2025poster,title={Poster: The Potential of Erroneous Outbound Traffic Analysis to Unveil Silent Internal Anomalies},author={Sordello, Andrea and Wang, Zhihao and Huang, Kai and Cornacchia, Alessandro and Mellia, Marco},booktitle={Proceedings of the 2025 ACM Internet Measurement Conference},pages={1078--1079},year={2025},}
@article{cornacchia2025dmas,title={DMAS-Forge: A Framework for Transparent Deployment of AI Applications as Distributed Systems},author={Cornacchia, Alessandro and Anand, Vaastav and Bilal, Muhammad and Qazi, Zafar and Canini, Marco},journal={arXiv preprint arXiv:2510.11872},year={2025},}
IEEE NOMS
Sharing GPUs and Programmable Switches in a Federated Testbed with SHARY
Stefano Salsano, Andrea Mayer, Paolo Lungaroni, Pierpaolo Loreti, Lorenzo Bracciale, Andrea Detti, Marco Orazi, Paolo Giaccone, Fulvio Risso, Alessandro Cornacchia, and 1 more author
In IEEE Network Operations and Management Symposium, 2025
@inproceedings{salsano2025sharing,title={Sharing GPUs and Programmable Switches in a Federated Testbed with SHARY},author={Salsano, Stefano and Mayer, Andrea and Lungaroni, Paolo and Loreti, Pierpaolo and Bracciale, Lorenzo and Detti, Andrea and Orazi, Marco and Giaccone, Paolo and Risso, Fulvio and Cornacchia, Alessandro and Chiasserini, Carla Fabiana},booktitle={IEEE Network Operations and Management Symposium},pages={1--5},year={2025},}
Agentic systems, powered by Large Language Models (LLMs), assist network engineers with network configuration synthesis and network troubleshooting tasks. For network troubleshooting, progress is hindered by the absence of standardized and accessible benchmarks for evaluating LLM agents in dynamic network settings at low operational effort. We present NIKA, the largest public benchmark to date for LLM-driven network incident diagnosis and troubleshooting. NIKA targets both domain experts and, especially, AI researchers, providing zero-effort replay of real-world network scenarios and establishing well-defined agent-network interfaces for quick agent prototyping. NIKA comprises hundreds of curated network incidents, spanning five network scenarios, from data centers to ISP networks, and covers 54 representative network issues. Lastly, NIKA is modular and extensible by design, offering APIs to facilitate the integration of new network scenarios and failure cases. We evaluate state-of-the-art LLM agents on NIKA and find that while larger models succeed more often in detecting network issues, they still struggle to localize faults and identify root causes.
@article{wang2025network,title={A Network Arena for Benchmarking AI Agents on Network Troubleshooting},author={Wang, Zhihao and Cornacchia, Alessandro and Sacco, Alessio and Galante, Franco and Canini, Marco and Jiang, Dingde},journal={arXiv preprint arXiv:2512.16381},year={2025},}
2024
IEEE HPSR
A “Big-Spine” Abstraction: Flow Prioritization With Spatial Diversity in The Data Center Network
Alessandro Cornacchia, Andrea Bianco, Paolo Giaccone, and German Sviridov
In 2024 IEEE 25th International Conference on High Performance Switching and Routing (HPSR), 2024
Data center networks must serve both latency-sensitive mice flows and bandwidth-intensive elephant flows. Jointly optimizing the performance of both traffic classes poses complex challenges. Existing flow schedulers either rely on detailed flow size information or require numerous physical priority queues (PQs) within network switches, thus facing practical challenges. In this work, we propose a novel flow scheduling algorithm, namely Multi-Path Multi-Level Feedback Queueing (MP-MLFQ), to overcome these limitations. MP-MLFQ leverages the spatial diversity and regularity of DCNs to realize a scheduler with numerous logical priority levels while occupying as few as 2 physical PQs at each switch port. We design MP-MLFQ to run atop modern programmable networks, and show how to implement it without modifications to the end-hosts’ stacks. Our simulation results show that MP-MLFQ outperforms existing flow size-agnostic solutions in minimizing the flow completion time when only two PQs are available.
@inproceedings{cornacchia2024big,title={A “Big-Spine” Abstraction: Flow Prioritization With Spatial Diversity in The Data Center Network},author={Cornacchia, Alessandro and Bianco, Andrea and Giaccone, Paolo and Sviridov, German},booktitle={2024 IEEE 25th International Conference on High Performance Switching and Routing (HPSR)},pages={43--48},year={2024},organization={IEEE},}
2023
ACM CoNEXT-SW
MicroView: Cloud-Native Observability with Temporal Precision
Alessandro Cornacchia, Theophilus A Benson, Muhammad Bilal, and Marco Canini
In Proceedings of the CoNEXT Student Workshop, 2023
@inproceedings{cornacchia2023microview,title={MicroView: Cloud-Native Observability with Temporal Precision},author={Cornacchia, Alessandro and Benson, Theophilus A and Bilal, Muhammad and Canini, Marco},booktitle={Proceedings of the CoNEXT Student Workshop},pages={7--8},year={2023},}
2022
ComCom
Staggered HLL: Near-continuous-time cardinality estimation with no overhead
Alessandro Cornacchia, Giuseppe Bianchi, Andrea Bianco, and Paolo Giaccone
@article{cornacchia2022staggered,title={Staggered HLL: Near-continuous-time cardinality estimation with no overhead},author={Cornacchia, Alessandro and Bianchi, Giuseppe and Bianco, Andrea and Giaccone, Paolo},journal={Computer Communications},volume={193},pages={168--175},year={2022},publisher={Elsevier},}
IEEE PEMWN
Designing Probabilistic Flow Counting over Sliding Windows
Alessandro Cornacchia, Giuseppe Bianchi, Andrea Bianco, and Paolo Giaccone
In 2022 IEEE 11th IFIP International Conference on Performance Evaluation and Modeling in Wireless and Wired Networks (PEMWN), 2022
@inproceedings{cornacchia2022designing,title={Designing Probabilistic Flow Counting over Sliding Windows},author={Cornacchia, Alessandro and Bianchi, Giuseppe and Bianco, Andrea and Giaccone, Paolo},booktitle={2022 IEEE 11th IFIP International Conference on Performance Evaluation and Modeling in Wireless and Wired Networks (PEMWN)},pages={1--6},year={2022},organization={IEEE},}
2021
IEEE MedComNet
A traffic-aware perspective on network disaggregated sketches
Alessandro Cornacchia, German Sviridov, Paolo Giaccone, and Andrea Bianco
In 2021 19th Mediterranean Communication and Computer Networking Conference (MedComNet), 2021
@inproceedings{cornacchia2021traffic,title={A traffic-aware perspective on network disaggregated sketches},author={Cornacchia, Alessandro and Sviridov, German and Giaccone, Paolo and Bianco, Andrea},booktitle={2021 19th Mediterranean Communication and Computer Networking Conference (MedComNet)},pages={1--4},year={2021},organization={IEEE},}