Azure Operator Nexus is the next-generation hybrid cloud platform designed for communications service providers (CSPs). Azure Operator Nexus deploys network functions (NFs) in a variety of network settings, such as the cloud and the edge. These NFs can perform a broad range of tasks, from classic ones such as layer-4 load balancers, firewalls, network address translations (NATs), and 5G user-plane functions (UPFs), to more advanced functions such as deep packet inspection, radio access networks, and analytics. Given the large volume of traffic and the number of concurrent flows that NFs manage, their performance and scalability are vital to maintaining smooth network operations.
Until recently, network operators had two distinct options for implementing these critical NFs: one is to use standalone hardware middlebox appliances, and the other is to use network function virtualization (NFV) to deploy them on a cluster of commodity CPU servers.
Deciding between these options involves weighing a variety of factors, including each option's performance, state management capacity, cost, and energy efficiency, against the specific workloads and operating conditions, such as the traffic rate and the number of concurrent flows that NF instances must be able to handle.
Our analysis shows that the CPU server-based approach typically outperforms proprietary middleboxes in terms of cost-effectiveness, scalability, and flexibility. This is an effective strategy when traffic volumes are relatively light, as it can comfortably handle loads below hundreds of Gbps. However, as traffic volumes grow, the strategy begins to falter, demanding an ever-larger number of CPU cores dedicated solely to network functions.
In-Network Computing: A New Paradigm
At Microsoft, we have been working on an innovative approach that has captured the interest of both industry and academia: deploying NFs on programmable switches and network interface cards (NICs). This shift has been made possible by significant advances in high-performance programmable network devices, as well as by the evolution of data-plane programming languages such as Programming Protocol-independent Packet Processors (P4) and Network Programming Language (NPL). For example, programmable switching application-specific integrated circuits (ASICs) offer a degree of data-plane programmability while still ensuring robust packet processing rates of up to tens of Tbps, or a few billion packets per second. Likewise, programmable NICs, or "smart NICs," equipped with network processing units (NPUs) or field-programmable gate arrays (FPGAs), present a similar opportunity. Essentially, these advances turn the data planes of these devices into programmable platforms.
This technological progress has created a new computing paradigm called in-network computing. It allows us to run a range of functions that were previously the work of CPU servers or proprietary hardware devices directly on network data-plane devices. This includes not only NFs but also components of other distributed systems. With in-network computing, network engineers can implement various NFs on programmable switches or NICs, enabling the handling of large volumes of traffic (e.g., > 10 Tbps) cost-effectively (e.g., one programmable switch versus dozens of servers), without needing to dedicate CPU cores to network functions.
Current limitations in in-network computing
Despite the attractive potential of in-network computing, its full realization in practical deployments in the cloud and at the edge remains elusive. The key challenge has been effectively handling the demanding workloads of stateful applications on a programmable data-plane device. The current approach, while adequate for running a single program with a fixed, small workload, significantly limits the broader potential of in-network computing.
There is a significant gap between the evolving needs of network operators and application developers and the current, somewhat limited view of in-network computing, largely due to a lack of resource elasticity. As the number of potential concurrent applications on the network grows and the volume of traffic to be processed increases, this model becomes strained. Currently, a single program can run on a single device under strict resource constraints, such as tens of MB of SRAM on a programmable switch, and raising these limits typically requires significant hardware redesign. In other words, when an application's workload demands exceed the limited resource capacity of a single device, the application fails to operate. This limitation, in turn, hampers the wider adoption and optimization of in-network computing.
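To make the resource gap concrete, a rough back-of-the-envelope calculation shows how quickly per-flow state outgrows on-switch SRAM. The numbers below are illustrative assumptions, not measurements of any specific switch ASIC:

```python
# Back-of-the-envelope estimate: per-flow state vs. on-switch SRAM.
# All figures are illustrative assumptions for the sake of the example.

SRAM_BYTES = 50 * 1024 * 1024    # assume ~50 MB of usable on-switch SRAM
ENTRY_BYTES = 64                 # assume 64 B per flow entry (key + state)

# How many flow entries fit entirely on the switch?
max_flows_on_switch = SRAM_BYTES // ENTRY_BYTES
print(f"Flows that fit in SRAM: {max_flows_on_switch:,}")

# A stateful NF tracking 10 million concurrent flows would instead need:
flows_needed = 10_000_000
required_mb = flows_needed * ENTRY_BYTES / (1024 * 1024)
print(f"Memory needed for {flows_needed:,} flows: {required_mb:.0f} MB")
```

Under these assumptions the switch holds well under a million entries, while a state-intensive NF can need hundreds of megabytes, an order-of-magnitude mismatch that hardware alone cannot cheaply close.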
Bringing resource elasticity to in-network computing
In response to the fundamental challenge of resource constraints in in-network computing, we set out to enable resource elasticity. Our primary focus is in-switch applications, those running on programmable switches, which currently face the most stringent resource and capability limitations among today's programmable data-plane devices. Instead of proposing hardware-intensive solutions such as enhancing switch ASICs or building hyper-optimized applications, we are exploring a more pragmatic alternative: an on-rack resource expansion architecture.
In this model, we envision a deployment that pairs a programmable switch with other data-plane devices, such as smart NICs and software switches running on CPU servers, all connected within the same rack. The external devices offer a cost-effective and incremental path to scaling the effective capacity of a programmable switch to meet future workload demands. This approach provides an intriguing and viable solution to the current limitations of in-network computing.
In 2020, we introduced a novel system architecture called Table Extension Architecture (TEA) at the ACM SIGCOMM conference [1]. TEA innovatively provides elastic memory through a high-performance virtual memory abstraction. This allows programmable top-of-rack (ToR) switches to handle NFs with large state in tables, for example one million per-flow table entries. These can demand hundreds of megabytes of memory, an amount not typically available on switches. The ingenious innovation behind TEA lies in how it enables switches to access unused DRAM on CPU servers within the same rack in a cost-effective and scalable way. This is achieved through the clever use of Remote Direct Memory Access (RDMA) technology, exposing only high-level application programming interfaces (APIs) to application developers while hiding the complexities.
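The essence of this idea, a small on-switch table backed by much larger server DRAM, can be sketched in plain Python. This is a simplification for intuition only: the `rdma_read` stand-in, the cache size, and the eviction policy are illustrative assumptions, not TEA's actual RDMA-based design:

```python
# Simplified sketch of TEA-style table lookups: a small on-switch table
# acts as a cache, and misses fall back to (simulated) server DRAM
# reached over RDMA. Names and sizes here are illustrative assumptions.

SWITCH_TABLE_CAPACITY = 4      # tiny on-switch table, for illustration

server_dram = {}               # stands in for unused DRAM on a rack server
switch_table = {}              # stands in for on-switch SRAM table entries

def rdma_read(key):
    """Stand-in for a one-sided RDMA read into server DRAM."""
    return server_dram.get(key)

def lookup(key):
    """Look up per-flow state, preferring the on-switch table."""
    if key in switch_table:                 # fast path: on-switch hit
        return switch_table[key]
    value = rdma_read(key)                  # slow path: fetch from DRAM
    if value is not None:
        if len(switch_table) >= SWITCH_TABLE_CAPACITY:
            switch_table.pop(next(iter(switch_table)))  # evict oldest entry
        switch_table[key] = value           # cache for subsequent packets
    return value

# Populate one million per-flow entries in (simulated) server DRAM.
for flow_id in range(1_000_000):
    server_dram[flow_id] = f"state-{flow_id}"

print(lookup(42))   # first lookup: fetched from DRAM, then cached
print(lookup(42))   # second lookup: served from the on-switch table
```

The design point this illustrates is that the switch's limited memory holds only hot entries, while the table's total size is bounded by server DRAM rather than switch SRAM.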
Our evaluations with various NFs show that TEA can deliver low and predictable latency together with scalable throughput for table lookups, all without ever involving the servers' CPUs. This innovative architecture has drawn wide attention from members of both academia and industry and has found application in various use cases, including network telemetry and 5G user-plane functions.
In April, we presented ExoPlane at the USENIX Symposium on Networked Systems Design and Implementation (NSDI) [2]. ExoPlane is an operating system specifically designed for on-rack switch resource expansion to support multiple concurrent applications.
ExoPlane's design incorporates a practical runtime operating model and state abstraction to address the challenge of effectively managing application state across multiple devices with minimal performance and resource overhead. The operating system consists of two main components: the scheduler and the runtime environment. The scheduler accepts multiple programs written for a switch, with minimal or no modifications, and optimally allocates resources to each application based on input from network operators and developers. The ExoPlane runtime then executes workloads across the switch and external devices, efficiently managing state, balancing loads across devices, and handling device failures. Our evaluation shows that ExoPlane provides low latency, scalable throughput, and fast failover, all while maintaining a minimal resource footprint and requiring little or no modification to applications.
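The kind of placement decision such a scheduler makes can be sketched as a simple bin-packing step: fit each application's state on the switch when possible, otherwise spill it to an external device in the rack. The greedy policy, device names, and memory figures below are illustrative assumptions, not ExoPlane's actual algorithm:

```python
# Highly simplified sketch of scheduler-style placement across a switch
# and external on-rack devices. Policy and numbers are illustrative only.

devices = {                    # available memory in MB (assumed figures)
    "switch": 50,              # scarce on-switch SRAM, tried first
    "smartnic-1": 512,         # smart NIC in the same rack
    "server-sw-1": 4096,       # software switch on a CPU server
}

apps = {"lb": 30, "nat": 40, "upf": 900}   # state footprint in MB (assumed)

def place(apps, devices):
    """Greedily place each app on the first device with enough free memory,
    largest apps first, preferring devices in declaration order."""
    free = dict(devices)
    placement = {}
    for app, need in sorted(apps.items(), key=lambda kv: -kv[1]):
        for dev in devices:
            if free[dev] >= need:
                placement[app] = dev
                free[dev] -= need
                break
        else:
            raise RuntimeError(f"no device can host {app}")
    return placement

print(place(apps, devices))
```

Here the small NFs land on the switch or the smart NIC while the state-heavy UPF spills to server memory, which mirrors the division of labor between the switch and its external devices described above.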
Outlook: The future of in-network computing
As we continue to explore the frontiers of in-network computing, we see a future rich with possibilities, exciting research directions, and new deployments in production environments. Our current efforts with TEA and ExoPlane have shown us what is possible with on-rack resource expansion and elastic in-network computing. We believe they can serve as a practical foundation for enabling in-network computing for future applications, telecommunications workloads, and emerging data-plane hardware. As always, the ever-evolving landscape of networked systems will continue to present new challenges and opportunities. At Microsoft, we continue to investigate, invent, and advance such technologies through infrastructure improvements. By freeing up CPU cores, in-network computing brings lower costs, greater scalability, and enhanced functionality, benefits that telecom operators can realize through our innovative products such as Azure Operator Nexus.
1. TEA: Enabling State-Intensive Network Functions on Programmable Switches, ACM SIGCOMM 2020. https://dl.acm.org/doi/10.1145/3387514.3405855
2. ExoPlane: An Operating System for On-Rack Switch Resource Augmentation, USENIX NSDI 2023. https://www.usenix.org/conference/nsdi23/presentation/kim-daehyeok