NPU software spawns network architectures
Steve Klinger
2/5/2004 4:51 PM EST
There has been so much change in the communications industry during the past few years that it is tempting to focus on the truly revolutionary advances occurring at the systems and network application level. However, it can also be useful to look behind the scenes periodically to understand the key enabling technologies whose steady evolution is making the revolution possible.
One such enabling trend has been the development of advanced software tools and programming methods that make it possible for highly integrated network processor units (NPUs) to deliver on the promise of faster speeds, greater flexibility, lower costs and faster time-to-market. To a great degree, the evolution of NPU software development can be attributed to the emergence of advanced architectures for products like multiservice switches and multi-service-provisioning platforms.
The early evolution of network-processing technologies used three distinct approaches. The first was simply an attempt to build upon existing general-purpose processor architectures, such as RISC, but that model limited the raw speed and deterministic performance that network functions require.
To the forefront
The second used hardware-optimized state machines to drive performance, but inherent in such hardwired ASIC techniques was a lack of software programmability, which was needed to rapidly deploy new, emerging services and to enable the flexible provisioning of those services in disparate and dynamic network environments.
The third was to develop from the ground up new NPU architectures that bring together hardware optimization for network functions while providing full software programmability to implement any desired function.
As network performance requirements continued to escalate, the last approach quickly moved to the forefront as the most viable way to handle the higher speeds and increasing protocol diversity and complexity. NPU architectures using multiple embedded cores and integrated features such as configurable hardware-based traffic management have enabled new leaps forward in scalability and functionality. Still, the NPU software programming model and methodology have also needed to evolve in order to derive maximum results from the hardware features. As a result, today's advanced fifth-generation NPU architectures rely upon a tightly integrated and well-balanced interaction between the underlying hardware and the software.
Single thread
With multicore NPU hardware optimized for parallel execution of network-specific tasks, it is critical to keep the core execution units fully utilized and to avoid wait states or wasted cycles. As the number of cores increases to provide performance scalability, these challenges can only be met through the use of an NPU architecture and programming methodology that enable system designers to treat the entire NPU function as a single processing thread that handles the start-to-finish processing of each datagram/packet.
These goals are best achieved by giving designers a logically unified programming environment with a single-stage, single-image model that allows cells or packets to be automatically distributed across multiple processing cores and to "run to completion" on the (dynamically) designated cores. This contrasts with some multistage models in which a portion of the processing algorithm is handled on each core, and the individual packets or cells must be handed off between different processors, raising the risks of delays and wait states whenever any one of the cores becomes overutilized. In the single-image approach, the same code is resident and visible to each task on each core, allowing the programmer to approach the design in the same way that he or she would for a single processor while leveraging all of the performance advantages of multiple cores.
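The single-image, run-to-completion idea can be sketched in a few lines of C. This is only an illustrative model, not any vendor's API: the names `packet_t`, `npu_model_t`, `process_packet`, `pick_core` and `dispatch`, the core and port counts, and the use of packet length as a stand-in for processing cost are all assumptions made for the example. Every core runs the same `process_packet` routine from start to finish, and the dispatcher simply hands each packet to the core that frees up first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CORES 4u   /* hypothetical core count */
#define NUM_PORTS 8u   /* hypothetical output port count */

typedef struct {
    uint32_t flow_id;   /* stand-in for a classification key */
    uint16_t length;    /* bytes; also used as a crude processing cost */
    uint8_t  ttl;
    uint8_t  out_port;  /* filled in by the classify step */
    uint8_t  dropped;   /* set to 1 when the packet is discarded */
} packet_t;

typedef struct {
    uint64_t busy_until[NUM_CORES]; /* modeled time at which each core is free */
    unsigned handled[NUM_CORES];    /* packets completed on each core */
} npu_model_t;

/* The single image: the same code runs on every core and takes one
 * packet through all steps with no hand-off between cores. */
static void process_packet(packet_t *p)
{
    p->out_port = (uint8_t)(p->flow_id % NUM_PORTS);  /* classify */
    if (p->ttl <= 1u) {                               /* police: expired */
        p->dropped = 1u;
        return;
    }
    p->ttl--;                                         /* transform */
}

/* Choose the core that becomes free earliest (lowest index wins ties). */
static size_t pick_core(const npu_model_t *m)
{
    size_t best = 0;
    for (size_t c = 1; c < NUM_CORES; c++) {
        if (m->busy_until[c] < m->busy_until[best]) {
            best = c;
        }
    }
    return best;
}

/* Distribute packets across cores; each one runs to completion there. */
void dispatch(npu_model_t *m, packet_t *pkts, size_t n)
{
    assert(m != NULL && (pkts != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        size_t core = pick_core(m);
        process_packet(&pkts[i]);
        m->busy_until[core] += pkts[i].length;
        m->handled[core]++;
    }
}
```

In this model, raising `NUM_CORES` changes only the dispatcher's choices. `process_packet` stays untouched, which is the property the single-image approach is meant to provide.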
The goal of the NPU software tools and code infrastructure is to enable the rapid integration of many existing standard networking protocols and the rapid development and deployment of new protocols and functions. Data-plane macro libraries can provide a logical bridge between the high-level programming environment that the designer desires and the optimized, deterministic low-level code that is required for optimal NPU performance.
These libraries abstract the underlying hardware structure and allow the programmer to work with C-language macro API calls, while the low-level run-time code is optimized for single-instruction implementation of critical functions. This enables the NPU to directly control tightly integrated coprocessors for packet transformation, classification and policing functions as well as queuing and scheduling. The combination of the single-stage, single-image programming model and the data-plane macro library's efficient use of hardware eliminates latency concerns and enables full utilization of the underlying processor capabilities.
In addition, because the programming interface is abstracted from the specific hardware implementation, this model minimizes the lines of code written by the developer and provides code portability and reusability when migrating from one design or NPU device to the next. It also provides a straightforward path to scale overall performance by adding processing cores without having to rewrite or repartition the code.
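To make the macro-library idea concrete, here is a minimal sketch in C. The macro names (`DP_POLICE`, `DP_CLASSIFY`, `DP_ENQUEUE`), the `coproc_t` structure and the `hw_lookup` helper are hypothetical and do not come from any real NPU toolkit; the "coprocessor" is simulated with plain memory. The point is only that application code is written against the macros, so moving to a different device would mean re-implementing the macro layer rather than the application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DP_NUM_QUEUES 4u     /* hypothetical number of egress queues */
#define DP_MAX_LEN    1518u  /* hypothetical policing limit in bytes */

/* Simulated coprocessor state; real hardware would expose registers. */
typedef struct {
    uint32_t last_key;                   /* last key presented for lookup */
    uint32_t queue_depth[DP_NUM_QUEUES]; /* frames waiting in each queue */
} coproc_t;

/* Device-specific layer: a trivial stand-in for a classification lookup. */
static inline uint32_t hw_lookup(coproc_t *cp, uint32_t key)
{
    cp->last_key = key;
    return key % DP_NUM_QUEUES;
}

/* Hypothetical data-plane macros: the only interface the application sees. */
#define DP_POLICE(len)        ((len) <= DP_MAX_LEN)
#define DP_CLASSIFY(cp, key)  hw_lookup((cp), (key))
#define DP_ENQUEUE(cp, q)     ((cp)->queue_depth[(q)]++)

/* Application code: no direct knowledge of the hardware layout.
 * Returns the queue the frame was placed on, or -1 if it was policed out. */
int forward(coproc_t *cp, uint32_t flow_key, uint32_t len)
{
    assert(cp != NULL);
    if (!DP_POLICE(len)) {
        return -1;
    }
    uint32_t q = DP_CLASSIFY(cp, flow_key);
    DP_ENQUEUE(cp, q);
    return (int)q;
}
```

A production library would map each macro to a device instruction or coprocessor command instead of a C helper, but the calling code in `forward` would look the same.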
Product differentiation
Another factor speeding time-to-market for NPU-based designs is the availability of application source code libraries, such as ATM, Ethernet, frame relay, IPv4-IPv6, Martini, MPLS and VPLS, built on top of the data-plane macro libraries.
These libraries give developers a ready-made, standards-based foundation that lets them focus their primary efforts on product differentiation rather than on reinventing well-known protocols and capabilities. Integrating and simultaneously supporting a variety of protocols within a single NPU device enables the soft provisioning of services on a per-port basis, maximizing the utilization of capacity in the end line card and equipment.
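Per-port soft provisioning can be pictured as a table that maps each port to a protocol handler and can be rewritten at run time. The sketch below is an assumption-laden illustration in C: `provision_port`, `handle_frame`, the handler functions and the four-port table are hypothetical names, and each handler only checks that a frame is long enough to hold its header (5 bytes for an ATM cell header, 14 for an untagged Ethernet header, 4 for one MPLS label stack entry) and returns the payload offset.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PORTS 4u  /* hypothetical port count on the line card */

typedef enum {
    PROTO_NONE = 0,   /* port not provisioned */
    PROTO_ATM,
    PROTO_ETHERNET,
    PROTO_MPLS,
    PROTO_COUNT
} proto_t;

/* A handler returns the payload offset, or -1 if the frame is too short. */
typedef int (*handler_fn)(size_t frame_len);

static int check_header(size_t frame_len, size_t header_len)
{
    return frame_len >= header_len ? (int)header_len : -1;
}

static int handle_atm(size_t len)      { return check_header(len, 5);  }
static int handle_ethernet(size_t len) { return check_header(len, 14); }
static int handle_mpls(size_t len)     { return check_header(len, 4);  }

/* All protocol code is resident at once; provisioning only selects it. */
static const handler_fn handlers[PROTO_COUNT] = {
    NULL, handle_atm, handle_ethernet, handle_mpls
};

/* Per-port service table; static storage starts out as PROTO_NONE. */
static proto_t port_proto[NUM_PORTS];

/* Change the service on one port without touching the others. */
int provision_port(size_t port, proto_t proto)
{
    if (port >= NUM_PORTS || proto >= PROTO_COUNT) {
        return -1;
    }
    port_proto[port] = proto;
    return 0;
}

/* Dispatch a frame to whichever protocol the port currently carries. */
int handle_frame(size_t port, size_t frame_len)
{
    if (port >= NUM_PORTS) {
        return -1;
    }
    assert(port_proto[port] < PROTO_COUNT);
    handler_fn h = handlers[port_proto[port]];
    return h != NULL ? h(frame_len) : -1;
}
```

Calling `provision_port(0, PROTO_ETHERNET)` on a port that previously carried ATM switches the service in software, which is the kind of change that would otherwise require different hardware on the line card.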
This benefits both equipment vendors and service providers. Equipment vendors enjoy lower development costs, improved inventory management and reduced time-to-market for new capabilities. Service providers get faster and more flexible service provisioning, better equipment utilization and lower operational costs because unnecessary truck rolls are eliminated.
Another area in which NPU software has evolved is development environments and simulation tools. Besides leveraging high-level, C-based development environments and standards-based application libraries to ease the programming challenge, today's NPU software offerings include extensive simulation tools aimed at streamlining code development, debugging and performance optimization.
These are not general-purpose CPU simulators, but rather targeted tools aimed at analyzing the processing and flow of networking datagrams as they stream through the NPU.
Such tools let the user observe the contents of the frame being processed alongside the resulting changes to registers and memory, displayed in a context that relates them to one another. The ability to preset processing breakpoints and to highlight code coverage simplifies debugging and improves code quality.
Integrated performance analysis helps ensure that all performance targets are met when the software is integrated on the real NPU device. Decoupling software design and code development from the availability of hardware also shortens the overall development schedule.
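As a rough illustration of how breakpoints and coverage tracking fit into a packet-oriented simulator, the following C sketch steps a frame through a list of processing steps. Everything here is hypothetical (`sim_state_t`, `sim_ctl_t`, `sim_run` and the two sample steps); it does not describe any actual vendor tool. The frame bytes and a single stand-in register live in one structure so that both can be inspected together whenever execution stops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_STEPS 32u  /* one bit per step in the 32-bit masks below */

typedef struct {
    uint8_t  frame[8];  /* frame bytes being processed */
    uint32_t reg;       /* a single stand-in register */
} sim_state_t;

typedef void (*step_fn)(sim_state_t *);

typedef struct {
    uint32_t breakpoints; /* bit i set: stop before executing step i */
    uint32_t coverage;    /* bit i set: step i has executed at least once */
} sim_ctl_t;

/* Two hypothetical processing steps that touch both frame and register. */
static void step_load_ttl(sim_state_t *s) { s->reg = s->frame[0]; }
static void step_dec_ttl(sim_state_t *s)
{
    s->reg--;
    s->frame[0] = (uint8_t)s->reg;
}

/* Run steps starting at index pc until the end or the next breakpoint.
 * A breakpoint on the very first step of a call is skipped so that
 * resuming from a stop always makes progress. Returns the next index. */
size_t sim_run(sim_state_t *s, sim_ctl_t *ctl,
               const step_fn *steps, size_t n, size_t pc)
{
    assert(n <= MAX_STEPS);
    int first = 1;
    while (pc < n) {
        if (!first && (ctl->breakpoints & (UINT32_C(1) << pc))) {
            break;  /* caller can now inspect frame and register together */
        }
        steps[pc](s);
        ctl->coverage |= UINT32_C(1) << pc;
        pc++;
        first = 0;
    }
    return pc;
}
```

After each stop, the caller can compare `s.frame` and `s.reg` side by side, and any bit still clear in `ctl.coverage` at the end of a test run marks a step that was never exercised.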
The bottom line is that today's high-performance multicore NPU hardware could not have evolved independently of the advanced source code software infrastructure and development tools that are the key to realizing the benefits offered by using NPUs. Taken together, they can enable system developers to minimize time-to-market, preserve their R&D investments, differentiate their product capabilities and provide a high degree of service deployment flexibility for customers.
Steve Klinger is staff field applications engineer at Applied Micro Circuits Corp. (San Diego).


See related chart
