Design Article

Getting the most packet processing throughput per application flow

Josh Cochin, Hardware Engineering Manager, Steve Kohalmi, Chief Systems Architect, Quarry Technologies, Inc., Burlington, Mass.

5/9/2003 11:01 AM EDT

The demand for advanced services, such as Internet Protocol Virtual Private Networks (IP VPNs), application-level Quality of Service (QoS), data encryption, managed firewalls, and address translation, has not subsided, and many customers have expressed a willingness to pay premiums for such special treatment of their important network traffic.

Unfortunately, to date, the trials of service-enabling systems haven't gone as well as hoped. Most cannot handle the concurrent service mix and high-touch packet-processing throughput required by large numbers of higher-bandwidth business users who need to be sure their time-sensitive and mission-critical applications always receive appropriate precedence and security.

To solve this problem, a new architecture for the next-generation IP service edge switch is emerging that uses high-speed ASICs in combination with programmable network processors to yield maximum packet processing performance and feature flexibility. Interestingly, it is not focused on switching bandwidth or interface speeds, but instead on what really matters to support the business needs of service providers' customers: maximum packet processing throughput per application flow, without any compromises.

By distinctly identifying subscriber traffic and classifying application flows at the network boundary, the IP service edge switch is able to apply QoS and encryption algorithms correctly, and to aggregate flows prior to sending the traffic on to the shared IP backbone. Attempting to initiate QoS or security techniques elsewhere in the network could be futile, since the traffic would first have to cross the backbone in the clear where it might be subject to congestion and security breaches.
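The classification step described above can be sketched as follows. This is only an illustration, assuming a flow is identified by the subscriber plus the usual 5-tuple; the Packet record, the CLASS_RULES table, and the class names are hypothetical and do not come from any particular product.

```python
from collections import namedtuple

# Hypothetical packet record; the field names are illustrative only.
Packet = namedtuple("Packet", "subscriber src dst proto sport dport length")

def flow_key(pkt):
    """Identify an application flow by subscriber plus the 5-tuple."""
    return (pkt.subscriber, pkt.src, pkt.dst, pkt.proto, pkt.sport, pkt.dport)

# Assumed policy table: (protocol, destination port) -> service class.
CLASS_RULES = {
    ("udp", 5060): "voice",
    ("tcp", 1521): "mission-critical",
}

def classify(pkt):
    """Return the service class for a packet, defaulting to best effort."""
    return CLASS_RULES.get((pkt.proto, pkt.dport), "best-effort")

call = Packet("acme", "10.0.0.1", "10.0.0.2", "udp", 40000, 5060, 200)
web = Packet("acme", "10.0.0.1", "10.0.0.3", "tcp", 40001, 80, 1500)
print(classify(call), classify(web))  # voice best-effort
```

Once each packet carries a flow key and a class, QoS and encryption policy can be applied per flow or per class before the traffic reaches the shared backbone.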

The real challenge for the IP service edge switch is the ability to apply all required services concurrently to even the largest application flows, or class-based aggregate flows, without introducing performance-impacting latency or jitter. Additionally, because these services must be managed and ultimately billed for, detailed statistics collection is required. And, because the switch will be widely deployed in local communities, it must deliver all these capabilities at a reasonable cost.

Such uncompromising performance is only possible if the IP service edge switch is built upon an architecture that is up to the challenge. Service providers looking to attract and retain lucrative business customers using advanced IP services will want to understand what makes these essential capabilities possible.

In general, an IP service edge switch receives data on its input interfaces, processes it, and as part of this processing determines the appropriate output interfaces, then sends it out those particular interfaces. Because the switch is designed for the edge of the network, this data-handling flow is not usually symmetrical.

On ingress to the network, many thousands of subscribers' data flows are aggregated onto a few backbone links. While the backbone links are quite large, it is still possible for them to be oversubscribed, and therefore it is important that subscriber traffic is properly aggregated according to the relative priority of each traffic type.

In the opposite direction, data arriving from the network backbone has to be channeled into small subscriber access links. As diverse traffic flows are transmitted onto individual subscriber links that are much smaller in throughput than the core-side links, they have to be properly prioritized and controlled.
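One simple way to picture priority-based aggregation onto an oversubscribed link is a strict-priority queue. The sketch below assumes strict priority with first-in, first-out order within a class; that is only one possible policy, and the PriorityAggregator class and the priority values are hypothetical.

```python
import heapq
from itertools import count

class PriorityAggregator:
    """Strict-priority aggregation: lower number = higher priority."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a class

    def enqueue(self, priority, packet):
        heapq.heappush(self._heap, (priority, next(self._seq), packet))

    def dequeue(self):
        """Return the next packet to transmit, or None if nothing is queued."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

link = PriorityAggregator()
for prio, name in [(2, "bulk-1"), (0, "voice-1"), (1, "web-1"), (0, "voice-2")]:
    link.enqueue(prio, name)

drained = []
while (pkt := link.dequeue()) is not None:
    drained.append(pkt)
print(drained)  # ['voice-1', 'voice-2', 'web-1', 'bulk-1']
```

A production traffic manager would normally add rate limits or weighted scheduling so that low-priority traffic is not starved; the sketch leaves that out.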

This aggregation of data flows requires a switch and packet processing architecture that can handle individual flows just as well as it handles aggregated flows, that can differentiate between traffic types, properly manage traffic aggregation, and account for all data passing through the switch. Since the switch is deployed at a critical point of the network, it must also provide reliable operation even in case of failures.

Many architectural choices must be made around how I/O, packet processing, and traffic manager modules are connected together. For reference, the most basic configuration would comprise one input, one packet processor, one traffic manager, and one output. A much more flexible and interesting system would have several of each of these components.

But before such a system can be built, two issues must be settled. First, the size of each component must be decided. Each I/O module obviously has to be sized in accordance with the particular media it services. Each packet processing module has to be able, at a minimum, to handle all the processing requirements of the largest single traffic flow. Since the largest traffic flow can be as large as the largest I/O port in the system, packet-processing bandwidth has to be sized accordingly.
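To give a feel for what sizing to the largest port means, the arithmetic below works out the worst-case packet rate and the per-packet time budget. It assumes a Gigabit Ethernet port carrying minimum-size 64-byte frames with 20 bytes of preamble and inter-frame gap per frame; the function names are illustrative.

```python
def packets_per_second(link_bps, frame_bytes, overhead_bytes=0):
    """Worst-case packet rate when every frame has the given size."""
    return link_bps / ((frame_bytes + overhead_bytes) * 8)

def per_packet_budget_ns(link_bps, frame_bytes, overhead_bytes=0):
    """Time available per packet without falling behind line rate."""
    return (frame_bytes + overhead_bytes) * 8 / link_bps * 1e9

GIGE_BPS = 1_000_000_000  # nominal Gigabit Ethernet line rate

# 64-byte minimum frame plus 8 bytes preamble and 12 bytes inter-frame gap.
pps = packets_per_second(GIGE_BPS, 64, 20)
budget = per_packet_budget_ns(GIGE_BPS, 64, 20)
print(f"{pps:,.0f} packets/s, {budget:.0f} ns per packet")
# 1,488,095 packets/s, 672 ns per packet
```

Every service applied to the flow (classification, encryption, address translation, statistics) has to fit inside that per-packet budget, which is why the processing module is sized by the largest single flow rather than by the average load.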

The second issue to be decided is how a traffic flow should be aggregated relative to all other flows. This aggregation decision must be made after packet processing has been completed. It is the foundation of the QoS treatment applied to a traffic flow. The traffic-manager module is responsible for this aggregation.

In a perfect system, all data destined to an output port would be received in the output port's traffic manager where QoS and flow aggregation decisions are made. The basic reference architecture therefore places the packet processor on the input side of the switch and the traffic manager on the output side. Physically locating packet processing and traffic management together with the I/O module may seem like a good idea until the cost ramifications are examined.

This is because there is a huge difference in the bandwidth requirements of I/O modules for an IP service edge switch, ranging from low-speed access (T1, DS3) to high-speed core-side interfaces (OC-12, Gigabit Ethernet). While one could use the same packet processing and traffic management hardware for all I/O modules, the cost to support lower-speed interfaces would prove to be prohibitively high. Conversely, optimizing the packet-processing and traffic-management designs for each I/O module type is not practical from the engineering resource and support aspects.

Sharing resources

A better solution to the design of an IP service edge switch is to make packet processing and traffic management separate resources, which may be shared among multiple I/O modules. This has the advantage of making efficient use of development and system resources, improving scalability, and facilitating system redundancy. Additionally, packet processing and traffic management are inherently separate processes (one being ingress and the other egress). As such, placing a switch fabric between them makes sense as well.
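The economics of sharing can be illustrated with rough arithmetic. The sketch below uses nominal line rates for the interface types mentioned earlier and assumes, purely for illustration, a packet-processing module sized for 1 Gbps; it ignores per-port overheads and any headroom a real design would keep.

```python
# Nominal line rates in Mbps.
LINK_MBPS = {"T1": 1.544, "DS3": 44.736, "OC-12": 622.08, "GigE": 1000.0}

MODULE_MBPS = 1000.0  # assumed capacity of one shared processing module

def utilization(port, module_mbps=MODULE_MBPS):
    """Fraction of the module used if it were dedicated to a single port."""
    return LINK_MBPS[port] / module_mbps

def ports_per_module(port, module_mbps=MODULE_MBPS):
    """How many ports of one type fit within one shared module."""
    return int(module_mbps // LINK_MBPS[port])

for port in LINK_MBPS:
    print(f"{port:6s} dedicated use {utilization(port):8.3%}  "
          f"ports per shared module {ports_per_module(port)}")
```

Under these assumptions a module dedicated to a single T1 would sit more than 99 percent idle, while a shared module could serve several hundred T1 ports, which is the cost argument for decoupling processing from the I/O modules.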

While the I/O, switch-fabric, and traffic-management modules are important components of the IP service edge switch, the real heart and brains of the system is the packet-processing module. It is here that careful attention to design and optimization yields the greatest benefit.

First-generation IP services routers employed general-purpose processors to handle traffic. The bandwidth of each of these individual processors is limited to around 30 to 100 Mbps of packet throughput, depending on the services performed, and data flow striping cannot be used to increase the throughput per flow because of packet sequencing issues.
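The sequencing problem with striping can be seen in a toy timing model. The sketch assumes one packet arrives per time unit, packets of a single flow are dealt round-robin to independent processors, and each packet has its own processing cost; the function name and the numbers are illustrative.

```python
def striped_departure_order(costs, n_processors):
    """Order in which packets of one flow finish when striped round-robin.

    costs[i] is the processing time of packet i, which arrives at time i.
    """
    free_at = [0] * n_processors  # time at which each processor is idle again
    finished = []
    for seq, cost in enumerate(costs):
        proc = seq % n_processors
        start = max(seq, free_at[proc])  # wait for arrival and for the processor
        free_at[proc] = start + cost
        finished.append((free_at[proc], seq))
    return [seq for _, seq in sorted(finished)]

# Packet 0 needs heavy processing (say, encryption of a large payload).
costs = [5, 1, 1, 1]
print(striped_departure_order(costs, 2))  # [1, 3, 0, 2]
print(striped_departure_order(costs, 1))  # [0, 1, 2, 3]
```

With two processors the flow leaves out of order, so the switch would need a resequencing stage; with one processor order is preserved but throughput is capped by that single processor.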

Two major problems with this approach should be immediately obvious: each processor is limited to much less than the potential bandwidth of an individual traffic flow; and even this meager throughput is dependent on the types and quantities of services that need to be performed. Advanced IP services can only be effectively applied if neither of those restrictions is present. As a result, a packet-processing module must be able to handle the full bandwidth of the maximum size flow regardless of the processing complexity.

The best way to build a packet-processing module is to use a pipeline of network processors and assist them in the execution of special functions using ASICs and other specialty chips. In this way the IP service edge switch is able to provide unmatched speed and flexibility while also avoiding sequencing issues of parallel processing systems. Multiple pipelines can then be interconnected via the switch fabric to provide redundancy, load sharing and system scalability.
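A similar toy model shows why a pipeline avoids the sequencing problem. It assumes each stage is a first-in, first-out processor with a fixed per-packet cost and that one packet arrives per time unit; the stage costs are made up for illustration.

```python
def pipeline_departures(n_packets, stage_costs):
    """Departure time of each packet from the last stage of a FIFO pipeline."""
    ready = list(range(n_packets))  # packet i arrives at time i
    for cost in stage_costs:
        free_at = 0
        out = []
        for t in ready:
            start = max(t, free_at)  # wait for the packet and for the stage
            free_at = start + cost
            out.append(free_at)
        ready = out
    return ready

departures = pipeline_departures(4, [1, 2, 1])
print(departures)  # [4, 6, 8, 10]
```

Departure times are strictly increasing, so packets leave in arrival order, and the spacing between them equals the cost of the slowest stage. That slowest stage is the natural place to add an assisting ASIC or coprocessor.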

The network processors can receive and transmit the full bandwidth of the largest port in the system, then distribute this traffic to ASICs and off-the-shelf coprocessors. Thus, the network processor pipeline provides bandwidth to and from these ASICs and coprocessors that far exceeds the actual data throughput.

The ASICs and coprocessors provide specialty processing functions. They are lookup, database, search, and encryption engines, providing adjunct processing to the network processors. All packet manipulation decisions lie with the network processors. This division of labor improves system flexibility, and allows for future growth and evolution of services.
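The division of labor can be sketched in software terms: a lookup engine answers a narrow question, and the network processor decides what to do with the answer. The LookupEngine class, the route table, and the port names below are hypothetical stand-ins; a real lookup coprocessor would do this in hardware rather than with a linear scan.

```python
import ipaddress

class LookupEngine:
    """Stand-in for a lookup coprocessor: longest-prefix match on a route table."""

    def __init__(self, routes):
        self._routes = [(ipaddress.ip_network(prefix), port)
                        for prefix, port in routes.items()]

    def longest_match(self, addr):
        ip = ipaddress.ip_address(addr)
        best = None
        for net, port in self._routes:
            if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
                best = (net, port)
        return best[1] if best else None

def network_processor_decide(dst, engine):
    """The packet-handling decision stays with the network processor."""
    port = engine.longest_match(dst)
    return ("forward", port) if port is not None else ("drop", None)

engine = LookupEngine({"10.0.0.0/8": "core-1", "10.1.0.0/16": "access-7"})
print(network_processor_decide("10.1.2.3", engine))   # ('forward', 'access-7')
print(network_processor_decide("192.0.2.1", engine))  # ('drop', None)
```

Because the engine only returns results, the decision logic in the network processor can be reprogrammed for new services without changing the specialized hardware.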




