Design Article
How to distribute multi-mode traffic flow management in multi-service networks
Vinoj Kumar, Product Architecture Manager, Chairman, NPF Software Interface Task Group, Agere Systems, Inc., Allentown, Penn.
5/9/2003 10:46 AM EDT
Today's networks, which carry voice, video and data traffic flows separately, sit at a crossroads and, given the current economic climate, may remain there for several years while the installed infrastructure migrates slowly but continuously toward new technologies to support new services.
Multimedia and data traffic are growing rapidly. In many ways, data and voice networks are converging. For example, with 3G wireless services, users are accessing the Web (data) using cell phones on a network designed for voice. Voice is being carried over IP data networks designed for best-effort delivery. One benefit of converging voice and data into a single flow is that it allows users to create a centralized place to store messages. For example, people use separate phone, fax, e-mail, pager, and cell phone services, but retrieving messages from different places is time-consuming and inconvenient. With VoIP, all of these means of communication can be centralized using a single message box.
With these benefits in mind, many carriers have their eye on a future based on an all-IP, I/O- and data-movement-intensive network. However, with the huge current infrastructure of Asynchronous Transfer Mode (ATM) networks in place, the transition to an all-IP future is likely to take years. During the long transition phase, service providers need flexibility to provision different types of services and to address many variations in traffic mix. The key to this capability lies with multi-service routers and switches.
Traditionally, voice and video traffic have been carried over ATM networks, while IP has carried data traffic. ATM is a connection-oriented communication paradigm that segments data into fixed-size packets called cells and has built-in QoS capabilities. The ATM Adaptation Layer (AAL) sits above the ATM layer and is responsible for formatting and segmenting packets into fixed-size cells, a process called Segmentation and Reassembly (SAR). Because ATM is connection-oriented, it is suitable for implementing a bandwidth guarantee mechanism emulating that of the conventional circuit-switched network.
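As a rough illustration of SAR, the sketch below splits a variable-length packet into 48-byte cell payloads in the style of AAL5, padding so that the packet plus an 8-byte trailer fills a whole number of cells. The function names are hypothetical, and the trailer's CRC field is left as zeros, so this is a simplified sketch rather than a conformant AAL5 implementation.

```python
CELL_PAYLOAD = 48   # payload bytes carried in each 53-byte ATM cell
TRAILER_LEN = 8     # AAL5-style trailer: UU, CPI, 2-byte length, 4-byte CRC

def segment(packet: bytes) -> list[bytes]:
    """Split a packet into 48-byte cell payloads (CRC omitted in this sketch)."""
    if len(packet) > 0xFFFF:
        raise ValueError("packet too long for a 16-bit length field")
    # Pad so that packet + pad + trailer is a whole number of cell payloads.
    pad = -(len(packet) + TRAILER_LEN) % CELL_PAYLOAD
    trailer = b"\x00\x00" + len(packet).to_bytes(2, "big") + b"\x00" * 4
    pdu = packet + b"\x00" * pad + trailer
    return [pdu[i:i + CELL_PAYLOAD] for i in range(0, len(pdu), CELL_PAYLOAD)]

def reassemble(cells: list[bytes]) -> bytes:
    """Rebuild the original packet from its cell payloads."""
    pdu = b"".join(cells)
    length = int.from_bytes(pdu[-6:-4], "big")  # length field inside the trailer
    return pdu[:length]
```

Under these assumptions a 100-byte packet occupies three cells, while a 41-byte packet needs two because the trailer no longer fits in the first cell. This padding overhead is one reason small packets are carried less efficiently over ATM.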
IP, on the other hand, is a connectionless technology with variable-sized packets that can be as large as 65,535 bytes. IP can be described as a best-effort delivery scheme with little or no traffic segregation and prioritization. In order to meet the delay, jitter, throughput and bandwidth guarantees required by time-critical data such as voice and video, IP networks must support provisioned service classes. Schemes to support provisioned classes of service, such as Differentiated Services (DiffServ), and traffic engineering schemes that enhance IP routing efficiency, such as Multiprotocol Label Switching (MPLS), are being developed and standardized.
Because delay-sensitive traffic will arrive as variable-sized packets as well as cells, network processors of the future must implement Traffic Management (TM) mechanisms that work for both cell-based and packet-based data. TM is broadly defined as the schemes and mechanisms used to implement QoS. Building such multi-mode traffic management into network processors introduces special challenges. For instance, scheduling algorithms designed for ATM are quite different from algorithms designed for IP networks. Differences in traffic management schemes can best be described by grouping the TM functions into three categories - policing, buffer management and scheduling.
Packet networks and ATM networks use separate schemes for traffic flow monitoring and policing. Because ATM uses fixed-size cells, a leaky bucket algorithm that releases tokens at a constant rate is used to implement policing/shaping. On the other hand, since packets can be of variable lengths, a token bucket scheme is used in IP networks. In this scheme, tokens accumulate at the policed rate, measured in bytes per second, and a packet conforms only if enough tokens are available to cover its length. A multi-service network processor must be able to support both types of policing/shaping mechanisms.
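The byte-based token bucket described above can be sketched as follows. The class and parameter names are illustrative assumptions, and the caller supplies the time explicitly to keep the example deterministic; a hardware policer would do the equivalent arithmetic per packet arrival.

```python
class TokenBucket:
    """Byte-based token bucket policer (sketch; names are illustrative)."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s   # policed rate
        self.burst = burst_bytes       # bucket depth, i.e. largest allowed burst
        self.tokens = burst_bytes      # start with a full bucket
        self.last = 0.0                # time of the last update, in seconds

    def conforms(self, packet_len: int, now: float) -> bool:
        """Return True and spend tokens if the packet is within contract."""
        # Accumulate tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_len <= self.tokens:
            self.tokens -= packet_len
            return True
        return False   # out of contract: the caller may drop, mark, or delay
```

With a rate of 1,000 bytes per second and a 1,500-byte bucket, a 1,500-byte packet at time zero conforms, a second packet arriving immediately afterwards does not, and half a second later 500 bytes of credit are available again.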
Heavy queue support
ATM virtual circuits carry thousands of voice conversations. These flows need to be prioritized and shaped at the network edge while the cells are switched at the network core. A multi-service network processor should support at least 64k queues (more is preferable) to be able to support voice channels in a carrier environment. In addition to supporting a large number of queues, the NP should support traffic isolation across these flows. Traffic isolation requires intelligent internal buffer management mechanisms to prevent a flow that violates a traffic contract from interfering with a totally separate and compliant flow.
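One simple way to picture traffic isolation is a shared buffer pool with a static per-queue limit, sketched below. This is only one of several possible buffer management policies, and the class name, method names and thresholds are assumptions made for illustration.

```python
class BufferManager:
    """Shared cell buffer with a static per-queue limit (illustrative sketch)."""

    def __init__(self, total_cells: int, per_queue_limit: int):
        self.free = total_cells        # cells still available in the pool
        self.limit = per_queue_limit   # most cells any one queue may hold
        self.used = {}                 # queue id -> cells currently held

    def admit(self, queue_id: int, cells: int = 1) -> bool:
        """Accept the cells unless the queue is over its limit or the pool is full."""
        held = self.used.get(queue_id, 0)
        if held + cells > self.limit or cells > self.free:
            return False               # only the offending queue sees the drop
        self.used[queue_id] = held + cells
        self.free -= cells
        return True

    def release(self, queue_id: int, cells: int = 1) -> None:
        """Return cells to the pool once they have been transmitted."""
        held = self.used.get(queue_id, 0)
        cells = min(cells, held)
        self.used[queue_id] = held - cells
        self.free += cells
```

A flow that exceeds its contract fills its own queue and is then refused further buffers, while a compliant flow on another queue continues to be admitted. Note that the guarantee is strict only if the per-queue limits do not add up to more than the pool size.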
While ATM traffic scheduling is fine-grained and flow-based, service differentiation with IP is much more coarse-grained and is referred to as Class-of-Service (CoS). Class-based scheduling schemes are used to implement DiffServ classes once the traffic has passed through the access points. Segregating data into various classes of service allows a carrier not only to prioritize traffic but also to offer newer differentiated services or Service Level Agreements (SLAs). This means that a network processor must support not only a flow-level scheduler but also a class-level scheduler.
Whether the aggregation is at the flow level or class level, the algorithm for selecting a packet from a queue to transmit varies significantly between cell-based ATM traffic and packet-based IP. For example, a Fair Queuing (FQ) algorithm that allocates equal bandwidth to all flows works well for fixed-sized cells but does not work well for variable-sized packets, because a flow sending large packets receives more bandwidth per turn than one sending small packets. Weighted Round Robin (WRR), typically used to allocate coarse-grained bandwidth, suffers from similar fairness problems. Weighted Fair Queuing (WFQ) and Smooth Deficit Weighted Round Robin (SDWRR) are variations of these algorithms that address the limitations of cell-based algorithms by modeling transmission bit by bit rather than in whole packets. Since we cannot schedule at the bit level, network processors typically approximate this behavior and implement the algorithms in hardware. Implementing such diverse schemes in hardware is quite a challenge.
Given the large number of flows and the deep hierarchy of scheduler levels, an all-software implementation of ATM traffic management schemes will not scale. Implementing policing, shaping, buffer management and scheduling on programmable VLIW engines is one possible architectural solution.
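To show how a scheduler can stay fair with variable-sized packets, the sketch below implements plain Deficit Round Robin (DRR), a well-known byte-credit scheme. It is not the SDWRR or WFQ algorithm named above, only a simpler relative, and the function signature is an assumption made for this example.

```python
from collections import deque

def drr(queues: dict, quanta: dict, rounds: int) -> list:
    """Run Deficit Round Robin for a fixed number of rounds.

    queues: name -> deque of packet lengths in bytes
    quanta: name -> bytes of credit added per round (acts as the weight)
    Returns the transmit order as (name, packet_len) tuples.
    """
    deficit = {name: 0 for name in queues}
    sent = []
    for _ in range(rounds):
        for name, q in queues.items():
            if not q:
                continue
            deficit[name] += quanta[name]
            # Send head-of-line packets while the accumulated credit covers them.
            while q and q[0] <= deficit[name]:
                pkt = q.popleft()
                deficit[name] -= pkt
                sent.append((name, pkt))
            if not q:
                deficit[name] = 0   # an emptied queue may not bank credit
    return sent
```

With equal quanta of 1,500 bytes, a queue of 1,500-byte packets and a queue of 500-byte packets each send 1,500 bytes per round, so bandwidth is shared by bytes rather than by packet count. A packet larger than the quantum simply waits until enough credit has accumulated over several rounds.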
The other major issue in multi-service platforms is maintaining QoS across the entire system, which involves line cards and fabric cards. A switch fabric is defined as a chip that switches various types of traffic. It is not sufficient for a network processor acting alone to implement traffic management mechanisms; the functions need to be distributed across network processors and switch fabric processors. This leads to a distributed traffic management architecture.
With a distributed traffic flow management design, traffic management is accomplished at two levels - by the network processors residing on multiple line cards and by switch fabrics residing on fabric cards that provide the back plane connectivity between line cards. Unless the traffic management is implemented in both places, it is not possible to achieve the true system-level QoS required by multi-service networks. Partitioning these traffic management functions and the interaction between the network processor and the switch fabric has a huge bearing on how QoS is perceived at the system level.
Because the switch fabric is also a queuing device, designers need to consider traffic management in the fabric as well. A switch fabric needs to be protocol independent to support the various traffic types. Space and power are important considerations for multi-service routers. If separate fabric processors are designed for each type of traffic, then the number of fabric cards in a switch increases, thereby increasing the space and power requirements. In addition to being protocol independent, the switch fabric must be able to provide traffic isolation between flows. The fabric should also guarantee bandwidth to certain flows or traffic types - this is in fact often the basis of service level agreements (SLAs), which are important revenue streams for service providers.
And just as the network processor must be able to identify priority traffic and react accordingly, so too must the switch fabric; otherwise, the higher priority traffic loses its distinction when it exits the line card. Switch fabrics are now also being designed to be work conserving, meaning that they can use bandwidth allocated to other streams, flows, or traffic types when that bandwidth is sitting idle. This is an important revenue enhancer and resource efficiency boost for any service provider.
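The work-conserving idea can be illustrated with a minimal slot scheduler. In the sketch below, each transmission slot is reserved for one queue, but an idle reservation is handed to the next backlogged queue instead of being wasted. The function name, arguments and fallback order are assumptions for illustration, not a description of any particular fabric.

```python
def pick_queue(slot_owner, backlog, order):
    """Choose which queue to serve in the current slot.

    slot_owner: queue the slot is reserved for
    backlog:    queue name -> number of waiting cells or packets
    order:      fallback order used when the owner has nothing to send
    Returns a queue name, or None if every queue is empty.
    """
    if backlog.get(slot_owner, 0) > 0:
        return slot_owner            # the reserved bandwidth is being used
    for name in order:
        if backlog.get(name, 0) > 0:
            return name              # lend the idle slot to a waiting queue
    return None                      # nothing to send anywhere
```

A non-work-conserving scheduler would return nothing whenever the slot owner is idle, leaving the link unused even though other queues have traffic waiting.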
Scaling capacity
Scalability is an important requirement for multi-service equipment. For a switch/router to scale, the switch fabric must also scale, both in capacity (throughput and port density) and in the physical design of the switch/router. For multi-service networks, fabric chip data rates must range from at least OC-12 to OC-768 with a supported total system throughput of 2.5Tbit/sec. System scalability means that as the carrier's bandwidth requirement changes, the carrier should be able to add/remove line cards and fabric cards without having to redesign the switch.
To support system scalability, the fabric chipset must allow a system designer to trade off low entry-cost systems with few back plane connections against systems with a higher entry cost that can be upgraded in the future. One possible architectural solution to meet these types of scalability requirements is to design a fabric as a chipset comprising a queue scheduling chip and a switching chip (crossbar). Having such a physical partitioning enables a flexible system design such as a centralized or distributed fabric configuration.
In a centralized fabric chipset configuration, both the queue scheduler and the switching (crossbar) chips of the fabric chipset reside on fabric cards. In a distributed configuration, the queue scheduler component of the fabric chipset resides on the line card while the crossbar chip resides on the fabric card. While a centralized configuration offers a low cost solution, the distributed configuration allows a system to be shipped with fewer line cards and fabric cards and upgraded with more line and fabric cards as the demand grows.
Given that a switch fabric is a queuing device, the switch fabric can experience congestion, especially if it receives data at the input ports faster than it can dispatch the data at the output ports. In this case, it is important for the fabric to tell the network processor to slow down, a process called "backpressure." Multiple backpressure levels are in fact desirable in a multi-service network: at the port level if an I/O port cannot handle the traffic assigned to it; at the queue level for flow level rate control; and at the device level if a device in the system becomes overloaded for any reason.
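The three backpressure levels can be sketched as a function that inspects buffer occupancy and decides which signals the fabric sends to the network processor. The occupancy inputs, the single 0.8 threshold and the rule that a coarser signal suppresses the finer ones beneath it are all assumptions chosen for this example.

```python
def backpressure_signals(device_fill, port_fill, queue_fill, threshold=0.8):
    """Return the backpressure signals to assert, coarsest level first.

    device_fill: occupancy fraction of the whole device, 0.0 to 1.0
    port_fill:   port id -> occupancy fraction
    queue_fill:  (port id, queue id) -> occupancy fraction
    """
    if device_fill >= threshold:
        return [("device", None)]    # whole device overloaded: stop everything
    signals = [("port", p) for p in sorted(port_fill)
               if port_fill[p] >= threshold]
    blocked_ports = {p for _, p in signals}
    # Queue-level signals matter only on ports that are not already paused.
    signals += [("queue", pq) for pq in sorted(queue_fill)
                if queue_fill[pq] >= threshold and pq[0] not in blocked_ports]
    return signals
```

Keeping the signal set small in this way reflects the requirement that a backpressure scheme consume little bandwidth: one port-level message replaces many queue-level ones.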
In addition to supporting traffic management functions, switch fabrics need to work seamlessly with network processors for a true system level QoS. This requires that both the NP and switch fabric implement an efficient backpressure scheme characterized by low latency and low bandwidth utilization. Programmability and the high data rates required also suggest that a hardware implementation of the basic mechanisms is desirable. The partitioning of the traffic management functions across a switch fabric and NPs will dictate the efficiency and the scalability of the ultimate solution.
With all these design issues underlying multi-service applications, NPs are replacing fixed-function ASICs by offering programming flexibility while maintaining performance levels. Extending the life of ATM networks and migrating packet networks for time-sensitive voice and video traffic requires a switch/router to support multiple protocols. This in turn requires network processors to support SAR functions, flow-based and class-based queue-scheduling schemes, and distributed traffic management across network processors and switch fabrics. A network processor without such features is not suitable for a multi-service network.



