News & Analysis
Net processor startup takes pipelined path to 40 Gbits/s
Anthony Cataldo
7/2/2001 12:43 PM EDT
SAN JOSE, Calif. - Swedish startup Xelerated Packet Devices said it will defy conventional wisdom by introducing, by next year, a fully pipelined network processor that operates at 40-Gbit/second wire speed.
While most network processor vendors use multiple processing engines to divide the packet-processing labor, Xelerated believes its programmable pipeline architecture, which it calls the Packet Instruction Set Computer (PISC), can do away with the programming baggage that restricts multiprocessor architectures.
One of the hallmarks of the architecture is that it handles packets in the order they come in, which means it doesn't have to contend with the packet reordering that often strains network processor performance. Under this scheme, the Stockholm-based company says, PISC can be programmed to do tasks like filtering, fine-grained traffic conditioning, link sharing, tunneling, encapsulation/de-capsulation and shaping simultaneously.
"We've recognized that the packet stream is in layers. What you would like is to do classification on Layer 2, and on Layer 3 do classification with some action and so forth, throughout the packet. This is exactly what we have," said Thomas Eklund, cofounder of Xelerated Packet Devices.
At every stage in the pipeline, the PISC performs a classification and action operation to get wire-speed performance, the company claims. This is possible because each PISC stage acts as a mini-processor, with the ability to do things like arithmetic logic unit instructions and branches. There are 10 stages in the pipeline, which today clocks at 240 MHz. If more stages are needed, devices can be cascaded, Eklund said.
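The classify-then-act pipeline described above can be sketched in a few lines of Python. This is only an illustrative model, not Xelerated's instruction set or toolchain: the names Packet, Stage and run_pipeline, the three example rules and the one-tick-per-stage timing are all assumptions made for the sketch. It shows packets entering in order, every stage working on its own packet in the same tick, and each packet leaving after the same number of ticks.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Packet:
    headers: Dict[str, int]
    notes: List[str] = field(default_factory=list)  # actions applied so far


@dataclass
class Stage:
    name: str
    classify: Callable[[Packet], bool]  # does this stage's rule match?
    act: Callable[[Packet], None]       # what to do when it matches


def run_pipeline(stages: List[Stage], packets: List[Packet]) -> List[Tuple[int, Packet]]:
    """Return (exit_tick, packet) pairs; each packet spends len(stages) ticks inside."""
    slots: List[Optional[Packet]] = [None] * len(stages)
    incoming = list(packets)
    done: List[Tuple[int, Packet]] = []
    tick = 0
    while incoming or any(p is not None for p in slots):
        tick += 1
        # Shift every packet forward one stage; a new packet enters stage 0.
        slots = [incoming.pop(0) if incoming else None] + slots[:-1]
        # All stages work in the same tick, each on its own packet.
        for stage, pkt in zip(stages, slots):
            if pkt is not None and stage.classify(pkt):
                stage.act(pkt)
        # The packet in the last stage leaves once that stage has run.
        if slots[-1] is not None:
            done.append((tick, slots[-1]))
            slots[-1] = None
    return done


# Hypothetical three-stage program: one rule per protocol layer.
stages = [
    Stage("layer2", lambda p: p.headers.get("ethertype") == 0x0800,
          lambda p: p.notes.append("ipv4")),
    Stage("layer3", lambda p: p.headers.get("ttl", 255) <= 1,
          lambda p: p.notes.append("drop")),
    Stage("layer4", lambda p: p.headers.get("dport") == 80,
          lambda p: p.notes.append("web-class")),
]
packets = [
    Packet({"ethertype": 0x0800, "ttl": 64, "dport": 80}),
    Packet({"ethertype": 0x0806}),
    Packet({"ethertype": 0x0800, "ttl": 1, "dport": 22}),
]
results = run_pipeline(stages, packets)
```

In this model the packets leave in arrival order, one per tick, and each spends exactly as many ticks in the pipeline as there are stages, which is the fixed, jitter-free delay the design trades for.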
One of the trade-offs of this approach, Eklund said, is a fixed delay in the pipeline. But he said these delays "are in microseconds," and having deterministic performance makes up for the deficiency. "It's better to have a fixed delay in the pipeline than to have jitter characteristics," he said.
As it is, the device is able to handle packets coming in at 40 Gbits/s, but it can scale linearly up to 160 Gbits/s using 0.13-micron design rules. The architecture leans heavily on the additional performance afforded by this latest chip-processing technology, which is in the early stage of rollout at some foundries. "This architecture was not possible to do until 0.13 micron," Eklund said.
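To give a rough sense of what wire speed means at these rates, the back-of-the-envelope calculation below converts a line rate into a packet rate and a per-packet clock budget. The 40-Gbit/s and 160-Gbit/s rates and the 240-MHz clock come from the figures quoted here; the 40-byte packet size is an assumption chosen as a small-packet worst case, and framing overhead is ignored, so the results are indicative only.

```python
def packets_per_second(line_rate_bps: float, packet_bytes: int) -> float:
    """Packet rate needed to keep up with the line at a given packet size."""
    return line_rate_bps / (packet_bytes * 8)


def cycles_per_packet(clock_hz: float, line_rate_bps: float, packet_bytes: int) -> float:
    """Clock cycles available between successive packet arrivals."""
    return clock_hz / packets_per_second(line_rate_bps, packet_bytes)


CLOCK_HZ = 240e6      # pipeline clock quoted in the article
PACKET_BYTES = 40     # assumed small-packet size, not from the article

pps_40g = packets_per_second(40e9, PACKET_BYTES)
budget_40g = cycles_per_packet(CLOCK_HZ, 40e9, PACKET_BYTES)
pps_160g = packets_per_second(160e9, PACKET_BYTES)

print(f"40 Gbit/s:  {pps_40g / 1e6:.0f} Mpackets/s, {budget_40g:.2f} cycles per packet")
print(f"160 Gbit/s: {pps_160g / 1e6:.0f} Mpackets/s")
```

Under these assumptions a new packet arrives roughly every two clock cycles at 40 Gbits/s, which suggests why a design would keep every stage busy on a different packet rather than let any one engine spend many cycles on a single packet.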
While network processors use multiple RISC cores to enable some level of programmability, which isn't possible with fixed ASICs, Xelerated argues that balancing the processing load among multiple RISC engines is a headache for developers and ends up throttling performance. "If you have 16 processor cores and then decide to scale up, you have to redo all your lower-level code," Eklund said.
But while many NPU vendors say there is a real need for C compilers to ease the programming burden, Xelerated believes this could put too much of a damper on performance.
"In a sequential programming model, if you put a lot of wrappers around it and have a compiler you will lose some of the utilization of the pipeline performance," Eklund said. "We have a very easy assembly language, which is pretty straightforward, and our customer feedback is telling us that it's easy to use."
So far, Xelerated has demonstrated several stages of its PISC architecture on an FPGA, and plans to test its first sample devices out of Taiwan Semiconductor Manufacturing Co. next April. The company plans to initially provide a packet processor, dubbed X40, and a traffic manager, the T40. To get to 40 Gbits/s, the network processors will work with CAM engines running at 166 MHz, Eklund said.



