Design Article
Achieving near-zero latency faster
Syed Saulat Hussain and Glen Young
8/1/2003 8:18 AM EDT
Whether virtual or not, a network demands high availability for its data, and that data must be highly secure because it is regarded as an enterprise's crown jewels. Network OEMs now provide an array of hardware- and software-based network security products. But these solutions take the form of security software or a hardware VPN box, both of which are expensive and require engineering expertise to make them work in an existing network infrastructure.
For network security, OEMs typically rely on primary/secondary techniques. If the primary system crashes, the secondary picks up within a few minutes or so. However, this approach is not very effective and does not achieve near zero latency. This becomes a problem for enterprise or IT managers who have to provide a secure network, with firewall, VPN, denial-of-service protection, performance and resiliency, for remote sites and telecommuters.
Today, network system designers focus on their core product features: bandwidth and reliability for routers; performance, reliability and availability for servers; or storage area networks (SANs) delivering solutions for IP storage applications. It would be highly beneficial if they could seamlessly add network security functions to existing solutions. This approach would reduce the network bottleneck, which in some cases is due to software-based security. At other times, security is based on primary/secondary techniques, which result in low performance and limited scalability when handling high network traffic and high availability across all existing product deployment areas.
For example, servers and network edge equipment have some similarities, but there are more differences in the data flow mechanism, the central control scheme, and security packet processing across the security landscape. In these instances, however, hardware acceleration can provide the extra processing muscle these security functions demand.
To meet wire speed, data path throughput becomes more important in today's network systems, edge routers, content switches, and application servers. The policy-handling engine, with its sophisticated protocol software, thus proves to be the performance bottleneck. If high availability is implemented, the extra overhead on the memory space, bus interface, and look-up monitor combines to slow the system down.
Network security based on VPN, secure sockets layer (SSL), and flow classification also involves high-level protocol exchange and policy look-up. Existing solutions cannot even meet the throughput requirement because these functions alone cost the main processor more than 50 percent of its bandwidth. Without a high-performance co-processing accelerator subsystem to handle these jobs, it is virtually impossible to meet the performance requirement. This is one of the reasons network equipment and servers are converging on a blade-based architecture. With plug-and-play feature blades, systems can be configured to meet feature and performance needs without integration issues.
Adding duplicate blades to the system can also provide high availability. But the traditional way to handle high availability is still to configure one blade as the primary processor and the other as the secondary processor. Designers should take one issue into consideration here: a primary system operating with a secondary snooping system takes considerable time to recover when the primary fails.
Maintaining high availability relies on the system control software stack to keep track of every session's status, especially at the Transmission Control Protocol (TCP) layer and above, and to re-establish all the sessions so the secondary processor can take over. The resulting gaps last milliseconds or seconds, depending on session types and volume.
The high availability requirement in network security is even more critical at the system implementation level. During the re-establishment cycles, data in security tunnels or sessions might be lost or hacked by intruders. Moreover, the data stream has to restart from the point where it was dropped. The high-level protocol engine needs to reassemble previous packets and reconnect to previous records to continue operation. How to shorten the gap between failure and takeover is a key design issue for network system engineers.
The easiest way is to have a look-up memory space large enough to keep every session's current status. Primary and secondary processors share the same program indicator, which always points to the file that needs to be recovered. The upside is that neither a software upgrade nor a protocol update is needed. However, memory access speed and memory bus throughput create a system bottleneck. In addition, the more fail over performance is needed, the higher the hardware upgrade cost. More importantly, it's still not a guaranteed solution because of system bus overhead.
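The shared look-up memory idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `SharedSessionTable`, `Processor`, and `handle` are hypothetical, a dictionary stands in for the look-up memory, and the sketch assumes the host can tell immediately that the primary is down.

```python
from dataclasses import dataclass, field


@dataclass
class SharedSessionTable:
    """Stand-in for the shared look-up memory (hypothetical name)."""
    next_seq: dict = field(default_factory=dict)  # session id -> next expected sequence number

    def update(self, session_id, seq):
        self.next_seq[session_id] = seq + 1


class Processor:
    """A primary or secondary blade that records progress in the shared table."""

    def __init__(self, name, table):
        self.name = name
        self.table = table
        self.alive = True

    def process(self, session_id, seq):
        if not self.alive:
            raise RuntimeError(f"{self.name} is down")
        self.table.update(session_id, seq)
        return (self.name, session_id, seq)


def handle(primary, secondary, session_id, seq):
    """Route a packet to the primary, or to the secondary if the primary is down."""
    active = primary if primary.alive else secondary
    return active.process(session_id, seq)


table = SharedSessionTable()
primary = Processor("primary", table)
secondary = Processor("secondary", table)

handle(primary, secondary, session_id=1, seq=0)
handle(primary, secondary, session_id=1, seq=1)
primary.alive = False  # simulated primary failure
result = handle(primary, secondary, session_id=1, seq=2)
```

Because both processors read and write the same table, the secondary continues session 1 at sequence 2 without re-establishing it. Every packet touches the shared table, which is where the memory access and bus throughput bottleneck described above would appear in real hardware.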
VLAN configuration is another way to keep high availability performance in good shape. Here, multiple blades are configured to perform the same function as multiple virtual local area networks (VLANs). Once the connection fails on one VLAN, the host control system can switch the continuing sessions to the next VLAN. Meanwhile, IT personnel can fix the problem without waiting for the secondary system to re-establish connections. However, the TCP layer protocol processor still needs to repair the sessions that were lost. In this scenario, those particular sessions might be lost forever or take a long time to wake up, and they can be hacked during the down time. In the worst case, the key information might be hacked as well.
Another way is to configure a concurrent VLAN scheme to perform the security function. The host needs robust arbitration capability to determine which VLAN has the most current sessions. On the data path, the host automatically drops the obsolete VLAN result and allows the most current one to pass through. However, the arbitration mechanism needs to be reliable enough to handle wire-speed transactions. Otherwise, the whole fail over mechanism collapses with irreparable damage. This concurrent VLAN solution can therefore provide system-performed network security and high availability at wire speed, but there is a small possibility that it slows down the entire security network.
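A minimal Python sketch of this arbitration step follows. It makes several assumptions that are not in the article: each packet carries a unique label, every VLAN processes every packet, and the first result to arrive for a label counts as the most current one. The `Arbiter` class and its method names are hypothetical.

```python
class Arbiter:
    """Passes the first result for each packet label and drops later copies."""

    def __init__(self):
        self.delivered = set()  # labels that have already passed through
        self.output = []        # (label, payload, vlan_id) in delivery order

    def on_result(self, vlan_id, label, payload):
        """Return True if the result passes, False if it is dropped as obsolete."""
        if label in self.delivered:
            return False  # another VLAN already delivered this packet
        self.delivered.add(label)
        self.output.append((label, payload, vlan_id))
        return True


arbiter = Arbiter()

# Both VLANs process packet 0; VLAN 1 answers first, so VLAN 2's copy is dropped.
first = arbiter.on_result(vlan_id=1, label=0, payload="pkt0")
duplicate = arbiter.on_result(vlan_id=2, label=0, payload="pkt0")

# VLAN 1 then fails and never answers for packet 1; VLAN 2's result passes instead.
after_failure = arbiter.on_result(vlan_id=2, label=1, payload="pkt1")
```

The host sees one result per packet regardless of which VLAN produced it, so the failure of one VLAN is invisible on the data path. A hardware arbiter would need to bound the `delivered` state, for example with a sliding window of labels, which this sketch leaves out.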
The most efficient and reliable mechanism for wire-speed secure network operation should be flexible and manageable. For example, PaxcelNet architects its network security, VPN and SSL, to be highly manageable and reliable. More importantly, this network security accelerator attains near zero latency fail over without adding extra overhead to the host system. First, the concurrent VLAN configuration performs fail over recovery with no latency and provides 100 percent high availability by configuring multiple VLANs to operate simultaneously.
Obsolete processed sessions from any slower VLAN are dropped automatically once the most current session from another VLAN has passed through. This provides a seamless fail over mechanism when one VLAN fails or malfunctions. From the host's point of view, it always sees the result returning in real time from the interface to the security subsystem, or acceleration blade, without knowing that anything has gone wrong in the security processing.
To make sure the crucial fail over occurs, a bank of memory is implemented in the flow control device to hold all the packets, with their pass-through labels (part of the packet header), that are sent to the security subsystem, which is configured as a VLAN in the overall system scheme. On the ingress path from the VLAN, the flow controller simply compares the pass-through label in each processed security packet to verify the flow-through sequence.
If the flow controller detects an out-of-sequence packet flow, it indicates that the arbitration device is wrong, and the flow controller re-sends the packets starting from the out-of-sequence point. In this instance, the designer is doubly insured for fail over: a VLAN configuration is used, and packet status is kept in the flow control device.
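The label check and re-send behavior can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any actual device: labels are taken to be consecutive integers, the `send` callback stands in for the path to the security subsystem, and the `FlowController` class and its methods are hypothetical names.

```python
from collections import OrderedDict


class FlowController:
    """Holds sent packets by label and re-sends on out-of-sequence returns."""

    def __init__(self, send):
        self.send = send              # callable(label, packet) toward the security subsystem
        self.pending = OrderedDict()  # label -> packet, in the order sent
        self.next_label = 0

    def egress(self, packet):
        """Label a packet, keep a copy in the memory bank, and send it."""
        label = self.next_label
        self.next_label += 1
        self.pending[label] = packet
        self.send(label, packet)
        return label

    def ingress(self, label):
        """Check a returning label; return True if it arrived in sequence."""
        if not self.pending:
            return False
        expected = next(iter(self.pending))  # oldest label still outstanding
        if label == expected:
            del self.pending[label]  # processed in order, release the copy
            return True
        # Out of sequence: re-send everything from the expected label onward.
        for lbl, pkt in list(self.pending.items()):
            self.send(lbl, pkt)
        return False


sent = []
fc = FlowController(lambda label, packet: sent.append(label))
for pkt in ("a", "b", "c"):
    fc.egress(pkt)

in_order = fc.ingress(0)  # label 0 returns as expected
skipped = fc.ingress(2)   # label 1 is missing, so labels 1 and 2 are re-sent
```

After the out-of-sequence return, `sent` holds `[0, 1, 2, 1, 2]`: the original three packets followed by the re-sent pair. The held copies are released only when their labels come back in order, which is what lets the flow controller recover without involving the host.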
Based on the ASIC approach in the flow control mechanism, fail over recovery is in the nanosecond range. A network security accelerator, like PaxcelNet's multi-gigabit security subsystem, PAX2500, thus gives a host system a risk-free and worry-free VPN and SSL acceleration upgrade and provides a near zero latency fail over mechanism that lets the host system deliver high availability.
The ideal result is to handle high availability with a zero latency fail over rate. However, the overall network processing and switching power, especially that of the control processor, should be sufficient to meet the requirements. In today's blade-based architecture for network systems and various application servers, it is not too difficult to have all the compatible components operate at the same level to deliver the best performance.
Syed Saulat Hussain is Vice President, Marketing, and Glen Young is Director, Product Line, at PaxcelNet, Inc., Fremont, Calif.



