Design Article
Clusters and IP make SANs go round
Hans Rauscher
9/5/2003 10:51 AM EDT
Storage-area networks are now firmly established in enterprise IT. The demand for availability, performance and data protection at reduced IT cost is higher than ever. How can fiber-based clusters and Internet Protocol storage specifications help to meet such requirements?
The huge potential of fiber can be better leveraged with IP-based storage protocols like iSCSI, both within local SANs and across wide-area networks. Clustering storage appliances with virtualization moves SANs one step closer to an ideal enterprise storage tank. With the right software platform, development of SAN components can be brought back under control, even in a multiprotocol, multi-interface world.
Today's SANs typically rely on Fibre Channel (FC) as the technology to connect servers and storage. Despite its maturity, FC has limitations that keep SANs from becoming the ideal enterprise storage tank.
To keep up with data growth, storage needs to exchange data with servers faster than it did yesterday. Fibre Channel's 2-Gbit/second speed delivers almost 200 Mbytes/s of data. This lets you copy the content of a full CD-ROM between two FC devices within 3.2 seconds. On the other side of the network fence, IEEE has released its 802.3ae specification, which allows Ethernet frames to travel at 10 Gbits/s over the same fiber. And IEEE has an active working group on the next technology step, be it 40 or even 100 Gbits/s. To leverage Ethernet's bandwidth advantage, IP-based storage protocols are needed.
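The copy-time figure above is simple arithmetic, and the same calculation can be repeated for 10-Gbit Ethernet. The following is a minimal sketch, not a benchmark: it assumes a 650-Mbyte CD-ROM and treats the Ethernet link as delivering its raw line rate with no protocol overhead, so real transfers would be somewhat slower.

```python
def transfer_seconds(size_mbytes: float, rate_mbytes_per_s: float) -> float:
    """Time to move size_mbytes at a sustained payload rate."""
    return size_mbytes / rate_mbytes_per_s

CD_ROM_MBYTES = 650.0               # assumption: capacity of a full CD-ROM
FC_2G_MBYTES_PER_S = 200.0          # the article's figure for 2-Gbit/s Fibre Channel
ETH_10G_MBYTES_PER_S = 10_000 / 8   # assumption: raw 10-Gbit/s line rate, no overhead

fc_time = transfer_seconds(CD_ROM_MBYTES, FC_2G_MBYTES_PER_S)
eth_time = transfer_seconds(CD_ROM_MBYTES, ETH_10G_MBYTES_PER_S)

print(f"2-Gbit/s Fibre Channel: {fc_time:.2f} s")
print(f"10-Gbit/s Ethernet:     {eth_time:.2f} s")
```

Under these assumptions the Fibre Channel copy takes a little over three seconds, in line with the figure quoted above, and the Ethernet copy takes roughly half a second.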
How far away is safe enough?
FC uses both copper and fiber as its physical transport media. Over optical fiber, FC is limited by the standard to 10 kilometers. That distance between two data centers may be large enough to survive local damage, but it is not remote enough for earthquakes or larger power outages like this year's Aug. 14 blackout in parts of the Northeast and Canada. Help is available from special solutions that reach up to 40 km, or from dense wavelength division multiplexing (DWDM), which extends the 10 km by a factor of 12 to 120 km. Beyond that, IP-based storage protocols are needed to overcome the distance limitation. In addition, storage service quality needs to be relaxed from synchronous to asynchronous updates so that distance does not slow down storage write performance.
Fibre Channel over TCP/IP (FCIP) is one of three IP storage protocols. FCIP allows Fibre Channel to use IP backbones for connecting SANs by encapsulating FC frames in TCP segments. Those segments can be sent over any medium that runs IP, and over larger distances than native FC. At the other end of the network tunnel, the TCP header is stripped off and the FC frame is forwarded as if it had originated within that FC SAN. FCIP requires some setup effort to define and establish the TCP/IP tunnels between the FC SANs. It relies on TCP/IP and FC for data-loss detection and recovery as well as for error detection. Typically, FCIP links FC SANs in different data centers together for asynchronous backup, mirroring and remote vaulting.
Like FCIP, Internet Fibre Channel Protocol (iFCP) is an evolutionary step for Fibre Channel. iFCP uses only the topmost layer (FC-4) from FC and replaces FC-0 to FC-3 with Ethernet and TCP/IP. A SAN gateway terminates the FC connection and converts it into a TCP/IP session. The receiving SAN gateway takes the FC-4 data and initiates an FC session. An iFCP session can be established between devices, between SANs, and between a device and a SAN. iFCP is typically used to replace FC switches with Ethernet/IP switches in SANs and to leverage the company's IP backbone network.
The Internet Engineering Task Force IP Storage (IETF ips) working group has published draft standards for FCIP and iFCP. Both drafts expired earlier this year, while ips continues to work on them on the standards track. Despite that, prestandard FCIP and iFCP solutions are already on the market. Internet Small Computer System Interface (iSCSI), on the other hand, already has an official IETF document, RFC 3347, which sets out its requirements and design considerations.
iSCSI uses SCSI as a device protocol and sends it via TCP/IP between servers (iSCSI initiators) and storage devices (iSCSI targets). This is similar to FC, although iSCSI and FC-4 are not interchangeable. To address a storage device, iSCSI uses a qualified name that resembles a Web address. An Internet Storage Name Service (iSNS) server resolves that name to an IP address once and returns that information to the iSCSI initiator. Any further communication for that session between the iSCSI initiator and target is based on the IP address instead of iSCSI names. This resolving mechanism provides great flexibility for maintenance and restructuring of iSCSI storage, and it allows for simple but effective load balancing between iSCSI targets. iSCSI addresses the mass market and promises lower prices than Fibre Channel for SAN components. That allows SANs to be used in medium-size enterprises with smaller IT budgets, a market not well-addressed by FC in the past.
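Why distance forces the move to asynchronous updates can be shown with a rough propagation-delay estimate. The sketch below assumes that light travels through fiber at about 200,000 km/s (roughly two-thirds of its speed in a vacuum) and counts propagation delay only; switch, protocol and disk latency would come on top.

```python
FIBER_KM_PER_S = 200_000.0  # assumption: signal speed in glass, about 2/3 of c

def round_trip_ms(distance_km: float) -> float:
    """Propagation delay for a write to reach the remote site and be acknowledged."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000.0

# Each synchronous write has to wait at least this long for the remote acknowledgment.
for km in (10, 40, 120, 1000):
    print(f"{km:>5} km: {round_trip_ms(km):6.2f} ms per synchronous write")
```

With these numbers, a 120-km DWDM link adds a little more than a millisecond to every synchronous write, while a 1,000-km link adds about 10 ms, which is why longer distances are usually bridged with asynchronous updates.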
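The tunneling idea can be illustrated with a toy example. Because TCP delivers a byte stream rather than discrete messages, a tunnel has to mark frame boundaries so the far end can hand back exactly the frames that were sent. The sketch below uses a hypothetical 4-byte length prefix for that purpose; it is not the real FCIP encapsulation header, and the function names are illustrative only.

```python
import struct

def encapsulate(fc_frame: bytes) -> bytes:
    """Prefix a frame with a 4-byte big-endian length (hypothetical tunnel header)."""
    return struct.pack("!I", len(fc_frame)) + fc_frame

def decapsulate(stream: bytes) -> list:
    """Strip the tunnel headers and recover the original frames, in order."""
    frames, offset = [], 0
    while offset < len(stream):
        (length,) = struct.unpack_from("!I", stream, offset)
        offset += 4
        frames.append(stream[offset:offset + length])
        offset += length
    return frames

# Placeholder payloads stand in for FC frames; the tunnel never inspects them.
sent = [b"frame-one", b"frame-two-longer"]
tunnel_bytes = b"".join(encapsulate(f) for f in sent)  # what would travel over TCP
received = decapsulate(tunnel_bytes)

print(received == sent)
```

The point of the design is that the payload passes through untouched, so the receiving SAN sees the frames as if they had been generated locally.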
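The resolve-once behavior and the load balancing it enables can be sketched in a few lines. This is a toy model, not the iSNS protocol: the NameService and Initiator classes, the target name and the addresses (taken from the documentation range 192.0.2.0/24) are all assumptions made for illustration, and round-robin is just one possible balancing policy.

```python
from itertools import cycle

class NameService:
    """Toy stand-in for an iSNS server: maps a target name to one of its addresses."""
    def __init__(self):
        self._portals = {}

    def register(self, name, addresses):
        # Hand out the registered addresses in round-robin order.
        self._portals[name] = cycle(addresses)

    def resolve(self, name):
        return next(self._portals[name])

class Initiator:
    """Resolves a target name once, then reuses the IP address for the session."""
    def __init__(self, name_service):
        self._ns = name_service
        self._session_ip = {}

    def login(self, target_name):
        if target_name not in self._session_ip:
            self._session_ip[target_name] = self._ns.resolve(target_name)
        return self._session_ip[target_name]

TARGET = "iqn.2003-09.com.example:storage.pool1"  # hypothetical qualified name
ns = NameService()
ns.register(TARGET, ["192.0.2.10", "192.0.2.11"])

server_a, server_b = Initiator(ns), Initiator(ns)
print(server_a.login(TARGET))  # first address
print(server_b.login(TARGET))  # second address, so the load is spread
print(server_a.login(TARGET))  # same session, no second lookup
```

Because initiators only know the name, the storage behind it can be moved or extended by updating the name service, without touching the servers.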
Like a rainbow
Despite their different approaches, all three storage protocols rely on TCP/IP. This allows them to use any IP network. With today's "everything over IP, IP over everything" mantra, IP can be run on very different link layers. In LANs, Ethernet at Gigabit speed is the de facto standard. For connections within a metropolitan area like the San Francisco Bay area or New York, service providers offer a choice of medium- to high-speed connections. Frame Relay and ATM are both common offerings, although they are being replaced by IPv4, MPLS and Ethernet. Dark fiber, DWDM and Sonet/SDH are also found in metro area networks (MANs), although they are more common in wide area networks (WANs).
All of these connections use fiber as their physical transport medium. With the latest developments in fiber technology, neither bandwidth nor distance is a technical limitation any longer. Each single fiber can transport a multitude of different wavelengths (also called "colors"), each providing a bandwidth between 2.5 and 40 Gbits/s. Long-haul lasers can send wavelengths across several hundred km and up to 4,000 km without regeneration. Top-notch optical systems provide 160 or 320 wavelengths with 10 Gbits/s each, up to 2,000 km. This lets providers make efficient use of their dark fiber networks, leading to reduced prices and enabling more customers to use high-speed connections, at least in theory.
Depending on bandwidth and distance needs, a customer may find one connection method more appealing than the others. For FC users, dark fiber or DWDM may be a good choice to connect their SAN islands with the native FC protocol. For any of the IP storage protocols mentioned above, IPv4, Ethernet or MPLS may be a good choice. A 10-Gbit Ethernet connection has an OC-192c-compatible WAN option, making it a preferred selection for Sonet/SDH customers. In the end, budget and the availability of skilled IT personnel need to be taken into account to make the right decision for an affordable and appropriate MAN or WAN connection type.
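To put the wavelength counts in perspective, the aggregate capacity of a single fiber is simply the number of wavelengths multiplied by the rate of each. The short calculation below uses only the figures quoted above and ignores any framing or protection overhead.

```python
def fiber_capacity_tbits(wavelengths: int, gbits_per_wavelength: float) -> float:
    """Aggregate capacity of one fiber in Tbits/s."""
    return wavelengths * gbits_per_wavelength / 1000.0

for count in (160, 320):
    capacity = fiber_capacity_tbits(count, 10.0)
    print(f"{count} wavelengths x 10 Gbits/s = {capacity:.1f} Tbits/s per fiber")
```

That works out to 1.6 and 3.2 Tbits/s per fiber, several hundred times the capacity of a single 2-Gbit/s Fibre Channel link.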
Approaching 100 percent
Independent of the connection type, an ideal SAN is an enterprise-wide data storage tank, 99.999 percent available, that automatically reconfigures to changing application storage needs. It allows data storage consolidation across the company and storage virtualization across different storage devices.
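What "five nines" means in practice becomes clearer when the percentage is converted into permitted downtime. The sketch below assumes a 365-day year and counts all outages, planned or unplanned, against the budget.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year

def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum yearly downtime that still meets the availability target."""
    return (100.0 - availability_percent) / 100.0 * MINUTES_PER_YEAR

for target in (99.9, 99.99, 99.999):
    print(f"{target}% available: {downtime_minutes_per_year(target):8.2f} minutes/year")
```

At 99.999 percent, the entire storage service may be unavailable for little more than five minutes per year, which leaves no room for manual failover.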
Cluster systems provide a good platform to meet such requirements and have a proven track record in providing highly available services with load-balancing features. It's common to have a server cluster providing an application service that is transparent for the users.
In SAN environments, clustering is the holy grail of the data center. All that is needed is to ensure that there is no single point of failure, plus some virtualization of the data storage. Virtualization has been done inside disks for the last 15 years by abstracting from physical storage blocks and using logical storage blocks instead. Within file systems, virtualization has been done for the last 30 years. Storage subsystems like RAID won't work without abstraction or virtualization of physical storage.
Different storage devices like FC appliances or iSCSI targets need to be part of a virtual storage pool that's transparent for the host OS and its file system. This would allow for easier management and better storage utilization.
A virtualization appliance does all the abstraction work by mapping virtual to physical storage. Depending on the position of this appliance within the SAN, the approach is called in-band or out-of-band virtualization. In-band virtualization means that the virtualization appliance resides within the data path between storage and host. With out-of-band virtualization, the appliance is connected to the SAN like another host/server. In-band is transparent to the servers and therefore more or less plug-and-play. Out-of-band requires client software to be installed on the hosts/servers. There is no general recommendation for one or the other, since each has its own advantages and disadvantages.
The next evolutionary step in storage networking will be intelligent storage switches. They are true integrators, offering different interfaces like native FC, Ethernet and SCSI, and a variety of storage protocols such as FC, FCIP, iFCP and iSCSI. For IBM mainframe environments, Escon and Ficon interfaces are a requirement too. That allows different server and storage devices to be interconnected. Intelligent storage switches also run management software to control storage devices and off-load specific tasks like remote copies or backups from servers. Next-generation storage switches also integrate MAN/WAN interfaces to remove additional network devices and simplify both storage network concepts and management.
Across all the trends in storage-area networking, one simple statement holds true: SANs are no longer a homogeneous, closed environment. With that, development of any SAN component becomes increasingly complex. Take a host bus adapter (HBA) as an example. An HBA sits inside a server and connects it to the SAN. FC HBAs are based on an ASIC design without much room for changes or software. With any of the IP-based storage protocols, a full-blown TCP/IP stack needs to run, as well as the specific storage protocol on top of it.
To deliver gigabit wire-speed performance while the server runs an application or database, the TCP/IP stack needs to be off-loaded from the server CPU to the HBA. Such an HBA is called a TCP/IP off-load engine (TOE). Besides highly skilled engineers, making off-loading work requires the right development tools. A proven RTOS along with efficient development tools, like Wind River's VxWorks and Tornado IDE, helps developers concentrate on the code that differentiates the product from the competition.
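The mapping work itself can be pictured as a lookup table that concatenates extents from different devices into one virtual volume. The following is a minimal sketch under stated assumptions: the VirtualVolume class, the device names and the extent sizes are hypothetical, and a real appliance would add caching, striping, redundancy and persistence of the map.

```python
from bisect import bisect_right

class VirtualVolume:
    """Concatenates extents from several devices into one virtual block range."""
    def __init__(self):
        self._starts = []    # first virtual block of each extent, in ascending order
        self._extents = []   # (device, first physical block) for each extent
        self._size = 0       # total number of virtual blocks

    def add_extent(self, device, physical_start, length):
        self._starts.append(self._size)
        self._extents.append((device, physical_start))
        self._size += length

    def locate(self, virtual_block):
        """Translate a virtual block number into (device, physical block)."""
        if not 0 <= virtual_block < self._size:
            raise IndexError("virtual block out of range")
        i = bisect_right(self._starts, virtual_block) - 1
        device, physical_start = self._extents[i]
        return device, physical_start + (virtual_block - self._starts[i])

vol = VirtualVolume()
vol.add_extent("fc-array-0", physical_start=5000, length=1000)      # hypothetical FC device
vol.add_extent("iscsi-target-1", physical_start=0, length=2000)     # hypothetical iSCSI target

print(vol.locate(999))    # last block that still lives on the FC array
print(vol.locate(1000))   # first block served by the iSCSI target
```

The host addresses only virtual block numbers. Whether this lookup happens in the data path (in-band) or is handed to client software on the host (out-of-band) is the distinction described above.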
Hans J. Rauscher is System Architect for Networking at Wind River Systems Inc. (Alameda, Calif.). He has more than 12 years' experience as a consultant, project manager and freelance author in the IT and communication sectors.



