Building IP-centric media data centers

Mar 1, 2010 12:00 PM, By Luc Andries

Priority flow control-enabled storage clusters eliminate traffic interference and support 100 percent efficiency.

    
Figure 1. Oversubscription in a media storage environment


As broadcasters transition to file-based media production, large disk-based storage systems are becoming the fundamental media service of the production architecture. However, media traffic presents much more rigorous throughput requirements than classical IT solutions. Storage components must handle gigabyte-size files, large chunks of data in one I/O (typically up to 4MB) and continuous streams of traffic bursts over the storage network.

To increase throughput, media storage solutions distribute, or stripe, data over several distinct storage systems. Because every server needs parallel access to every storage system, media storage often relies on storage cluster concepts typically used in high-performance computing (HPC). These clusters employ a large number of devices, leading to complex storage network architectures. However, while HPC clusters typically exchange mostly small messages, media networks are continuously loaded to full capacity, leading to network congestion and sustained oversubscription of the switch ports.
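As a rough illustration of striping, the following Python sketch maps a byte offset in a file to a storage system, assuming simple round-robin placement. The 4MB stripe unit mirrors the I/O size mentioned above. The function names and the round-robin layout are assumptions made for illustration, not the placement policy of any particular file system.

```python
STRIPE_UNIT = 4 * 1024 * 1024  # assumed stripe unit, chosen to match the 4MB I/O size


def locate(offset, num_systems, stripe_unit=STRIPE_UNIT):
    """Map a file byte offset to (storage system index, byte offset on that system)."""
    stripe_index = offset // stripe_unit          # which stripe the byte falls in
    system = stripe_index % num_systems           # round-robin over the storage systems
    local_stripe = stripe_index // num_systems    # stripes already placed on that system
    return system, local_stripe * stripe_unit + offset % stripe_unit


def systems_touched(offset, length, num_systems, stripe_unit=STRIPE_UNIT):
    """Return the sorted storage system indexes that one contiguous read touches."""
    first = offset // stripe_unit
    last = (offset + length - 1) // stripe_unit
    return sorted({stripe % num_systems for stripe in range(first, last + 1)})


if __name__ == "__main__":
    # A 16MB read against four storage systems.
    print(systems_touched(0, 16 * 1024 * 1024, 4))
```

Under these assumptions, a single 16MB read on a four-system stripe touches all four systems, which is why each server needs a path to every storage system.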

These circumstances present a significant challenge: How can network engineers design a scalable storage network that can sustain the continuous throughput required by file-based media production while maintaining high efficiency and network use? As this article describes, the most significant barrier is traffic interference. Previously, VRT-medialab demonstrated that a storage cluster architecture employing Cisco's Data Center Bridging (DCB) technology and the PAUSE frame mechanism defined in the IEEE 802.3x standard can achieve higher link bandwidth use and scalability than traditional InfiniBand (IB) solutions. (See Broadcast Engineering, January 2010.) However, the fundamental impediment of traffic interference remains for both 802.3x- and IB-based clusters.

Our laboratory sought to address this with priority flow control (PFC). We performed a series of comparative tests between 802.3x- and PFC-enabled storage clusters. Ultimately, we found that PFC eliminates traffic interference and supports a highly scalable storage network that sustains 100-percent efficiency.

Media storage architectures

Figure 2. 802.3x DCB-based WARP cluster architecture


Because media storage systems stripe data over several storage systems, every server needs parallel access to every storage system. Like classical IT storage area networks (SANs), most first-generation media file systems use a single Fibre Channel (FC) storage network to connect every file server node with every storage controller. This leads to a complex network topology that is ill-suited for media environments. VRT-medialab demonstrated that under sustained media storage traffic loads, the long traffic bursts interfere with each other in the switch buffers and create severe efficiency loss. (See Figure 1.)

As shown, when multiple sources deliver long bursts of traffic to the same destination, throughput of the source links is limited by the bandwidth of the aggregating link. However, when a second destination requests data from the same source storage controller (see purple traffic flow in Figure 1), the second destination server does not receive the full bandwidth available at the shared source link. Because the switch port buffers are filled with “blue” traffic, the purple flow can only pass a data frame every time a blue packet is read by the left destination server — a problem exacerbated by the fact that the left destination is reading from four sources simultaneously. Traffic interference occurs, and traffic flow to the second destination slows. Extrapolating this effect to larger media storage network topologies, efficiency severely deteriorates, limiting the scalability of any FC-based media storage environment.
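The interference described above can be reproduced with a toy model of one shared switch buffer. The Python sketch below is a simplification under stated assumptions: time advances in ticks of one frame each, the source alternates blue and purple frames, the blue destination accepts a frame from this source only every fourth tick (because it reads from four sources), and the purple destination could accept a frame on every tick. The buffer size and the function name are hypothetical.

```python
from collections import deque


def simulate_shared_fifo(ticks, buffer_frames=8, blue_period=4):
    """Count frames delivered per flow when both flows share one FIFO buffer."""
    queue = deque()
    next_color = "blue"
    delivered = {"blue": 0, "purple": 0}
    for tick in range(ticks):
        # Source link: inject one frame per tick while the buffer has room,
        # alternating between the two flows.
        if len(queue) < buffer_frames:
            queue.append(next_color)
            next_color = "purple" if next_color == "blue" else "blue"
        # Switch: only the frame at the head of the FIFO may be forwarded.
        if queue:
            blue_ready = tick % blue_period == 0   # blue reader serves this source
            if queue[0] == "purple" or blue_ready:
                delivered[queue.popleft()] += 1
    return delivered


if __name__ == "__main__":
    print(simulate_shared_fifo(4000))
```

In this model the purple flow advances only as fast as the blue flow, one frame in four ticks, even though its own destination is idle the rest of the time. Every purple frame has to wait behind a blue frame at the head of the buffer.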

DCB-based WARP cluster network

These limitations can be partially overcome by splitting the storage network into two separate networks. (See Figure 2.) This can be accomplished using IBM's General Parallel File System (GPFS) and a Workhorse Application Raw Power (WARP) media storage cluster consisting of storage cluster nodes and network-attached cluster nodes (NAN). This architecture has a much simpler topology.

DCB transport is well-suited as the cluster network for this type of media storage architecture. DCB allows flows to be tightly controlled and load-balanced over the links and uses the 802.3x PAUSE mechanism to provide link-level flow control similar to FC, creating a “lossless” environment. The result is a notable improvement in scalability and link bandwidth use compared with FC or even IB; however, the fundamental effects of traffic interference remain. (See Figure 3.)

As shown, when multiple NAN nodes read traffic from the storage nodes, each storage node responds with large bursts of media traffic toward each requesting NAN node (depicted as different colors). At the converged network adapter (CNA) network interface of the storage node, the bursts are queued in the network interface buffer. These frames are sent to the switch (shown here as a Cisco Nexus 5000), where they end up in a single ingress queue buffer. Because 802.3x PAUSE link flow control is configured, the link sends a PAUSE frame to the storage node once the high threshold of the buffer is reached, thereby avoiding frame loss.

In this example, three different NAN nodes are reading frames out of this buffer and also from the other storage nodes. This limits the total reading bandwidth on this port to only 75 percent of the incoming traffic throughput. Hence, the buffer fills up, and the PAUSE mechanism kicks in. If, because of the bursty nature of the traffic per flow, the filling of the switch port buffer is not equally distributed over the three different “colors,” one of the colors (or traffic flows) can be depleted by the simultaneously reading NAN nodes before the buffer reaches its low threshold and unpauses the link. When this happens, no frame from the depleted color is available, resulting in a “read-miss” of the NAN node and a drop in efficiency. The issue continues until the link is unpaused and frames of the missing color are again provided out of the network interface queue of the storage node. This efficiency loss can cause significant performance degradation in the network. Fortunately, there is a solution to this dilemma.
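The depletion effect can be sketched with a small deterministic simulation. This is a toy model, not a reproduction of the laboratory setup: the storage node sends one frame per tick unless paused, frames arrive in bursts of a single color, each of the three NAN nodes tries to read one frame of its own color every four ticks (75 percent of the incoming rate in total), and the link pauses at a high threshold and resumes at a low threshold. The threshold values, burst lengths and function name are assumptions chosen for illustration.

```python
from collections import Counter

COLORS = ("red", "green", "blue")  # one traffic flow per reading NAN node


def read_efficiency(burst_len, ticks=12000, high=48, low=16, read_period=4):
    """Return the share of read attempts that found a frame of their color."""
    buffered = Counter()   # frames per color in the switch ingress buffer
    paused = False         # True while the storage node is held by a PAUSE
    sent = 0               # frames sent so far, used to pick the burst color
    hits = attempts = 0
    for tick in range(ticks):
        # Storage node: one frame per tick unless paused, in single-color bursts.
        if not paused:
            color = COLORS[(sent // burst_len) % len(COLORS)]
            buffered[color] += 1
            sent += 1
        # NAN nodes: each reads one frame of its own color every read_period ticks.
        if tick % read_period == 0:
            for color in COLORS:
                attempts += 1
                if buffered[color] > 0:
                    buffered[color] -= 1
                    hits += 1          # otherwise this attempt is a read-miss
        # Link-level flow control with hysteresis between the two thresholds.
        total = sum(buffered.values())
        if total >= high:
            paused = True
        elif total <= low:
            paused = False
    return hits / attempts


if __name__ == "__main__":
    for burst in (1, 300):
        print(burst, round(read_efficiency(burst), 3))
```

With perfectly interleaved colors (a burst length of 1) the readers almost never miss in this model. With bursts much longer than the buffer, the paused link keeps the next color stuck in the storage node's queue while the readers of that color keep missing, and efficiency drops sharply.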

Priority flow control

DCB provides another, more advanced flow control mechanism: PFC. IEEE 802.1Q defines a tag that contains a three-bit priority field, allowing engineers to assign priorities to different Layer 2 traffic flows. With PFC, the network can be configured to pause traffic labeled with a specific priority (or “p-value”) independent of the other traffic. The mechanism works the same way as 802.3x PAUSE but selectively, per traffic class, instead of pausing the whole link at once. Effectively, each traffic class gains its own independent buffers and pause mechanism.
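To make the priority field concrete, the Python sketch below packs the three-bit priority into an 802.1Q tag and builds the body of a per-priority pause message. The layouts follow the IEEE 802.1Q tag format (TPID 0x8100, three-bit priority, one-bit drop-eligible indicator, 12-bit VLAN ID) and the MAC Control format used for flow control (EtherType 0x8808, opcode 0x0001 for 802.3x PAUSE and 0x0101 for PFC) as commonly documented. The function names are hypothetical, and the code is an illustration rather than the configuration used in the tests.

```python
import struct

TPID_8021Q = 0x8100             # EtherType value that marks an 802.1Q tag
ETHERTYPE_MAC_CONTROL = 0x8808  # EtherType carrying PAUSE and PFC messages
OPCODE_PAUSE = 0x0001           # 802.3x: pause the whole link
OPCODE_PFC = 0x0101             # PFC: pause selected priorities only


def pack_vlan_tag(priority, vlan_id, dei=0):
    """Build the 4-byte 802.1Q tag for a frame with the given priority."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in three bits")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = (priority << 13) | ((dei & 1) << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def unpack_priority(tag):
    """Read the three-bit priority back out of a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci >> 13


def pfc_body(pause_quanta):
    """Build a PFC message body: opcode, enable vector, eight pause timers.

    pause_quanta maps a priority (0-7) to a pause time in quanta.
    """
    enable_vector = 0
    timers = [0] * 8
    for priority, quanta in pause_quanta.items():
        enable_vector |= 1 << priority   # one enable bit per priority
        timers[priority] = quanta
    return struct.pack("!HH8H", OPCODE_PFC, enable_vector, *timers)


if __name__ == "__main__":
    tag = pack_vlan_tag(priority=3, vlan_id=100)
    print(tag.hex(), unpack_priority(tag))
    print(pfc_body({3: 0xFFFF}).hex())
```

The difference from 802.3x PAUSE is visible in the message body: instead of a single pause timer for the link, the message carries an enable bit and a timer for each of the eight priorities, so only the selected traffic class stops.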



