Flash storage ensures reliability
Nov 1, 2010 12:00 PM, By Stephane Jauroyou
With solid-state drives, broadcasters get both high performance and high reliability.
Figure 1. A central production store minimizes file transfer operations, especially if content is edited in place on the storage system.
Disk reliability has long been an issue for broadcasters. When disk failures occur in play-to-air servers — the most critical part of the on-air infrastructure — it can mean going “black to air” for millions of viewers. Even with the redundancy of mirrored or parity-protected configurations, broadcast engineers must still wait for disks to rebuild while hoping the rest of the system stays intact. Now, in a new study by Carnegie Mellon University, “Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?,” researchers Bianca Schroeder and Garth A. Gibson confirm what broadcasters have suspected: Disk drives fail at rates six times higher than those reported by vendors.
Things will only get worse with HD. Moving from 15Mb/s SD to 50Mb/s HD, the same TV show takes more than three times the storage capacity. With roughly three times as many disks, a disk-based server is roughly three times as likely to suffer a disk failure.
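The scaling argument above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the bitrates come from the article, while the per-drive failure rate and the SD drive count are assumed placeholder values, and drives are assumed to be identical and to fail independently.

```python
# Bitrates are from the article; everything marked "assumed" is illustrative.
SD_MBPS = 15
HD_MBPS = 50

def storage_ratio(sd_mbps: float, hd_mbps: float) -> float:
    """Storage for the same running time scales with the bitrate."""
    return hd_mbps / sd_mbps

def expected_failures_per_year(afr: float, n_drives: int) -> float:
    """Expected drive failures per year for identical, independent drives."""
    return afr * n_drives

ASSUMED_AFR = 0.03       # assumed: 3 percent annual failure rate per drive
ASSUMED_SD_DRIVES = 12   # assumed: drive count of a hypothetical SD server

ratio = storage_ratio(SD_MBPS, HD_MBPS)
# Same capacity per drive, so the drive count grows with the storage ratio.
hd_drives = round(ASSUMED_SD_DRIVES * ratio)

sd_failures = expected_failures_per_year(ASSUMED_AFR, ASSUMED_SD_DRIVES)
hd_failures = expected_failures_per_year(ASSUMED_AFR, hd_drives)

print(f"HD/SD storage ratio: {ratio:.2f}")
print(f"Drives: {ASSUMED_SD_DRIVES} (SD) vs {hd_drives} (HD)")
print(f"Expected failures per year: {sd_failures:.2f} (SD) vs {hd_failures:.2f} (HD)")
```

Whatever failure rate is assumed, the expected number of failures grows in step with the drive count, which is the point of the paragraph above.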
To reduce failure rates, vendors have tried replacing disk with solid-state solutions, i.e., devices with no moving parts. Moving parts are where almost all disk failures occur. Flash is an obvious candidate because, like disk, it holds data after the power goes off. Flash also consumes less power than disk, produces less heat and is quiet by comparison.
This is the logic behind flash-assisted storage technology: server clustering in which flash modules replace disk drives on all play-to-air servers. Because data is striped over all nodes in a cluster and over all modules in a node, both I/O performance and reliability are extremely high. The solution's managed reads and writes optimize performance and avoid write hot spots that can exhaust flash prematurely. Flash-assisted storage is also economical because it combines a disk-based nearline cluster for ingest with a flash-based play-to-air cluster for high availability.
Disk's inherent risk
Broadcasters' demands for a better way to store content follow years of dissatisfaction with what many see as high disk failure rates. The Carnegie Mellon University study has confirmed that disks do suffer high failure rates.
Using vendor return merchandise authorization (RMA) data, the researchers measured actual disk failure rates in the field against two key benchmarks: annual failure rate (AFR) and mean time to failure (MTTF). Among the findings:
Disk AFRs typically exceed 1 percent, with 2 to 4 percent common and up to 13 percent on some systems.
Field replacement rates of systems are significantly higher than expected based on data sheet MTTFs (by two to 10 times for drives less than five years old).
The rate at which disk drives fail rises steadily throughout their lifetimes, starting as early as the second year of operation, rather than holding flat as is widely expected.
By comparison, the AFR for flash drives (also from RMA data) is just 0.04 percent, about 100 times lower than a disk AFR of 4 percent. Broadcasters should therefore expect to replace flash drives far less often than disks. The total risk of a system failure caused by a failed drive is approximately the risk of one drive failing multiplied by the number of drives. In other words, it would take on the order of 100 flash drives to present a combined risk equal to that of just one disk drive.
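The "risk times drive count" rule used here is a linear approximation. The following sketch compares it with the exact probability that at least one of several independent drives fails in a year. The flash AFR is the article's figure; the disk AFR of 4 percent is taken from the top of the "common" range quoted earlier, and independence of failures is an assumption.

```python
FLASH_AFR = 0.0004   # 0.04 percent, from the article
DISK_AFR = 0.04      # 4 percent, top of the common range cited in the article

def combined_afr_linear(per_drive_afr: float, n_drives: int) -> float:
    """The article's approximation: per-drive risk times number of drives."""
    return per_drive_afr * n_drives

def combined_afr_exact(per_drive_afr: float, n_drives: int) -> float:
    """Probability that at least one of n independent drives fails in a year."""
    return 1.0 - (1.0 - per_drive_afr) ** n_drives

# Number of flash drives whose combined (linear) risk equals one disk drive.
flash_drives_for_parity = DISK_AFR / FLASH_AFR

print(f"Flash drives matching one disk: {flash_drives_for_parity:.0f}")
print(f"Linear estimate, 100 flash drives: {combined_afr_linear(FLASH_AFR, 100):.4%}")
print(f"Exact value, 100 flash drives:     {combined_afr_exact(FLASH_AFR, 100):.4%}")
```

For failure rates this small the two calculations stay close, which is why the simple multiplication is a reasonable shortcut.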
Some risk-mitigation strategies, such as disk mirroring (RAID 1) and parity-based rebuilding (RAID 5/6), address the problem simply by adding more disks. These strategies do not address the disk's underlying inherent risk. One result is that a server rebuilding a failed disk runs in a “degraded state,” where a second disk failure may take out the entire on-air operation. Another is that the strategy itself may not work, particularly because a RAID rebuild assumes 100 percent data integrity on all remaining “good” disks in the server, which is not always a safe assumption. And even when these strategies do work, there is no compensating value (like faster I/O) to offset the cost of the extra disks.
High failure rates pose a significant challenge in the SD environment, and even more so as broadcasters move to HD. HD requires three to 10 times as many disk drives as SD to provide HD bandwidth and store the same number of program hours. The likelihood of a play-to-air storage failure will increase in proportion to the number of drives added and the age of the drives. If one server has an AFR of 25 percent, then a mirrored configuration's AFR is slightly less than 1 percent, assuming a 48-hour mean time to repair. To achieve the five-nines availability (99.999 percent) broadcasters expect, the AFR must be less than 0.25 percent, a near impossibility in light of recent research.
Flash's inherent risk
The advantage of flash drives is that they start with a much lower AFR per unit, less than 0.04 percent. Table 1 shows how that low flash failure rate translates to a low annual cluster failure rate (0.23 percent), even without RAID 5 protection on the chassis level. Clusters ranging from three to nine nodes show five-nines reliability, with nine nodes being the worst case. A nine-node cluster holds about 10TB of data on 24 flash memory cards per node, each of which holds 64GB.
The total failure rate of the cluster equals the sum of the subsystem failure rates. For each subsystem, that rate equals the failure rate of each component multiplied by the number of those components (minus any redundancies). The subsystems within each node are the motherboard (one), GigE I/O cards (three), flash memory drives (24), power supplies (1 + 1 redundant) and fans (5 + 1 redundant). Component failure data is based on either RMA data or MTTF reported by vendors.
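The summation described above can be sketched as follows. Only the component counts and the flash AFR come from the article. Every other failure rate is a hypothetical placeholder, so the result does not reproduce Table 1. "Minus any redundancies" is read literally here (spare units are left out of the count), which is one possible interpretation rather than the author's confirmed method.

```python
from dataclasses import dataclass

@dataclass
class Subsystem:
    name: str
    afr: float       # annual failure rate per component, as a fraction
    required: int    # units needed for normal operation
    spares: int = 0  # redundant units, excluded from the sum

def node_failure_rate(subsystems: list[Subsystem]) -> float:
    """Sum of (component AFR x required count) over all subsystems."""
    return sum(s.afr * s.required for s in subsystems)

NODE = [
    Subsystem("motherboard", 0.002, 1),              # AFR hypothetical
    Subsystem("GigE I/O card", 0.001, 3),            # AFR hypothetical
    Subsystem("flash drive", 0.0004, 24),            # AFR from the article
    Subsystem("power supply", 0.005, 1, spares=1),   # AFR hypothetical
    Subsystem("fan", 0.002, 5, spares=1),            # AFR hypothetical
]

for s in NODE:
    print(f"{s.name:14s} {s.afr * s.required:.4%}")
print(f"{'node total':14s} {node_failure_rate(NODE):.4%}")
```

With real RMA or MTTF figures substituted for the placeholders, the same structure gives a per-node rate that can then be combined across the cluster.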
In an example of a flash-equipped broadcast server, availability of each server node is 99.873 percent. However, reliability of the worst-case, nine-node cluster as a whole is still 99.999 percent — a difference that is directly attributable to the use of flash memory. (See Table 1 on page 14.)
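To put the two availability figures in perspective, they can be converted to permitted downtime per year. This is a simple illustration that assumes a 365-day year and treats availability as the fraction of time the system is up.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

def downtime_minutes_per_year(availability_percent: float) -> float:
    """Yearly downtime implied by an availability percentage."""
    return (1.0 - availability_percent / 100.0) * MINUTES_PER_YEAR

node_downtime = downtime_minutes_per_year(99.873)     # single node, from the article
cluster_downtime = downtime_minutes_per_year(99.999)  # nine-node cluster, from the article

print(f"Single node: {node_downtime / 60:.1f} hours per year")
print(f"Cluster:     {cluster_downtime:.2f} minutes per year")
```

The gap between roughly 11 hours and roughly 5 minutes a year is the practical meaning of the difference between the two percentages.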
Clustering for flash-assisted storage
Flash-assisted storage is based on clustering, a technology proven in TV operations since 1996, except that flash memory drives replace disks. The key insight is to carefully manage reads and writes to each flash drive so its performance is optimized. Media content data are striped contiguously in two ways: across all drives in a server and also across all servers arrayed as nodes in a cluster. All content has equal and parallel access to I/O, so high ingest bandwidth is achieved while maintaining full playout performance.
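The two-level striping described above can be pictured with a small sketch. The round-robin placement rule and the function name `place_block` are hypothetical. The article does not specify the actual layout, and parity placement is left out entirely.

```python
from collections import Counter

def place_block(block_index: int, n_nodes: int, drives_per_node: int) -> tuple[int, int]:
    """Hypothetical round-robin placement: rotate over nodes first, then drives."""
    node = block_index % n_nodes
    drive = (block_index // n_nodes) % drives_per_node
    return node, drive

N_NODES = 9           # worst-case cluster size in the article
DRIVES_PER_NODE = 24  # flash drives per node in the article

# Write five full stripes and count how many blocks land on each drive.
n_blocks = N_NODES * DRIVES_PER_NODE * 5
load = Counter(place_block(i, N_NODES, DRIVES_PER_NODE) for i in range(n_blocks))

print(f"Drives used: {len(load)}")
print(f"Blocks per drive: min {min(load.values())}, max {max(load.values())}")
```

Because consecutive blocks of a file land on different nodes and drives, every drive carries the same share of the data, so a read or write of any clip is spread across the whole cluster.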
With up to 24 flash memory drives of 64GB in each node, a nine-node flash cluster can scale up to more than 500 hours of HD content at 35Mb/s XDCAM video (or 50Mb/s video and audio combined).
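The capacity figure can be sanity-checked with a rough calculation. The sketch assumes decimal units (1GB = 1,000Mb x 8), no file system overhead, and that N + 1 redundancy costs the equivalent of one node's capacity. These are simplifying assumptions, not details given in the article.

```python
def usable_capacity_gb(nodes: int, drives_per_node: int, drive_gb: float,
                       redundant_nodes: int = 1) -> float:
    """Raw capacity minus the share assumed to be spent on N + 1 redundancy."""
    raw_gb = nodes * drives_per_node * drive_gb
    return raw_gb * (nodes - redundant_nodes) / nodes

def hours_of_content(capacity_gb: float, bitrate_mbps: float) -> float:
    """Hours of material that fit at a constant bitrate (decimal units)."""
    gb_per_hour = bitrate_mbps * 3600 / 8 / 1000
    return capacity_gb / gb_per_hour

usable = usable_capacity_gb(nodes=9, drives_per_node=24, drive_gb=64)
hours = hours_of_content(usable, bitrate_mbps=50)  # video and audio combined

print(f"Usable capacity: {usable:.0f} GB")
print(f"Hours at 50Mb/s: {hours:.0f}")
```

Under these assumptions the result lands above the 500-hour mark quoted in the text.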
Single-copy flash memory storage is shared among all the nodes in a cluster with N + 1 redundancy, which avoids costly mirroring. And because all data, including parity data, is evenly distributed, no dedicated parity drives are needed. Service continuity is protected even if a node fails, or during in-service maintenance, hot-swapping of drives, system upgrades or installation of additional base nodes.
Clustering also solves the problem of flash write hot spots, which occur when writes repeatedly hit the same flash memory cells and cause them to degrade quickly. Clustering eliminates write hot spots by evenly load-balancing writes across all flash modules, so the same memory location may see only a few writes per day, if that. This extends the memory's lifetime to more than 10 years, versus the typical five years for hard disk drives.
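A rough estimate shows why evenly spread writes keep the per-location write count low. The number of ingest channels is a hypothetical value chosen for illustration. The bitrate and cluster size follow the article, and perfectly even load balancing is assumed.

```python
def daily_ingest_gb(channels: int, bitrate_mbps: float) -> float:
    """Data written per day by continuously recording channels (decimal GB)."""
    seconds_per_day = 24 * 3600
    return channels * bitrate_mbps * seconds_per_day / 8 / 1000

def writes_per_location_per_day(daily_write_gb: float, total_capacity_gb: float) -> float:
    """Average rewrites of each location per day under even load balancing."""
    return daily_write_gb / total_capacity_gb

ASSUMED_CHANNELS = 10           # hypothetical: ten channels ingesting around the clock
CLUSTER_GB = 9 * 24 * 64        # nine nodes, 24 drives each, 64GB per drive

ingest = daily_ingest_gb(ASSUMED_CHANNELS, bitrate_mbps=50)
rewrites = writes_per_location_per_day(ingest, CLUSTER_GB)

print(f"Daily ingest: {ingest:.0f} GB")
print(f"Average writes per location per day: {rewrites:.2f}")
```

Even with continuous ingest on every assumed channel, each location is rewritten less than once a day on average, which is consistent with the "few writes per day, if that" described above.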