Achieving high availability for video programming
Aug 1, 2009 12:00 PM, By Jim Metzler
Designing and deploying high-availability systems begins with selecting components that have been tested and offer proven reliability.
How to benefit from high-availability measurements
How can service providers achieve high availability (99.999 percent) on video services? The first step is to measure video programs 24/7 at key areas across the live system so per-program statistics can be collected. Next, compartmentalize availability into three key areas:
- Program availability out of the headend or any video origin point;
- Program availability across the wide area distribution system to each hub or drop site; and
- Program availability on the last mile network, post QAM or DSLAM devices.
As an example, suppose that over one day a single program, ESPN, sees the following impairments:
- The headend encoder has three seconds of audio dropouts across the day;
- The core network drops five packets in five separate seconds across the day; and
- The QAM drops a video PID for two seconds across the day.
The program availability for ESPN, as seen that day by a customer at the end of the delivery chain, is: PA = ((86,400 - (3 + 5 + 2)) / 86,400) * 100 = 99.988 percent, where 86,400 is the number of seconds in a day. Ten seconds of impairment is enough to drop the program availability to three nines. The SCTE draft “Recommended Practice for Monitoring” suggests that a program should have no more than six (HD) to 24 (SD) seconds of errors in a 24-hour period. With 10 errored seconds, this example would meet the 24-second SD criterion, though not the six-second HD figure.
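The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function name `program_availability` and the constant `SECONDS_PER_DAY` are hypothetical, and the sketch assumes impairments are counted in whole errored seconds over a 24-hour window, as in the ESPN example.

```python
SECONDS_PER_DAY = 86_400  # 24 hours * 60 minutes * 60 seconds


def program_availability(errored_seconds, window_seconds=SECONDS_PER_DAY):
    """Return program availability as a percentage of the measurement window."""
    if not 0 <= errored_seconds <= window_seconds:
        raise ValueError("errored_seconds must lie within the window")
    return (window_seconds - errored_seconds) / window_seconds * 100


# ESPN example: 3 s (headend) + 5 s (core network) + 2 s (QAM)
pa = program_availability(3 + 5 + 2)
print(f"{pa:.4f} percent")  # 99.9884 percent
```

The article quotes the same result to three decimal places, as 99.988 percent.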
Service providers need to measure for this so they know what the availability of their service is. Many service providers have no idea how good or how bad their systems are. Frequently, OPEX is spent trying to improve system quality with no feedback mechanism as to how good the results are. Without compartmentalizing the measurement, there is no way to know which systems need improvement. Take the ESPN example:
- The headend's program availability is 99.996 percent, and the issue is fixed by looking at audio in the encoder. (The specific fault isolation is key to improving systems, a benefit of simultaneous, live measurement.)
- The core network's program availability is 99.994 percent.
- The QAM's program availability is 99.997 percent.
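The per-segment figures in the ESPN example can be reproduced with a short Python sketch. The names `availability_pct`, `truncate` and `segment_errors` are hypothetical. The sketch assumes the percentages are truncated to three decimal places, since that is what matches the figures quoted above; rounding would instead give 99.997 percent for the headend and 99.998 percent for the QAM.

```python
import math

SECONDS_PER_DAY = 86_400


def availability_pct(errored_seconds, window_seconds=SECONDS_PER_DAY):
    """Availability as a percentage of the measurement window."""
    return (window_seconds - errored_seconds) / window_seconds * 100


def truncate(value, decimals=3):
    """Drop digits beyond `decimals` places without rounding up."""
    factor = 10 ** decimals
    return math.floor(value * factor) / factor


# Errored seconds per segment, taken from the ESPN example
segment_errors = {"headend": 3, "core network": 5, "QAM": 2}

per_segment = {
    name: truncate(availability_pct(seconds))
    for name, seconds in segment_errors.items()
}
worst = min(per_segment, key=per_segment.get)

for name, pct in per_segment.items():
    print(f"{name}: {pct} percent")
print(f"lowest availability: {worst}")
```

Keeping one figure per segment is what makes it possible to see that, in this example, the core network is the segment most in need of attention.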
Clearly, the program availability figure is cumulative: headend errors add to those of the network, both add to those of the QAM, and so on down the last mile into the home network and the STB. (Errors that occur during the same second in multiple systems due to the same cause are not counted twice.)
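One way to model this accumulation, including the rule that a shared second is counted only once, is to record each segment's errored seconds as a set and take the union. The sketch below is an assumption about how such a tally could be kept, not a description of any particular monitoring product. The function name and the timestamps are hypothetical, and it simplifies the rule by merging any errors that fall in the same second, whatever their cause.

```python
SECONDS_PER_DAY = 86_400


def cumulative_availability(*segments, window_seconds=SECONDS_PER_DAY):
    """Availability after merging errored seconds from every segment.

    Each segment is a set of second-of-day indices (0..86399) in which that
    segment saw an error. A set union counts a shared second only once.
    """
    errored = set().union(*segments)
    return (window_seconds - len(errored)) / window_seconds * 100


# Hypothetical timestamps, chosen only for illustration
headend = {100, 101, 102}                  # 3 s of audio dropouts
core = {5000, 9000, 20000, 40000, 60000}   # 5 separate seconds with packet loss
qam = {70000, 70001}                       # 2 s with a missing video PID

print(round(cumulative_availability(headend, core, qam), 4))  # 99.9884

# If the QAM errors fell in seconds the headend had already lost,
# only 8 distinct seconds would be impaired.
qam_overlapping = {100, 101}
print(round(cumulative_availability(headend, core, qam_overlapping), 4))  # 99.9907
```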
Consider another simple example: A single router at a headend serving 250,000 homes and carrying 300 live video programs loses a link, and the router takes 25 seconds to restore it. The program availability for all 300 programs would be 99.971 percent, assuming there are no other errors for the rest of the day.
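The router example can be checked the same way. In this minimal sketch the program identifiers are hypothetical, and it assumes every one of the 300 programs loses exactly the same 25 seconds and nothing else that day.

```python
SECONDS_PER_DAY = 86_400
OUTAGE_SECONDS = 25      # time the router takes to restore the link
PROGRAM_COUNT = 300      # live programs carried over that link

# Hypothetical program identifiers; every program loses the same 25 seconds.
errored_seconds = {
    f"program-{n:03d}": OUTAGE_SECONDS for n in range(1, PROGRAM_COUNT + 1)
}

availability = {
    name: (SECONDS_PER_DAY - lost) / SECONDS_PER_DAY * 100
    for name, lost in errored_seconds.items()
}

print(f"{availability['program-001']:.3f} percent")  # 99.971 percent
print(len(availability), "programs affected")
print(PROGRAM_COUNT * OUTAGE_SECONDS, "program-seconds lost")  # 7500
```

A single 25-second event therefore shows up in the daily report of every program on the link, which is why per-program measurement makes a shared upstream fault easy to spot.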
Delivering high availability begins before program delivery
High availability in every component of a live video system is critical to delivering high-availability end-to-end service. Considering how many devices are in a system, how many software updates occur across the year and how many new services are being rolled out, how can a service provider be expected to deliver 99.999 percent availability, or any other high availability figure, much less make improvements, without accurate measurements?
Service providers are not alone in needing these measurements. Equipment manufacturers, including encoder vendors, VOD vendors and router vendors, need to verify availability in a way that accounts for the complexity of video. For example, some router manufacturers do not even test with live video at the volumes seen at the service provider. The first time some of these systems see thousands of live video programs in 10Gb loads is after they have been deployed at the service provider. Manufacturers of such equipment must be sure the test beds in their QA test labs, engineering labs, and manufacturing and test labs include data and voice test loads, as well as live video in realistic volume. They must also ensure that availability is measured in the same way service providers will measure it when monitoring deployed systems. They can then produce program availability reports as part of the handoff of equipment and software from the OEM to the service provider. If the equipment manufacturer can only achieve two nines (99.000 percent) or three nines (99.900 percent) under long-term, normal operation with configurations that mimic service provider networks, then the service provider will at least know that this is the best it can do with the system being deployed. The provider will also know what is causing the loss in availability, so it can decide in advance how to handle issues before customers call in with quality complaints and drive up OPEX. Or, given the test results, the provider may simply choose to deploy a device better suited for video service delivery.
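To put the two-nines and three-nines figures in the same units as the errored-second counts used earlier, each availability target can be converted into a daily error budget. The sketch below is illustrative only; the function name `daily_error_budget` is hypothetical, and it assumes a 24-hour measurement window.

```python
SECONDS_PER_DAY = 86_400


def daily_error_budget(nines):
    """Errored seconds per day allowed at the given number of nines.

    Two nines means 99 percent, three nines 99.9 percent, and so on.
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return SECONDS_PER_DAY / 10 ** nines


for nines in (2, 3, 4, 5):
    target = 100 - 100 / 10 ** nines
    budget = daily_error_budget(nines)
    print(f"{nines} nines ({target:.3f} percent): {budget:.3f} s/day")
```

On these assumptions, two nines allows 864 errored seconds a day, three nines 86.4 seconds, and five nines less than one second (0.864).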
If availability tests are completed in a lab environment, the results may be better than those from a field-deployed system, which is subject to more unpredictable sources, physical interconnect stresses, environmental stresses such as temperature and humidity, power line transients, and human errors. As a result, the acceptance criteria applied to these tests may be set somewhat higher than might otherwise be considered.
Of course, other tests should also be executed to complement these baseline tests before final equipment selection and deployment. Such tests would typically include, but are not limited to, the intended load levels; number, type, and speed of active ports; level and type of nonvideo converged traffic expected; forwarding protocols; and management protocols, as well as common voice and data tests that are expected in the operational environment.
Conclusion
Service providers are deploying continuous, real-time, per-program quality assurance solutions for their IP distribution systems, which has created the need for video device vendors to upgrade their testing suites. Delivering high-availability systems that handle the unique requirements of video payloads depends on each component of the system being up to the task. In many cases, however, today's components have not been verified as able to deliver the high availability needed, which dooms providers' goals before they get started. Service providers under increasing competitive pressure to improve quality need new equipment to be tested for program availability with real video loads and common operational impairments. Designing and deploying a high-availability delivery system begins with the selection of components that have tested and proven high availability.
Jim Metzler is an independent telecom consultant for Metzler and Associates.