Plan with a STEADY hand

Oct 1, 2006 12:00 PM, BY MICHAEL WRIGHT AND BRIAN REDMOND

Even a small change to one piece of equipment could ignite a chain reaction down the system line.

    

Ready to go? Not quite!

Figure 1. This diagram of a sample system includes ingest and decoder ports, class editors, newsroom browser and edit stations (both typically low-resolution proxies), newsroom automation systems, metadata database server, proxy server and storage, and SAN. All elements in a system must be considered before any upgrade or change is made.

Now that we've peeled back the layers, it's clear that this simple upgrade touches critical components across the core of your system. If any part of the upgrade runs into a problem, it's possible that the database or even the entire online content could be corrupted. No wonder the chief engineer and operations managers are losing sleep!

Minimizing risk is no simple task, but it is far easier than recovering from a disaster such as a total wipeout of the database or online content. The first step is to make a prioritized list of all the processes affected by the upgrade. That makes it easier to ensure that processes that must be changed before others are scheduled at the right point in the upgrade.

Second, further break down the list by detailing the step-by-step task for each of the processes. Make sure that each task within each process is scheduled where it needs to be in the upgrade schedule.

Let's assume that every product category within the broadcast network requires an upgrade. Therefore, every device in the system needs to be placed on the task list. It's important to keep in mind that even though the upgrade process is the same for like devices, each unit needs to be in the task list to ensure proper time allocation.

Even in a relatively simple upgrade, there is a lot to do, and performing the upgrade will mean taking the entire system offline for some period of time. Typically, this requires working through the night when the system workload is light. And, of course, the system absolutely has to be back up and running reliably by the predetermined deadline.
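As a rough illustration of why each unit belongs on the list, the sketch below expands device counts into one task per unit and compares the total against the overnight window. The device counts, per-unit minutes and eight-hour window are invented for the example, and it assumes the units are upgraded one after another by a single crew.

```python
# Hypothetical device counts and per-unit upgrade times (minutes).
# None of these figures come from a real facility.
devices = {
    "ingest port": (4, 20),
    "decoder": (6, 15),
    "edit station": (10, 25),
}
window_minutes = 8 * 60  # assumed overnight maintenance window


def per_unit_tasks(devices):
    """Expand each device type into one task entry per physical unit."""
    tasks = []
    for kind, (count, minutes) in devices.items():
        for n in range(1, count + 1):
            tasks.append((f"upgrade {kind} {n}", minutes))
    return tasks


tasks = per_unit_tasks(devices)
total_minutes = sum(minutes for _, minutes in tasks)
slack_minutes = window_minutes - total_minutes

print(f"{len(tasks)} tasks, {total_minutes} min planned, {slack_minutes} min slack")
```

A negative slack figure would mean the work cannot fit in one night as planned, before any time for backup, testing or rollback has even been added.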

Now that there is a prioritized task list of every step, it is time to take another look for potential risks throughout the upgrade process. Can the list be further broken down into essential tasks? If so, continue to revise the task list until every step is clearly defined and in the right order.
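One way to check that every step is in the right order is to record, for each task, the tasks that must finish first and let a topological sort produce a valid sequence. The sketch below uses Python's standard graphlib module (Python 3.9 or later); the task names and dependencies are hypothetical and stand in for a real task list.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks; each maps to the set of tasks that must finish first.
dependencies = {
    "back up metadata database": set(),
    "take system offline": {"back up metadata database"},
    "upgrade SAN server": {"take system offline"},
    "verify SAN server": {"upgrade SAN server"},
    "upgrade ingest port 1": {"take system offline"},
    "upgrade ingest port 2": {"take system offline"},
    "end-to-end test": {
        "verify SAN server",
        "upgrade ingest port 1",
        "upgrade ingest port 2",
    },
}


def schedule(deps):
    """Return one valid order of tasks.

    Raises graphlib.CycleError if the dependencies are circular,
    which signals a mistake in the task list itself.
    """
    return list(TopologicalSorter(deps).static_order())


order = schedule(dependencies)
for step, task in enumerate(order, start=1):
    print(f"{step}. {task}")
```

The sort only guarantees that prerequisites come first. Where several orders are valid, the choice among them is still a planning decision.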

The point of no return

Now, it's time to go back through the list, identify the critical points and add tasks for test and verification along the way. It's also important to add contingencies to handle potential issues that may be revealed through testing and verification. Finally, evaluate the entire process and identify the point of no return: the point at which you must decide whether to halt or continue with the upgrade.

The plan also needs to include time to back up essential data such as the metadata database. If there's even a possibility that other data, such as online storage, might be corrupted during the process, build time into the process for backup of that data, too. Backup needs to be scheduled and completed immediately prior to taking the system offline. It is far better — and less disruptive — to back up data than to deal with the loss of key data after a system fails.

The shakedown

As part of contingency planning, allow time for a worst-case scenario that would require porting the backup data back onto the system. Above all, build in test cycles at all appropriate points in the process so that you do not proceed to the next interdependent step or task until you are sure that the new component is doing its job properly.

It won't be known for sure whether the entire system will work properly until all the interdependent upgrades are complete. However, testing and verifying along the way offers more assurance that the upgrade will come together and require only clean-up tweaking to operate correctly.

One last point

Because there are multiple system components, such as editors, ingest ports and decoders, save time by upgrading and testing several of these before bringing up the core of the system. If you don't find any issues after testing a few of each of the peripheral devices, proceed with upgrading the remaining components and move forward with core system testing.

Earlier, we established a point of no return. This is a very important milestone in the upgrade process, particularly if trouble happens along the way. For example, what happens if, after adding the required RAM upgrade for the SAN server, the server will not boot? What if the added RAM does not pair well with the RAM already in the server? The first thought is to scramble for additional RAM, but the clock is ticking away. Is there time?

If, while trying to troubleshoot this issue, the predetermined point of no return arrives, it's time to revert to the original configuration and schedule a new time for the upgrade — with the needed RAM ready and at hand. Do not make the mistake of going forward. To do so could spell disaster for your system — missed deadlines or, even worse, a crippled, inoperable system.
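The revert-or-continue rule can be sketched in a few lines. The example assumes, as one plausible way to set the milestone, that the point of no return is the back-online deadline minus the time needed to roll back and retest the original configuration. The function names, times and durations are all hypothetical.

```python
from datetime import datetime, timedelta


def point_of_no_return(deadline, rollback, retest):
    """Latest moment a rollback can start and still finish by the deadline.

    Assumption: the milestone is derived from the deadline by subtracting
    the rollback and retest durations.
    """
    return deadline - rollback - retest


def decide(now, ponr, blocker_open):
    """Return 'continue', 'troubleshoot' or 'revert' for the current moment."""
    if not blocker_open:
        return "continue"
    # A blocking problem is open: keep working only while time remains.
    return "troubleshoot" if now < ponr else "revert"


# Hypothetical night: system must be back by 05:00, rollback takes 90 minutes,
# retesting the original configuration takes 30 minutes.
deadline = datetime(2006, 10, 1, 5, 0)
ponr = point_of_no_return(deadline, timedelta(minutes=90), timedelta(minutes=30))

at_0215 = decide(datetime(2006, 10, 1, 2, 15), ponr, blocker_open=True)
at_0300 = decide(datetime(2006, 10, 1, 3, 0), ponr, blocker_open=True)
print(ponr.time(), at_0215, at_0300)
```

With these figures, a server that still will not boot at 03:00 means reverting, no matter how close the fix seems.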

While the balance of the peripheral equipment is being upgraded, it's time to start a series of end-to-end tests and the shakedown of the core system components. Once every component upgrade has been tested individually, it is a good practice to perform a complete system worst-case load test to ensure that the upgrade process hasn't introduced restrictions in your system's capacity.

If the system performs as it should, the next step is to hand the system to operations to test again. If no major issues are revealed at this point, the system can be handed back over and put back online.

During the test cycles, confirm that the issues listed as fixed in the new release really have been fixed. If not, record discrepancies and report them so they can be addressed in a future release. Often, as an issue is resolved in a new release, it reveals other issues — hopefully issues that are less critical than the ones the release corrected.

It bears repeating that careful planning for each step in the upgrade transition is really the only way to proceed to protect the systems and to make the transition as trouble-free as possible. So, be careful, plan wisely, and mitigate risk by seeking help from experts who have been through the process before.


Michael Wright is president of IT Broadcast Solutions Group, and Brian Redmond is vice president of Broadcast Consulting Services.



