The video on demand (VOD) environment is one of the most complex networks in cable’s portfolio of services, in some ways being even more technically challenging than our voice over Internet protocol (VoIP) service.
VOD networks are not only complex to start with; they are also very dynamic. They must be modified to accommodate heavy increases in content, a growing base of active VOD users that demands capacity management throughout the environment, and the relatively new focus on high definition (HD) content, which takes three to four times the storage and transport throughput of standard definition (SD).
All these challenges are more than worth the effort because VOD is a significant revenue generator, whether used transactionally or for subscription VOD (SVOD), and is a truly compelling differentiator from cable’s direct broadcast satellite (DBS) competitors.
VOD differs from other data storage and transport systems in operators’ networks. In addition to the obvious engineering and operational support tasks of network planning, implementation, and operations, the industry has to pay attention to the quality and accuracy of the "data": video content, associated metadata, business rules for content use, ingestion, deletion and more. These areas received far less attention when VOD was a nascent technology.
In this discussion, we will look at both the operational needs of VOD networks and also the requirements for video content maintenance to ensure optimal customer experience and business results from our VOD platforms.
Overview
Figure 1 depicts the key elements of a typical VOD network, both hardware and software. "Typical" is emphasized because there are literally dozens if not hundreds of VOD permutations, but we will cover the core elements common in most environments.
• Pitcher: System (series of storage drives with networking and data management systems) that uploads content from the content provider (such as TVN, CMC, or inDemand) and delivers the content to the operator’s VOD complex, typically through a satellite link. Some content providers and operators are moving to fiber-optic transport as national backbones and regional rings become more prevalent.
• Catcher: System (large hard drive, hundreds of gigabytes with associated management and networking software) that receives "pitched" content and stores it until it can be ingested by the VOD system.
• VOD storage array/"server": Series of arrayed drives that store content. Capacity is typically measured in "storage hours" instead of disk space, with "hours" based on the operator’s expected mix of HD and SD content. As a point of reference, an MPEG-2 movie will need between 10 and 20 GB of storage depending on length, compression efficiency, and bandwidth cap. With most systems now having thousands of hours of content, you start with terabytes of storage and go from there. The system also includes very high speed networking outputs to be able to deliver hundreds or thousands of streams simultaneously.
• VOD back office: Control system including management rules for storage, file processing, scheduler, and a business rule engine that instructs the system on how to process and manage the content.
• Asset management system (AMS): Software that tracks the thousands of titles through their life-cycle on the system – ingest, storage, viewing window, display rules, data to set-top guide/carousel, viewing window closure, and content deletion.
• Video pump: Edge device that delivers aggregated streams to the network to be transported to the quadrature amplitude modulation (QAM) modulator for injection into the HFC plant.
Network performance
Many areas need to be carefully designed in the VOD network planning stages, and most if not all will need some surveillance and "care and feeding" through the operational cycle of the network.
Some of the key items in the ingest and storage areas include: ingest capacity and utilization to the catcher, catcher space utilization with thresholding to ensure it does not fill up, the input/output (I/O) from the catcher into the VOD storage array, storage array space, memory and central processing unit (CPU) utilization of the complex, and simultaneous stream output capacity and utilization (particularly important as more HD content is delivered).
On the "network-facing" side of the VOD complex, one needs to carefully design and monitor service group capacity (a group of QAM channels feeding one or more nodes); congestion/contention of service groups (this can vary widely within a particular system based on network design, demographics and new content launches); VOD network availability (uptime) exclusive of HFC interruptions; stream success ratio; planning and execution of change management for growth, maintenance, or repair; and problems induced by customer premises equipment (CPE), notably non-responding set-top boxes (a key area of concern as you get closer to launching switched digital video, SDV).
With all these items, and, in fact, many more, how does one prioritize the key performance indices that most accurately reflect the customer experience, and ones that clearly point to areas on which to focus to optimize performance? Keeping it simple, the following three are good places to start.
VOD network availability: This metric is the foundation of the customer’s experience. If the network is not up, nothing can happen. No portal, no carousel, no guide, no viewing, nada. Fortunately, as technology has progressed, system stability and redundancy have improved, and as telemetry has allowed deeper supervision of VOD network components, outages have become fewer and briefer.
One key area that continues to trouble the uptime of the platform and the experience of the customer is network upgrades, whether for hardware or software, that often take longer than the brief few hours of the maintenance window. The cable operator community needs to continue to work with our VOD vendors to simplify upgrades so disruptions are minimized.
From a maintenance scheduling perspective, check the utilization data by hour to determine when the optimum time is to perform an upgrade or maintenance. VOD utilization typically continues at high levels later into the night as compared to normal video or high-speed Internet. So, for example, if a 30-minute outage must occur for a change, consider starting it later in the maintenance window, such as 3 a.m. instead of 1 a.m. But as always, allow plenty of contingency time for a longer duration or a backout of the procedure if necessary.
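The scheduling advice above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hourly stream counts are invented, utilization is tracked per whole hour rather than per 30 minutes, and the function name and the two-hour contingency reserve are hypothetical choices, not values from any VOD vendor tool.

```python
def pick_start_hour(streams_by_hour, window_hours, work_hours=1, contingency_hours=2):
    """Pick the quietest start hour that still leaves contingency time.

    streams_by_hour: dict of hour -> average active streams (hypothetical data).
    window_hours: consecutive hours of the maintenance window, in order.
    """
    # Latest usable start index: work plus contingency must fit in the window.
    last_index = len(window_hours) - work_hours - contingency_hours
    if last_index < 0:
        raise ValueError("window too short for work plus contingency")

    best = None  # (start_hour, streams_disrupted)
    for i in range(last_index + 1):
        affected = window_hours[i:i + work_hours]
        load = sum(streams_by_hour[h] for h in affected)
        if best is None or load < best[1]:
            best = (affected[0], load)
    return best[0]


# Hypothetical average active streams per hour (1 = 1 a.m.).
streams = {1: 310, 2: 180, 3: 90, 4: 70, 5: 110}

# Window runs 1 a.m. to 6 a.m.; one hour of work, two hours held in reserve.
print(pick_start_hour(streams, window_hours=[1, 2, 3, 4, 5]))  # prints 3
```

With these made-up numbers the 4 a.m. hour is actually the quietest, but starting then would leave no room for a backout, so the sketch settles on 3 a.m.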
To calculate uptime, in addition to the engineering terminology of percent availability, I strongly recommend reporting the average downtime in minutes/month/customer as a metric that anyone from a customer service representative (CSR) to a general manager can relate to. A possible starting point is 99.95 percent availability, or about 22 minutes of average downtime per customer per month, depending on the performance and design of your network. More stable networks can target 10 minutes per month or less, but over a year, much better than that is tough because of an almost certain need for a multi-hour outage for some sort of upgrade or maintenance. If you have multiple sites, compare their performance and find out where your best practices and areas of opportunity are.
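The conversion between percent availability and minutes of downtime is simple arithmetic, sketched below in Python. The 30-day month, the function names, and the sample outage figures are assumptions made for illustration; the per-customer average here weights each outage by the number of customers it affected, which is one reasonable way to define the metric.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200, assuming a 30-day month


def downtime_minutes(availability_pct, minutes_in_period=MINUTES_PER_MONTH):
    """Minutes of downtime implied by a percent availability figure."""
    return (1 - availability_pct / 100) * minutes_in_period


def availability_pct(downtime_min, minutes_in_period=MINUTES_PER_MONTH):
    """Percent availability implied by minutes of downtime."""
    return 100 * (1 - downtime_min / minutes_in_period)


def avg_downtime_per_customer(outages, total_customers):
    """Average downtime in minutes per customer for the period.

    outages: list of (duration_minutes, customers_affected) tuples.
    """
    customer_minutes = sum(duration * affected for duration, affected in outages)
    return customer_minutes / total_customers


print(round(downtime_minutes(99.95), 1))   # 21.6 minutes per month
print(round(availability_pct(10), 3))      # 99.977 percent for 10 minutes per month

# Hypothetical month: a 30-minute outage hitting 12,000 customers and a
# 120-minute outage hitting 2,000, on a system with 50,000 VOD customers.
print(avg_downtime_per_customer([(30, 12000), (120, 2000)], 50000))  # 12.0
```

The same arithmetic shows why beating 10 minutes per month over a full year is hard: a single two-hour outage affecting every customer once a year already averages 120 / 12 = 10 minutes per month.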
Service group congestion: Unlike the high-speed data network, when you run out of capacity at the entrance to the HFC network – the QAM resources in the service group – you don’t get a relatively gentle slowdown or tiling or some other impairment. You get a big, fat "Network Busy" error, possibly prompting a customer phone call or expensive truck roll. It certainly frustrates customers and reduces their trust in the performance and reliability of your network.
VOD service groups typically have very high peak-to-average use ratios, with a three- or four-hour peak happening during prime time and much lower use the rest of the day. I have seen some folks report "average" congestion over their viewing day. Don’t do that! Average congestion gives a much more optimistic view of the customer experience than the reality.
A more realistic view is how often certain congestion thresholds are hit, say at the 50 percent, 70 percent, and 85 percent points. If you see frequent hits at 50 percent, it’s time to start aggressively watching for utilization upticks. If you get hits at 70 percent, it’s time to augment the network through a service group split, adding QAM resources to the service group, or launching a denser modulation scheme if you are not already maxed out with 256-QAM. If you hit 85 percent, it’s red alert time because one or two more concurrent streams will give your customers a "Network Busy" error.
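A minimal Python sketch of threshold counting follows, contrasted with the misleading daily average. The utilization samples are invented, the function names are hypothetical, and the sketch simplifies "frequent hits" to "any hit"; a real report would likely require a minimum number of hits per week before escalating.

```python
THRESHOLDS = (50, 70, 85)  # percent of service group capacity, from the text


def threshold_hits(utilization_samples, thresholds=THRESHOLDS):
    """Count the samples that meet or exceed each threshold."""
    return {t: sum(1 for u in utilization_samples if u >= t) for t in thresholds}


def recommended_action(hits):
    """Map hit counts to the escalation steps described in the text."""
    if hits.get(85, 0):
        return "red alert: customers are close to a Network Busy error"
    if hits.get(70, 0):
        return "augment: split the service group or add QAM resources"
    if hits.get(50, 0):
        return "watch: monitor closely for utilization upticks"
    return "no action"


# Hypothetical utilization samples (percent) for one service group over a day.
samples = [12, 18, 35, 52, 74, 88, 91, 66, 40, 20]

hits = threshold_hits(samples)
print(sum(samples) / len(samples))  # 49.6, the "average" looks comfortable
print(hits)                         # {50: 5, 70: 3, 85: 2}
print(recommended_action(hits))     # red alert: ...
```

In this made-up data set the daily average sits just under 50 percent, while two samples are already past the 85 percent red-alert point.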
VOD stream success ratio: This metric is probably the most useful of the three in defining the customer experience because it includes failures from outages or congestion as well as other networking errors. To accurately reflect the real experiences of your customers, it should be all-inclusive and should not attempt to remove failures precipitated by outages, whether planned or unplanned, or "average" errors over multiple service groups and platforms. In fact, the more granular, the better, with nirvana being the ability to report stream success ratio by service group by hour or half-hour. However, the VOD analytical tools available usually only aggregate to a server over a much longer period of time.
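Where per-attempt records can be exported, the granular view might be computed along these lines. This is a sketch only: the record layout (service group, hour, success flag), the service group names, and the function name are assumptions, and real VOD analytics tools report failures in their own formats.

```python
from collections import defaultdict


def success_ratio_by_group_hour(attempts):
    """Return {(service_group, hour): success percent}.

    attempts: iterable of (service_group, hour, succeeded) tuples.
    Every attempt counts, including those that failed during outages.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [successes, attempts]
    for group, hour, succeeded in attempts:
        counts = totals[(group, hour)]
        counts[1] += 1
        if succeeded:
            counts[0] += 1
    return {key: 100 * ok / n for key, (ok, n) in totals.items()}


# Hypothetical stream attempts during the 8 p.m. hour.
attempts = (
    [("SG-A", 20, True)] * 48 + [("SG-A", 20, False)] * 2
    + [("SG-B", 20, True)] * 50
)

for key, ratio in sorted(success_ratio_by_group_hour(attempts).items()):
    print(key, ratio)
# ('SG-A', 20) 96.0
# ('SG-B', 20) 100.0
```

Lumped together, the two hypothetical service groups would report 98 percent, which hides the fact that every failure came from one of them.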
So what is a possible target? Around 96-98 percent is a reasonable target range, depending on the design of the VOD complex, size of service groups, condition of the HFC plant, percentage of non-responding set-tops, and the specific method of failure-reporting captured by the VOD analytics tool. Well-designed and operated systems should have no difficulty hitting 98 percent in normal performance, only occasionally dropping below that when an extended outage occurs because of either a system failure or downtime from maintenance activity. Just as with the service group congestion, set thresholds at a couple of levels so that you can tell when marginal parts of the network are about to go sub-standard.
Performance scorecard
Once you have baselined all your VOD systems, or compared their performance with those in another network, it’s easier to tell what’s going on. What really needs attention, and what is performing pretty well?
Some of you may recall the approach from my network health discussion in CT’s December 2006 article about creating a "scorecard" that uses well-known medical condition terminology to help target the worst performing areas. Figure 2 is just an example, and you probably have similar reports or scorecards within the vernacular of your organization that you can build from.
The important elements are to "threshold" each key metric, weight the metrics according to network performance and desired targets, and then mathematically manipulate the metrics into a single "figure of merit" that provides a quick overview of the health of that part of the network. Conditional indicators that display increasingly severe colors as network health deteriorates are also helpful. Green, don’t worry about it; red, call out the firefighters.
VOD analytics
So where does all this performance data come from? In some cases, it may be built into the VOD back office or available as an additional software/hardware package.
In other cases, you will need to deploy a third-party tool that will need edge-server capacity of some sort; perhaps you have underutilized servers for other operational support system (OSS) purposes. A couple of systems have built their own tools using data from simple network management protocol (SNMP) outputs of VOD equipment. However you get the performance data, it is crucial that you do get it, lest your customers become your OSS – not a good option in these days of high customer expectations and highly competitive markets.
Content accuracy
As noted earlier, at any given moment there may be 10,000 or more pieces of content ready for viewing on a current VOD server. I recently heard of one cable operator that is moving to as many as 30,000, with the less frequently viewed titles stored centrally. While this number is daunting, over a month there may be as many as two or three times the "immediately viewable" titles that cycle through the network. Some are loaded for days, some for weeks, some for months, and some maybe just for a night or two.
As these titles are transported through the network, sometimes errors occur: Titles are left off the "pitch" list by the content provider, errors over the satellite link cause file corruption, the catcher is full so content is missed, metadata is erroneous or not attached so the content cannot be tied into the business rules/asset management systems, and so forth.
What do you do? Perhaps you have a fully automated content reconciliation system that checks every piece of content every day against the "load" list, and if something is missing, the system works without intervention to download it before the viewing window opens. Not many – if any – of us are so fortunate as to have something like this.
More likely, some or all of this effort, if it is being done at all, is very manual and brute-force. For example, you might get a "should be loaded" spreadsheet from the content provider and a "what is loaded" spreadsheet from your asset management system. Then you "stare and compare," or if you’re lucky, one of your gearheads wrote a macro to spit out exceptions. But no matter how it happens, you end up having to send emails or make phone calls to get content re-pitched, perhaps having to pay re-pitch fees unless you can prove the "miss" was the content provider’s fault.
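The "stare and compare" step is essentially a set difference, and a short Python script can stand in for the macro. The sketch below assumes both lists can be exported as CSV with `asset_id` and `title` columns; those column names, the asset IDs, and the titles are all hypothetical placeholders, not a content provider’s or asset management system’s actual format.

```python
import csv
import io


def read_assets(csv_text):
    """Map asset_id -> title from CSV text with 'asset_id' and 'title' columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["asset_id"]: row["title"] for row in reader}


def reconcile(expected, loaded):
    """Return (missing, unexpected) asset IDs, each as a sorted list."""
    missing = sorted(set(expected) - set(loaded))      # should be loaded, is not
    unexpected = sorted(set(loaded) - set(expected))   # loaded, not on the list
    return missing, unexpected


# Hypothetical "should be loaded" export from the content provider.
SHOULD_BE_LOADED = """asset_id,title
A1001,Title One
A1002,Title Two
A1003,Title Three
"""

# Hypothetical "what is loaded" export from the asset management system.
WHAT_IS_LOADED = """asset_id,title
A1001,Title One
A1003,Title Three
A0999,Expired Title
"""

expected = read_assets(SHOULD_BE_LOADED)
loaded = read_assets(WHAT_IS_LOADED)
missing, unexpected = reconcile(expected, loaded)

for asset_id in missing:
    print("re-pitch needed:", asset_id, expected[asset_id])
for asset_id in unexpected:
    print("not on load list:", asset_id, loaded[asset_id])
```

Matching on an asset identifier rather than on the title text avoids false exceptions from spelling or punctuation differences between the two lists, assuming both systems carry the same identifier.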
A couple dozen misses out of 10,000 titles may not be bad statistically, but it will irk your customers and cost you money. If the miss is a popular title, such as episode three of year two of "The Sopranos," stand by for lots of angry customer calls. Even if the miss is a less-popular title, there’s a cost involved; as Table 1 shows, you could be leaving a lot of money on the table (so to speak). Missing the obscure titles in Table 1 could amount to nearly $100,000 of unrealized revenue in only one month. The real number is somewhat lower, perhaps 50 percent or so – someone at the system probably would have noticed that "27 Dresses" and maybe "Cloverfield" were missing, but the $406 earned from the "Bucket Head" content and revenue from many others would have simply vanished.
If you dig into your content accuracy, you will find not only that you can improve your VOD revenue, but also that you can determine – at least in some cases – why you have content problems. Then you can get them resolved internally or through your content providers, meaning a lot less brute-force work.
Business results
Optimizing the "care and feeding" of your VOD system has a number of beneficial outcomes for your business. Among them are:
• Greater customer satisfaction with the product, and greater use of it, which in turn shapes customers’ behavioral expectations
• Fewer phone calls from outages, error codes, or missing/corrupted content
• Fewer truck rolls to persistent customers who "demand a tech" even though the problem is clearly not at their home
• Reduced churn because customers who frequently use VOD are less likely to go to a DBS provider
• Increased revenue
• Lowered operating expenses through improved content management – fewer re-pitch fees, fewer manual activities to resolve content problems
• Your marketing and product folks will love you. (One hopes this is positive ….)
• Better focus on the rest of the network – when the VOD environment is well-managed, it is much harder to mistake other network problems for potential contributors to VOD issues
Watch your VOD environment closely, address your poorest performing areas first, identify and promulgate best practices, ensure the accuracy of your content, and you can sit back and relish happy customers, a great revenue stream for your business, and a stable network that delivers excellent performance.
Keith R. Hayes is vice president, network operations and engineering services, for Charter Communications. Reach him at Keith.Hayes@chartercom.com.