Bullpen: Troubleshoot Video with Standards
Tracking down and isolating the root cause of poor subscriber quality of service (QoS) and quality of experience (QoE) in IP Video distribution networks is well served by a well-established specification: European Telecommunications Standards Institute (ETSI) technical report (TR) 101 290.
The Digital Video Broadcasting (DVB) project published TR 101 290 through ETSI in 2001. It details how to test MPEG-2 transport stream (TS) and includes more than 50 MPEG-2 TS measurements.
(Note: MPEG-2 TS, a transport layer protocol, differs from the independent algorithm for MPEG-2 video compression. The purpose of the TS protocol is to multiplex streaming compressed video and audio with an electronic program guide over a single logical stream that includes synchronization information.)
TR 101 290 splits the 50 into three different "priorities" depending on how drastically they affect the content stream. All of the first and second priority measurements, around 20 in total, apply to every MPEG-2 TS implementation regardless of the overarching broadcast standard. And about 10 of 30 third priority measurements also are widely applicable.
The specification has been universally accepted. Test solutions following it range from software-based point solutions using standard PCs and laptops with off-the-shelf network interface cards (NICs) to hardware-based probe solutions and comprehensive end-to-end monitoring systems.
Compression basics
Troubleshooting MPEG-2 TS requires a basic understanding of how digital video compression works, so here’s a refresher.
The building blocks are complete images called "Intra-Frames" or "I-Frames," which are very similar to the JPEG images taken by a digital camera. Video subscribers should receive such complete images at least two times per second, because the I-Frame rate determines how quickly a customer can recover from a single, visibly detectable error. For a moving picture, the overall frame rate should be 30 frames per second.
Video compression algorithms divide each original I-Frame into a grid of about 500 squares called "macroblocks." The 14 or so video frames that occur between each pair of I-Frames carry only very select information about each macroblock: the changes in brightness and color and any needed motion vectors.
These are the smaller (75 percent) partial frames, called "Predicted (P)-Frames," and the even smaller (90 percent) very partial frames, called "Bidirectional (B)-Frames." The fundamental idea is that most of the image usually either remains the same or moves in blocks that share identical motion vectors, and thus does not need to be retransmitted. The pattern of each series of frames that occurs between I-Frames is referred to as a "Group of Pictures" or a "GOP."
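The frame arithmetic above can be worked through in a few lines. This is a rough sketch, not a measurement: it assumes the percentages mean that a P-Frame is about 75 percent smaller and a B-Frame about 90 percent smaller than an I-Frame, and the 15-frame pattern string is a hypothetical but common layout chosen for illustration.

```python
FRAME_RATE = 30            # frames per second, from the text
I_FRAMES_PER_SECOND = 2    # minimum I-Frame rate, from the text


def gop_relative_size(pattern, p_saving=0.75, b_saving=0.90):
    """Return the size of one GOP, measured in units of one I-Frame.

    p_saving and b_saving are assumed reductions relative to an I-Frame.
    """
    weights = {"I": 1.0, "P": 1.0 - p_saving, "B": 1.0 - b_saving}
    return sum(weights[frame] for frame in pattern)


# Two I-Frames per second at 30 frames per second gives a 15-frame GOP,
# which leaves 14 partial frames between consecutive I-Frames.
gop_length = FRAME_RATE // I_FRAMES_PER_SECOND

pattern = "IBBPBBPBBPBBPBB"   # assumed pattern: 1 I, 4 P, 10 B
assert len(pattern) == gop_length

size = gop_relative_size(pattern)
saving = 1.0 - size / gop_length
print(f"GOP of {gop_length} frames costs about {size:.1f} I-Frames "
      f"({saving:.0%} less than sending every frame complete)")
```

Under these assumptions, 15 frames cost roughly as much as three complete images, which is why a lost I-Frame packet matters so much more than a lost B-Frame packet.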
Check the source
The MPEG-2 TS protocol headers, which are never scrambled, enable important measurements, such as packet loss, packet jitter and channel change or "zap" time.
The header has a packet counter that detects dropped packets; it has a highly accurate 27 MHz clock that calculates jitter; and it has core program specific information (PSI) tables that allow a decoder to de-multiplex various individual content streams, such as the video, audio and closed-captioning.
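To make the packet counter concrete, here is a minimal sketch of a dropped-packet check. It relies on the standard MPEG-2 TS layout (188-byte packets, sync byte 0x47, a 13-bit packet identifier, and a 4-bit continuity counter that wraps from 15 to 0). The function names are hypothetical, and the check is deliberately simplified: it ignores the discontinuity indicator and the single duplicate packet that the protocol allows, so a production analyzer would need more care.

```python
from typing import Dict

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
NULL_PID = 0x1FFF   # stuffing packets; their counter is not meaningful


def count_continuity_errors(data: bytes) -> Dict[int, int]:
    """Count continuity-counter gaps per PID in a buffer of whole TS packets."""
    last_cc: Dict[int, int] = {}
    errors: Dict[int, int] = {}
    for offset in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = data[offset:offset + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            continue  # a real analyzer would resynchronize here
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        has_payload = bool(pkt[3] & 0x10)
        cc = pkt[3] & 0x0F
        # The counter only advances on packets that carry payload.
        if pid == NULL_PID or not has_payload:
            continue
        prev = last_cc.get(pid)
        if prev is not None and cc != (prev + 1) % 16:
            errors[pid] = errors.get(pid, 0) + 1
        last_cc[pid] = cc
    return errors


def make_packet(pid: int, cc: int) -> bytes:
    """Build a payload-only test packet (illustrative helper)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10 | (cc & 0x0F)])
    return header + bytes(TS_PACKET_SIZE - len(header))


# Counter value 3 is missing on PID 0x100, so one gap is reported.
stream = b"".join(make_packet(0x100, cc) for cc in (0, 1, 2, 4))
print(count_continuity_errors(stream))
```

Because the header is never scrambled, a check like this works the same on encrypted and unencrypted streams.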
It is best to start at the content source and then work through the transmission network to the customer premises. If the source digital video is bad, there is no way to correct the error downstream. A bad camera or encoder at a supplier’s location will pass its problem on to the customer, who will blame the service provider.
Original content monitoring is the place to verify basic video quality metrics, such as over-compression, under-compression, pixelization, tiling, frozen video, missing audio tracks, and poor audio/video synchronization. It helps to get as detailed here as possible by examining the following:
- Original compression bit rate, compared to the transmitted bit rate
- Group of Pictures (GOP) pattern, especially the I-Frame rate
- Various synchronization timestamps
- Quantization matrices, macroblocks and motion vectors
Everything looked at here is one less thing to worry about downstream. But note: only unencrypted video streams can be analyzed this way. Once the digital rights management (DRM) system kicks in, compressed video information becomes scrambled.
Thereafter, an operator needs to monitor video at different points in the transmission network to pinpoint the sources of network-induced impairments quickly and effectively.
Loss, jitter, "zap" time
One real-world scenario where a systematic, standards-based approach proved successful in the transmission network involved a network with a primary and a backup switch. Whenever the primary switch got overloaded (typically during routing updates), some of its traffic was offloaded to the backup switch.
Generally this scheme worked fine until the load on the backup switch exceeded 400 Mbps, at which point it started dropping packets. When the lost packet is part of a B-Frame, the error is barely visible and is quickly corrected by the very next P-Frame. When the lost packet is part of an I-Frame, not only is the error more visible, but it will also remain visible until the next I-Frame is received and the error is finally corrected.
Packet jitter is a fact of life, and depending upon network specifics, can cause packet loss.
Every video stream is going to have inter-arrival jitter introduced as it travels through a transmission network. Some video equipment will begin having problems displaying video with as little as 10 ms of jitter, and most video equipment will have problems by the time you have 20 ms of introduced jitter.
Packet inter-arrival jitter is important because it impacts the buffering requirements for all downstream network and video devices, and extreme jitter can lead to anything from lip-sync problems to the loss of packets because of buffer overflow or underflow.
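A simple way to put numbers on this is to compare each packet gap with the nominal spacing. The sketch below is an illustration under stated assumptions: the arrival timestamps and nominal interval are supplied by the caller, the peak-deviation definition of jitter is one of several in use, and the function names are hypothetical. The 10 ms and 20 ms thresholds come from the figures quoted above.

```python
def inter_arrival_jitter_ms(arrivals_ms, nominal_interval_ms):
    """Return the largest deviation of any packet gap from the nominal gap."""
    gaps = [later - earlier for earlier, later in zip(arrivals_ms, arrivals_ms[1:])]
    return max((abs(gap - nominal_interval_ms) for gap in gaps), default=0)


def classify_jitter(jitter_ms):
    """Map a jitter value onto the rough thresholds cited in the text."""
    if jitter_ms >= 20:
        return "most equipment at risk"
    if jitter_ms >= 10:
        return "some equipment at risk"
    return "within tolerance"


# Packets nominally 10 ms apart; the fourth arrives 15 ms late.
arrivals = [0, 10, 20, 45, 50]
jitter = inter_arrival_jitter_ms(arrivals, nominal_interval_ms=10)
print(jitter, classify_jitter(jitter))
```

Running the same calculation at several points along the network shows where the jitter is being introduced, rather than only that it exists.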
PSI table rates can also cause trouble for video in an IP network, as they affect channel change times.
PSI tables provide the basic channel demuxing and EPG information. Subscribers cannot change to a new channel until the encoder sends the new channel’s tables.
In another real-world example, subscribers accustomed to speedy channel surfing were complaining about how long it took to change the channel. A troubleshooting exercise revealed that the set-top box was taking 2 to 3 seconds to issue each channel change request. Once the request was issued, the network responded in less than 400 ms with the new channel.
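The figures in this example make the split easy to quantify. The following sketch simply adds the two delays reported above and works out the set-top box's share; the function name is hypothetical, and the 400 ms network figure is treated as an upper bound.

```python
def zap_breakdown(stb_request_ms, network_response_ms):
    """Return total channel change time and the set-top box's share of it."""
    total = stb_request_ms + network_response_ms
    return total, stb_request_ms / total


# Set-top box delay of 2 to 3 seconds, network response of at most 400 ms.
for stb_ms in (2000, 3000):
    total_ms, stb_share = zap_breakdown(stb_ms, 400)
    print(f"total {total_ms} ms, set-top box share {stb_share:.0%}")
```

With more than four-fifths of the delay at the set-top box in both cases, tuning the network further would have made little difference to subscribers.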
Having problems with customer premises equipment (CPE) is operationally expensive. These devices should be verified not only to stated specifications but also to the requirements of a video-over-IP network. Monitoring video quality across an entire transmission network enables one to know — absolutely and positively — that a problem exists at the customer premises before ever rolling a truck.
Following a standards-based approach to troubleshooting, from the content provider to the customer, is the best way to exceed customers’ expectations for QoE.
Francis Edgington is vice president, HEYS Professional Services, where he serves a number of clients. In his work on digital video monitoring, he has collaborated with Trilithic.