May 1, 2009
Paradigm Shifts: Remote Storage DVR
From Legal Challenges to Technical Challenges
By Glen Hardin, Time Warner Cable
Since the U.S. Court of Appeals for the Second Circuit ruling in August 2008 that gave Cablevision's remote storage digital video recorder (RS-DVR) service the green light, industry experts have been assessing the impact and wondering what it really means to the consumer, the cable industry and to the technical infrastructure that delivers these new services.
This article aims not to discuss the ruling itself but to follow the slippery slope of where the technology may evolve and adapt if the ruling stands.
As understood, the ruling calls for an architecture that enables each customer-requested program to be uniquely recorded onto the video server and stored separately so that one customer's programs cannot be accessed by any other user; it also stipulates that for playback, the user's unique content must be individually streamed to a set-top box within the customer's home.
To illustrate, if 10 households request to record the same program, 10 individual recordings would occur and be ingested into the video server; the file would be written to 10 unique and separate storage locations within the video server; and when the content is played back, a unique asset is streamed to the customer. In other words, content is not shared across ingest, storage and streaming. It all must be unique.
For recording, a single ingest cannot be replicated or copied. That is to say, the system cannot ingest a single asset and then make as many copies as there are user requests. Likewise, for playback, a single instance of the asset cannot be loaded into the cache of the video server and then streamed to multiple users.
To validate or audit the implementation, there would be three basic checkpoints to ensure content integrity and compliance throughout the system:
• A unique ingest onto the video server occurs per recording request
• A unique file is stored on the video server per recording request
• A unique asset is streamed off of the video server per customer playback request
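The three checkpoints can be sketched as a simple audit over recording and playback records. This is only an illustration under stated assumptions: the record types, field names (`ingest_id`, `stream_id`, `file_path`) and storage paths are hypothetical and do not describe any actual video server's bookkeeping.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    customer: str    # account that requested the recording
    ingest_id: str   # hypothetical ID of the ingest session that captured it
    file_path: str   # hypothetical storage location on the video server


@dataclass(frozen=True)
class Playback:
    customer: str    # account requesting playback
    stream_id: str   # hypothetical ID of the outgoing stream
    file_path: str   # file the stream is read from


def duplicates(values):
    """Return the sorted values that occur more than once."""
    return sorted(v for v, n in Counter(values).items() if n > 1)


def audit(recordings, playbacks):
    """Return a list of violations of the three checkpoints (empty if compliant)."""
    problems = []
    # Checkpoint 1: one ingest per recording request.
    for ingest_id in duplicates(r.ingest_id for r in recordings):
        problems.append(f"shared ingest: {ingest_id}")
    # Checkpoint 2: one stored file per recording request.
    for path in duplicates(r.file_path for r in recordings):
        problems.append(f"shared file: {path}")
    # Checkpoint 3: each playback gets its own stream, read from the requester's own file.
    for stream_id in duplicates(p.stream_id for p in playbacks):
        problems.append(f"shared stream: {stream_id}")
    owner = {r.file_path: r.customer for r in recordings}
    for p in playbacks:
        if owner.get(p.file_path) != p.customer:
            problems.append(f"stream {p.stream_id} reads a file not owned by {p.customer}")
    return problems


# Ten households record the same program: ten ingests and ten separate files.
recordings = [Recording(f"home{i}", f"ingest{i}", f"/store/home{i}/show.ts") for i in range(10)]
playbacks = [Playback("home0", "stream0", "/store/home0/show.ts")]
print(audit(recordings, playbacks))  # prints []
```

If an eleventh recording reused `ingest0`, or a household tried to play back another household's file, the audit would return a non-empty list naming the shared resource.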
But even within this narrowly interpreted definition, there is tremendous latitude that may forever change cable's future. The primary concepts to explore are content "fair use" precedents, virtualization of "tuners," abstraction of the "network," the definition of a "trusted" device, and the examination of the video on demand (VOD) server architecture required to deliver the RS-DVR service.
Content and fair use
The adage "content is king" proves true once again as the RS-DVR ruling extends the definition of fair use of content.
Advancements in technology have continually forced the examination of the exercise of this concept. VCRs, where consumers were first able to record video programming, helped mold some of the original definitions of "fair use." TiVo boxes and DVRs with integrated guide and schedule extended, simplified and conveniently packaged the broadcast recording capability for mainstream adoption, extending the fair use implementation. Newer technologies like Slingboxes, where a consumer's content can be viewed on or off the cable plant, again pushed the boundaries of the definition.
RS-DVR is just the latest technology that challenges the current paradigm of fair use. Coupling RS-DVR technologies with other influential fair use technologies could significantly change how things are done today.
To tell the virtualization story, let's first place it in current context and then extend it to the new architecture of RS-DVR.
DVRs consist of a number of tuners used for receiving and recording content to the hard drive and a separate native playback tuner to stream content from the hard drive. Today, a DVR's recording and playback capability is limited by the number of tuners within the DVR device. That is, a two-tuner DVR can record two unique programs and play back another at the same time.
Next-generation DVRs utilizing Multimedia over Coax Alliance (MoCA) technology may enable all the tuners within a home to be virtualized for both recording and playback, termed "whole house DVR." (For example, the viewer could record a program on the living room DVR and watch the program in the bedroom.) This means that any tuner within the household can be used to tune and record a program, and a MoCA tuner can then receive content streams from any of the hard drives within the household. In this solution, the recording capability is limited to the number of tuners within the household, and the playback is limited to the number of MoCA tuners plus the number of native playback tuners on the DVRs.
In an RS-DVR architecture, the individual recording is actually taking place on the VOD server and not in a DVR, and thus the recording capability is not artificially limited by the number of tuners within a single DVR or within a household, but is now limited only by the real-time ingest capacity of the VOD server. The recording tuners effectively become "virtualized" into the VOD server architecture. That allows a user to simultaneously record and uniquely store - limited only by the capabilities of the VOD server - an infinite number of programs.
Playback from the VOD server is not limited by the DVRs within the home, but by the existing VOD narrowcast throughput and the number of set-top boxes (not tuners) within the home. Specifically, once a program is recorded onto the VOD server, any and all set-tops within the home can simultaneously watch the playback of the program, but only one playback stream per set-top.
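The recording and playback limits of the three architectures can be compared in a small model. It is a rough sketch, not a sizing tool: the function names are illustrative, each DVR is assumed to have exactly one native playback tuner, and the server ingest cap and narrowcast stream count are hypothetical inputs standing in for VOD server capacity and business rules.

```python
def single_dvr(recording_tuners):
    """Standalone DVR: records on each recording tuner, plays back on its
    one native playback tuner."""
    return {"record": recording_tuners, "playback": 1}


def whole_house(recording_tuners_per_dvr, moca_tuners):
    """Whole-house DVR: recording pools every tuner in the home; playback is the
    MoCA tuners plus one native playback tuner per DVR (an assumption)."""
    return {
        "record": sum(recording_tuners_per_dvr),
        "playback": moca_tuners + len(recording_tuners_per_dvr),
    }


def rs_dvr(set_tops, server_ingest_cap, narrowcast_streams):
    """RS-DVR: recording is bounded by the server (or business rules), not by
    tuners; playback is one stream per set-top, within narrowcast throughput."""
    return {"record": server_ingest_cap, "playback": min(set_tops, narrowcast_streams)}


print(single_dvr(2))           # {'record': 2, 'playback': 1}
print(whole_house([2, 2], 1))  # {'record': 4, 'playback': 3}
print(rs_dvr(3, 50, 10))       # {'record': 50, 'playback': 3}
```

The point of the comparison is that only in the RS-DVR case does the recording limit come from outside the home.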
The key point here is that RS-DVR allows for infinite simultaneous recording capabilities per user and an infinite amount of storage capacity per user. Both are limited only by the current capabilities of the VOD server infrastructure or artificially through business rules.
Upon implementation, however, it will be limited in practice; "infinite" is infinitely impractical to build.
While virtualization of tuners is a huge conceptual leap from where the technology is today, the technology argument can be extended even further.
Let's consider these propositions: A single DVR's network is limited to the actual physical device; the whole-house DVR is limited to the MoCA network within the home; and RS-DVR is limited only to VOD narrowcast. Is all that really true?
The physical VOD server actually is abstracted within the infrastructure by a gigabit Ethernet (GigE) network that allows the server to stream video from "anywhere" to "anywhere" via user datagram protocol (UDP). And while today the stream delivery is through the VOD narrowcast, it is not really constrained to it.
Logically following, the RS-DVR playback need not be limited to the VOD narrowcast. It could just as easily stream UDP video through high-speed data narrowcast. UDP delivery is UDP delivery, and narrowcast is narrowcast. (Granted, high-speed data narrowcast through a cable modem termination system, or CMTS, is more expensive and less efficient, but let's leave economics aside for the moment.)
Now that the playback of content is not limited to the device or the home network, it should not be limited to networks within the cable operator's footprint, but could easily be extended to the Internet as a whole. This capability under fair use is already granted as the de facto precedent and is available today with the advent of Slingboxes that allow users to view "their" cable content off the cable network and across the Internet.
Cable's programming requirement to broadcast to "trusted" devices has not been forsaken, but is now more open to interpretation as to what constitutes trusted devices.
Today, set-tops are considered trusted devices, and they achieve that designation by virtue of their security mechanisms for authorization and decryption. But as noted previously, with the implementation of Slingboxes, there is both a hardware and a software component to the fair use implementation of the technology.
The Slingbox hardware encodes the video output of the set-top or DVR and routes the video across the network to be received and decoded by the trusted Slingbox software client running on a computer. To be competitive in the marketplace, cable will have to offer trusted devices as well, whether they be hardware, software or a combination of both that work on or off the cable network.
These devices could be PCs, cell phones, PDAs, etc. It is all about the consumer's fair use of the content and not the exact playback device, as long as it is trusted.
VOD server evolution
Let's review. The user has infinite recording and storage capabilities, and the fair use playback of content is not limited to the home network or just to the cable network, but could be to the Internet as well, as long as it is delivered discretely to trusted devices. The question now becomes the following: What server architecture is required to deliver content in this new paradigm?
The basic video server would still be central to the new content delivery paradigm, but it would need to undergo a major transformation. VOD servers originally were optimized for "reading" from the disk input/output (I/O) subsystem to stream to the end customer; "writing" to the disk I/O subsystem was limited to allow content to trickle in at best effort.
This architecture has served traditional VOD very well for many years. With the growth and acceptance of VOD, the server architecture evolved to separate "streaming from storage" to allow independent growth.
In the early days of VOD, the primary challenges were to stream and store video. Streaming was originally via digital video broadcast-asynchronous serial interface (DVB-ASI) or quadrature amplitude modulation (QAM) RF. The advent and adoption of GigE greatly simplified and lowered the cost of solving the streaming problem. Storage itself has two basic complexities: (1) storing large amounts of video in a resilient manner; and (2) performance issues for I/O or reading and writing to the storage sub-system. But storage was expensive, not very dense and handled exclusively via hard drives. To maximize the storage and I/O efficiencies of this expensive sub-system, broad-stripe redundant array of inexpensive disks (RAID) technologies were developed to eke out every possible performance gain from the storage subsystem.
Regarding the I/O, however, there is an inherent performance tradeoff between writing content to the hard drive and streaming from the hard drive. A good rule of thumb is that it takes 20 times as much disk I/O capability to write to the drive as to read from it. In the early days, VOD server architectures were optimized for streaming at the expense of ingest, and that sufficed for the original VOD requirements. The VOD server solutions originally deployed tightly coupled streaming and storage to optimize their varied solutions.
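To make the rule of thumb concrete, here is a minimal sketch that treats the disk subsystem as a fixed I/O budget measured in "read-stream equivalents." The linear cost model and the budget figure are assumptions for illustration; only the 20-to-1 ratio comes from the rule of thumb above.

```python
WRITE_COST = 20  # rule of thumb: one ingest (write) costs as much I/O as 20 reads
READ_COST = 1    # one outgoing stream (read)


def streams_remaining(io_budget, ingests):
    """Return how many streams the disk subsystem can still serve while
    `ingests` recordings are being written at the same time."""
    used = ingests * WRITE_COST
    if used > io_budget:
        raise ValueError("ingest load alone exceeds the disk I/O budget")
    return (io_budget - used) // READ_COST


# Hypothetical subsystem able to sustain 1,000 concurrent reads when idle.
print(streams_remaining(1000, 0))   # 1000
print(streams_remaining(1000, 10))  # 800
print(streams_remaining(1000, 50))  # 0
```

Under this model, a server tuned for streaming loses 20 streams for every simultaneous ingest, which is why trickle ingest was acceptable for traditional VOD and is not for a recording-heavy service.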
As the on-demand offering became much more diversified with subscription VOD (SVOD) and the addition of other categories of content, the need became apparent to grow streaming independently from storage, and the inverse, storage from streaming.
The industry also by now had years of experience and content usage data to draw upon for new server architectures. To that end, caching servers or caching within the VOD servers emerged to optimize hardware performance. As before, VOD server designers optimized the I/O efficiencies and maximized the capabilities of these expensive disk sub-systems, but added a cache storage layer to gain the performance benefit of the content utilization pattern.
With adoption of real time acquisition (RTA) services such as StartOver, the original server designs were forced to enhance their writing capacity to the disk I/O subsystem to support large amounts of video ingest without significant tradeoffs in reading capacity. In the case of StartOver, a single ingest stream is all that is required to serve all consumers of that service. At the same time, these servers were required to stream the ingesting content out of the video server with a delay of less than 10 seconds "turnaround time" (time to ingest into the video server and stream out to serve consumers).
These were hard problems, but caching servers mitigated and optimized the disk I/O writing and reading loads using their subsystems to cache both the ingesting and streaming content. Additionally, because StartOver content is transitory, it aligns directly with the benefits of the architectures of caching servers.
RS-DVR video server
The RS-DVR video server is not unlike the original caching and RTA video servers at their most basic levels, but is very different because of the implementation requirements mentioned at the beginning of this article and the compounding complexity of scaling this same implementation.
The best way to identify the performance of an RS-DVR is by comparing and contrasting the performance requirements between the original VOD server and the RS-DVR server in terms of ingest and streaming, while keeping in mind the RS-DVR cannot use caching for ingest and streaming.
The original VOD server's content ingest requirements were minimal. Historically, the ingest requirements were designed to handle the top 100 VOD hit titles, and the best-effort "trickling in" of content sufficed to support that minimal content library.
As additional services like SVOD were added, the ingest requirement increased, turning the trickle into a constant flow, but for the most part, this worked within the constraints of existing servers. With the adoption of RTA services, the existing trickle ingest capabilities were no longer adequate to support the RTA requirement of turnaround times of less than 10 seconds and the quantity of ingest channels (around 100 today).
The current RTA ingest requirements pale in comparison to the possible unlimited ingest requirements of an RS-DVR server, where there could be an infinite number of simultaneously ingested programs. Additionally, whereas a single asset previously could be ingested and then replicated and copied across the various servers that comprised the VOD server complex, the RS-DVR server may only buffer incoming programs to the same degree as DVRs do today, around 2 seconds, but it cannot replicate a single ingest. Hence, there is no cache gain on content ingest because each recording must be discrete.
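The loss of cache gain on ingest can be illustrated by counting ingest sessions under the two models. This is a sketch with hypothetical names: `shared=True` stands for a StartOver-style service where one ingest per program serves everyone, and `shared=False` stands for the RS-DVR rule of one discrete ingest per recording request.

```python
def ingest_sessions(requests, shared):
    """Count the ingest sessions needed for a list of (household, program) requests."""
    if shared:
        # One ingest per distinct program, no matter how many households want it.
        return len({program for _, program in requests})
    # One discrete ingest per distinct household/program request.
    return len(set(requests))


# Ten households ask to record the same program.
requests = [(f"home{i}", "evening-news") for i in range(10)]
print(ingest_sessions(requests, shared=True))   # 1
print(ingest_sessions(requests, shared=False))  # 10
```

In the shared model the ingest load grows with the channel lineup; in the RS-DVR model it grows with subscriber demand.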
To accomplish the infinite ingest requirement, the RS-DVR server is, at some level, solely reliant on raw disk write capabilities and capacities of the disk I/O subsystem. To improve performance, a server might write an incoming asset to solid state storage and move it to the disk subsystem as the disk subsystem has writing capacity to store it long-term and for playback.
In the original VOD server (without caching), the streaming performance of the server was directly tied to the read capacity of the disk I/O subsystem, where streaming a unique piece of content for every possible stream is what validated performance.
This one-for-one performance test defined the maximum server streaming capacity, effectively allowing for infinite content streaming. Caching servers further optimized and extended the streaming performance by placing blocks of high-use content in high performing components like random access memory (RAM) or solid state media and streamed those content blocks to as many users as the server's streaming "anywhere to anywhere" capabilities permitted.
The RS-DVR cannot cache or stream to multiple users from the same content source. Accordingly, its server performance is equal to the original VOD server's performance, but in the case of RS-DVR, the total number of streams is limited not by the content but by the number of set-tops or trusted devices retrieving recorded content for playback.
It should be noted that not all recordings would be instantly streamed. Some would go directly to the storage subsystem. Only a few would be instantly "turned around" for streaming, and that is limited by the number of trusted devices within a given account. That is to say, while a subscriber can schedule an infinite number of recordings, the number of trusted devices within the account limits the number of simultaneous streams. Furthermore, not all streams would be watched at the same time they are being recorded. This may reduce the total turnaround from infinite to a level based on numbers of trusted devices accessing the service.
The difference between the two servers' performance requirements can best be demonstrated by comparing and contrasting the server performance against a unit of storage. In this example, the unit of storage will be arbitrarily set at 10 terabytes (TB), and the performance will be evaluated in terms of ingest and streaming.
The original VOD server with 10 TB of storage would try to maximize the number of streams served by that 10 TB of storage: in this example, 1,000 streams with each one being a unique piece of content. But in an RS-DVR server, this same 10 TB of storage may be dedicated to a single household, and thus its streaming requirement is equal to the number of set-tops within the household, with the average being less than three set-tops per home. However, ingest requirements are very different. The original VOD server needs only to support trickle ingest to accommodate that service while the RS-DVR must support infinite ingest (based on user requests and limited only by business rules).
Thus, the basic server equation would look like this:
VOD server = 10 TB storage + trickle ingest + 1,000 streams (infinite streaming)
RS-DVR server = 10 TB storage + infinite ingest + 3 streams (trickle streaming)
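As a worked version of the streaming side of these equations, the sketch below computes streams per terabyte for each profile. The class and field names are illustrative; the figures (10 TB, 1,000 streams, 3 streams) are the example values used above, and the ingest side is left out because the article characterizes it only as "trickle" versus "infinite."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerProfile:
    name: str
    storage_tb: float  # unit of storage being compared
    streams: int       # simultaneous streams served from that storage

    def streams_per_tb(self):
        """Streaming demand placed on each terabyte of storage."""
        return self.streams / self.storage_tb


vod = ServerProfile("VOD server", storage_tb=10, streams=1000)
rs_dvr = ServerProfile("RS-DVR server", storage_tb=10, streams=3)

print(vod.streams_per_tb())     # 100.0
print(rs_dvr.streams_per_tb())  # 0.3
```

The same 10 TB that must feed 100 streams per terabyte in the VOD case feeds well under one stream per terabyte in the RS-DVR case, which is the "trickle streaming" in the second equation.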
When the magnitudes of ingest and streaming are compared and contrasted between the two servers, it is as if the server architecture requirements have come full circle, but inverted. The original VOD server required trickle ingest while the comparable RS-DVR requires trickle streaming; but while the VOD server required infinite streaming of content, the RS-DVR by contrast requires infinite ingest of content.
Does it exist?
This raises the question: Does an RS-DVR server exist today?
While caching servers are ideally architected and optimized for VOD and RTA, the implementation rules of RS-DVR nullify the cache gain for ingest and streaming for caching servers. The original VOD server relied on "brute force" disk I/O performance to attain the necessary reading capacity for streaming. In a similar but inverse manner, existing server technologies can apply the same brute force principle to attain the disk I/O performance needed to solve a limited, though not infinite, writing-capacity problem and so serve as an RS-DVR server.
The brute force method provides some semblance of a restricted RS-DVR service, but the full potential, as envisioned in this article, has yet to be realized. As of today, there is not an elegant or optimum on-demand server implementation that provides a boundless RS-DVR service.
To be sure, the RS-DVR server represents a new chapter in the evolution of the on-demand server. And in this new chapter, the guiding design principles will be based more on the evolution of the definition of fair use than on the coupling of the latest and greatest available hardware and software innovations to adapt and evolve the technology to address the infinites of ingest, storage, and streaming across multiple networks to a plethora of new trusted devices.
Glen Hardin is chief architect, video systems, for Time Warner Cable.